Not exactly crashing. Somehow we're calling sfp_module_start but it's unhappy because it's already in state RUNNING (and vice versa for shutdown). Seems like it may be because phylink is confused: there's the SerDes phy and the sfp phy, and they're not both running or stopping at the same time. ethtool output on master matches the SFP+ device, but the link status and media/port are wrong. I have 10GBase-SR optics and DACs around that I can also check.
toggle the GPIO to disable transmission
I think your commented guess in the DTS is likely correct, so I'll try that first. I'm avoiding -wip for now because I spent a few hours trying to get i2c working there with no success, while it just works on master. There are no changes to the driver code, so it must be something in the DTS: the notable changes are multiple DTSI inclusions and the @address notation. Somehow there are no IOMEM resources (or any other kind) attached to pdev there. I also noticed that the reg values don't match what worked (0x1c vs 0x3c), but neither value works on -wip. (The first reg value is an address; I'm not sure what the second value is, since the notations say we have 1 address cell and 0 size cells.)
I think those are completely ignored by the driver anyway and it's all hard-coded. Definitely not hard-coded, but I found some other minor things that I'm pretty much through fixing. I'm also dropping the mux driver in favor of the i2c-mux-reg driver. I'll see what happens on my switch ...
That was a brilliant hint, @andyboeh. I flashed 21.02.5, and there reboot still works properly:
[ 415.141757] reboot: Restarting system
[ 415.145971] System restart.
[ 415.149169] PLL control register: efffffff, applying reset value efffffff
U-Boot 2011.12.(2.1.5.67086)-Candidate1 (Jul 23 2020 - 13:01:19)
...
The reboot functionality hence seems to have been broken between the 21.02.x and 22.03.x branch splits from master.
Would that mean there was no reboot GPIO before, and adding one broke the reboot functionality?
Then it should either be removed again for that device, or we need to find the correct GPIO.
I am not that low-level, hence I would need a guide on how to export all these GPIOs and then how to toggle them.
Exactly. There are different reset methods and if gpio-restart is not defined, it defaults to another method (that's probably this PLL control register message you're seeing). I'm not an expert on this, that's just what I experienced with another switch.
I tried the script from your link and tested all 159 GPIOs with script.sh <number> out 0 and script.sh <number> out 1.
It said 24 is not available, and likewise for all numbers higher than 159.
However the switch did not reboot. Only the port LEDs changed color.
I am still on 21.02.5 at the moment. Is that the problem? But then I wonder, why the LEDs changed?
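For anyone following along, here is a minimal sketch of what such an export-and-toggle script presumably does through the legacy sysfs GPIO interface. The function name `set_gpio` and the `base` parameter are my own and only there for illustration; the actual script from the link may work differently. Keep in mind that sysfs uses global GPIO numbers, which can be offset by the base of the gpiochip they belong to, so a number here need not equal the pin number on the chip.

```python
from pathlib import Path


def set_gpio(number: int, value: int, base: Path = Path("/sys/class/gpio")) -> bool:
    """Export a GPIO via legacy sysfs and drive it as an output.

    Returns True on success, False if the kernel refuses the GPIO
    (for example because it is already claimed or out of range).
    """
    pin = base / f"gpio{number}"
    try:
        if not pin.exists():
            # Writing the number to "export" asks the kernel to create gpioN/.
            (base / "export").write_text(str(number))
        (pin / "direction").write_text("out")
        (pin / "value").write_text(str(value))
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical usage: report which GPIOs in a range cannot be driven.
    for n in range(0, 4):
        print(n, "ok" if set_gpio(n, 0) else "not available")
```

A GPIO that reports as not available is often one that a driver has already claimed, which would be consistent with a single number in the middle of the range being refused.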
I figured out why i2c is broken: the i2c devices are defined at their memory offsets inside the soc block, which makes the resources invalid because those addresses are outside the range associated with the soc block. However, although fixing this does allow detection (and fixes the resource), I'm getting garbage from i2cdetect, so I may try reverting the recent i2c patches.
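To illustrate the failure mode, here is a rough sketch of how a child node's reg address is translated through its parent's ranges, and why an address outside that window ends up with no usable resource. All addresses below are made up for the example and are not the real RTL9300 values; `translate` is a hypothetical helper, not a kernel function.

```python
def translate(child_addr, child_size, ranges):
    """Translate a child bus address through a parent's `ranges` entries.

    `ranges` is a list of (child_base, parent_base, length) tuples.
    Returns the parent (CPU) address, or None if the child region does
    not fit inside any window, in which case no resource can be built.
    """
    for child_base, parent_base, length in ranges:
        fits_start = child_base <= child_addr
        fits_end = child_addr + child_size <= child_base + length
        if fits_start and fits_end:
            return parent_base + (child_addr - child_base)
    return None


# Hypothetical soc window: 64 KiB of child space mapped at 0x18000000.
SOC_RANGES = [(0x0, 0x18000000, 0x10000)]

# An offset inside the window translates fine.
inside = translate(0x3300, 0x100, SOC_RANGES)

# An absolute address used as if it were an offset falls outside the window.
outside = translate(0x1B00036C, 0x14, SOC_RANGES)

if __name__ == "__main__":
    print(hex(inside) if inside is not None else None)
    print(outside)
```

The point is only that a node inside the soc block needs a reg value that is an offset within the block's window; whether moving the nodes out or correcting the offsets is the better fix is a separate question.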
With the most recent two commits (i2c related) reverted, this patch detects SFPs correctly:
(See next post, patch is better and works)
[ 11.380652] sfp sfp-p0: Host maximum power 1.0W
[ 11.391823] sfp sfp-p8: Host maximum power 1.0W
[ 11.401762] sfp sfp-p16: Host maximum power 1.0W
[ 11.411958] sfp sfp-p20: Host maximum power 1.0W
[ 11.422341] sfp sfp-p24: Host maximum power 1.0W
[ 11.432811] sfp sfp-p25: Host maximum power 1.0W
[ 11.443044] sfp sfp-p26: Host maximum power 1.0W
[ 11.453231] sfp sfp-p27: Host maximum power 1.0W
[ 11.730705] sfp sfp-p0: module OEM SFP-10G-SR rev 02 sn CSF101L80337 dc 210805
[ 224.647161] sfp sfp-p0: module OEM SFP-10G-SR rev 02 sn CSF101L80337 dc 210805
but leads to
[ 238.894814] WARNING: CPU: 1 PID: 1041 at drivers/net/phy/phy.c:1110 phy_start+0xc0/0xe0
[ 238.903856] called from state NOLINK
Still seems like progress for -wip.
ethtool output:
root@OpenWrt:/# ethtool lan8
Settings for lan8:
Supported ports: [ FIBRE ]
Supported link modes: 10000baseSR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: Not reported
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Unknown! (255)
Auto-negotiation: on
Port: Twisted Pair
PHYAD: 0
Transceiver: external
MDI-X: Unknown
Supports Wake-on: d
Wake-on: d
Link detected: no
I have it working, including SFP detection! This patch also requires git reset --hard 8c1f0dda450e0c3765e988aa9590fbafd4f08490; I think the recent i2c changes weren't working for me.
- SFP port i2c connections are backward/wrong in rtl9303_common_8s.dts
- PCA9555s are on the wrong i2c bus
- Both the CPU port and phy can be set to 10000 on this device, instead of 1000
- i2c reg to match i2c channel numbers (1-based...)
Things that may be hacks:
- I added the "fixed-link" wrapper from Birger's 10g_sfp branch, I'll try without later
Misc bugs fixed:
- Accented O in FOUND A SERDES printf
- Old GPIO expander name in rtl9303_tp-link_tl-st1008f_v2.0.dts
- The i2c-inside-soc-block issue from previous post. This might be fixable by leaving them there with corrected offsets, not sure
- Really truly annoying pr_infos in rtl9300_read_status --> pr_debug
Known bugs!
MAC addresses seem fake on both USW-Aggregation and XGS1250 (on OEM FW!): my XGS1250 and this device are both claiming 00:e0:4c:00:00:00 and USW-Aggregation is getting mad about seeing self-originated packets actually sent by the XGS1250. It's some sort of strange Realtek loop detection packet, btw:
This is probably because I fried some flash offset on the XGS1250 though.
Last update: port "isolation" is a problem here too, except that, after a lot of analysis, it looks like packets are simply not forwarded correctly; whether that is a VLAN or FDB problem is not clear. If I bounce the network interfaces a few times, I can get a DHCP device to receive a reply through the switch, but at steady state the DHCP server sees the request and replies, and the reply never makes it through the switch. Tomorrow's problems.
Yeah, using the old dts access method is probably better for now.
On the upside (or downside), I've started to fix this mess and am rewriting the I2C driver (using regmaps too). I'm almost done coding, so I can start testing soon. It's a big change and improves things a lot.
The actual transmitting stuff is gonna stay more or less the same, cause that looks pretty much alright.
I obviously have no idea what correct is and how this aligns with the TP-Link 1008F switch @lorenzb has, probably not at all, so the common isn't common (that is fine)
It can be set to whatever you want; it's just what Linux reports, it's not configuring any hardware. Our PHY is fixed. Changing it to 10 also won't make it transmit at 10Mbit. The only thing I haven't tried properly (ethtool wouldn't let me, for obvious reasons) is changing the switch PHY 'force mac' bits. I doubt that has any effect either, but who knows.
We're always communicating at about 150Mbit, e.g. 1G. So if we feel that 10G looks cooler, sure, but it's also a bit misleading
heh, the i2c driver will still rename it to 0-based when you call 'i2c_add_adapter', so best to un-confuse things early
But this information is hidden in the (new) mux driver anyway.
can you post a snippet of what you mean? it seemed like everything was already included in the current -wip (but it's so messy, it's easy to fix things)
lol yeah, I spotted that too but didn't care enough to do a single typo fix just yet
yep, though I fixed that recently, did I forget to push?
would only work for i2c0 anyway, i2c1 is having some weirdness to it.
yeah, i've tried to put all that stuff in a single commit at the end of the wip.
So, this is the 'hardcoded' default MAC from realtek. This is also already fixed in my branch, but requires we have the proper (u-boot) environment variables.
ethaddr=00:12:34:56:78:90
is the least that's needed and pleases both u-boot and linux.
OpenWrt (due to DSA) can then assign 'ranges' of MACs to each port. If you only have ethaddr, it will pick a 'testing' range matching that MAC for the other ports. Otherwise, define:
mac_start=00:12:34:56:78:00
mac_end=00:12:34:56:78:0b
for example to assign these to the DSA ports.
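As a quick sanity check on how many addresses such a range covers, here is a small sketch that enumerates every MAC between mac_start and mac_end. The helper names are mine, and how the branch actually maps the addresses onto DSA ports is not shown here; this only does the arithmetic.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def int_to_mac(value: int) -> str:
    """Convert a 48-bit integer back to colon-separated lowercase hex."""
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def mac_range(start: str, end: str) -> list:
    """Return every MAC address from start to end, inclusive."""
    first, last = mac_to_int(start), mac_to_int(end)
    if last < first:
        raise ValueError("mac_end must not be lower than mac_start")
    return [int_to_mac(v) for v in range(first, last + 1)]


if __name__ == "__main__":
    macs = mac_range("00:12:34:56:78:00", "00:12:34:56:78:0b")
    print(len(macs), "addresses")
    for mac in macs:
        print(mac)
```

With the example values above, the range holds 12 addresses (00 through 0b inclusive).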
I haven't found where Realtek stores the MAC, but it's in the jffs2_cfg partition (unlike the u-boot-sys partition that I thought of earlier).
note, that we know we can only configure eth0 once, after network restart, the connection is 'lost' (or isolated?)
So I think this may be a DSA issue. If I configure DSA properly (the defaults do not specify a PVID on any port), I notice that the "isolated" traffic has mismatched vid and rvid values, whatever those are, while the working interfaces don't have this mismatch.
This can be seen in: /sys/kernel/debug/rtl838x/l2_table
Example (not working/isolated): mac 3c:ec:XX:XX:XX:XX vid 0 rvid 1
The actual interfaces:
mac 02:e0:4c:00:00:00 vid 1 rvid 1
mac 02:e0:4c:00:00:01 vid 1 rvid 1
mac 02:e0:4c:00:00:02 vid 1 rvid 1
mac 02:e0:4c:00:00:03 vid 1 rvid 1
mac 02:e0:4c:00:00:04 vid 1 rvid 1
mac 02:e0:4c:00:00:05 vid 1 rvid 1
mac 02:e0:4c:00:00:06 vid 1 rvid 1
mac 02:e0:4c:00:00:07 vid 1 rvid 1
mac 00:e0:4c:00:00:00 vid 1 rvid 1
Going to try to use an untagged PVID other than 1 and see what happens here.
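To make spotting these easier, here is a small sketch that scans a dump of the l2_table file and lists the entries whose vid and rvid differ. It assumes each entry contains the fields in the order quoted above ("mac ... vid N rvid M"); the real debugfs output may carry more fields per line, which the regex simply skips over if they come before or after.

```python
import re

# Matches the "mac <addr> vid <n> rvid <m>" fragment quoted in the post.
ENTRY = re.compile(r"mac\s+(\S+)\s+vid\s+(\d+)\s+rvid\s+(\d+)")


def mismatched_entries(text: str) -> list:
    """Return (mac, vid, rvid) for every entry where vid != rvid."""
    result = []
    for mac, vid, rvid in ENTRY.findall(text):
        if int(vid) != int(rvid):
            result.append((mac, int(vid), int(rvid)))
    return result


if __name__ == "__main__":
    # On the switch this would read /sys/kernel/debug/rtl838x/l2_table.
    sample = """\
mac 3c:ec:XX:XX:XX:XX vid 0 rvid 1
mac 02:e0:4c:00:00:00 vid 1 rvid 1
mac 00:e0:4c:00:00:00 vid 1 rvid 1
"""
    for mac, vid, rvid in mismatched_entries(sample):
        print(f"{mac}: vid {vid} != rvid {rvid}")
```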
"realtek: handle changed flags in VLAN configuration" could be it: the default configuration has each port assigned to VLAN1 and VLAN filtering enabled, but no PVID. I'm not sure if that should work or not, but I'd expect each port to have a PVID. Adding one didn't seem to fix anything, so this is interesting.
I also noticed in DSA for rtl838x.c, both e->pvid and e->vid get set in l2_entry addition, but not in rtl930x.c, but I'm still trying to understand the way DSA works on this switch. It does appear that you can see raw switch frames (only?) with tcpdump on eth0, which might help debugging this.
If you mean the default configuration in /etc/config/network, then it probably doesn't matter, as it looks like netifd automatically sets the PVID. You can check the actual configuration using bridge vlan.
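If it helps, here is a rough sketch for checking which ports have no PVID, based on the JSON form of that output (bridge -j vlan show). It assumes the JSON layout that recent iproute2 versions emit, a list of objects with "ifname" and a "vlans" list whose entries carry a "flags" list; if the bridge tool on your image prints something different, this needs adjusting.

```python
import json


def ports_without_pvid(bridge_json: str) -> list:
    """Return the names of ports that have no VLAN flagged as PVID."""
    missing = []
    for port in json.loads(bridge_json):
        vlans = port.get("vlans", [])
        has_pvid = any("PVID" in vlan.get("flags", []) for vlan in vlans)
        if not has_pvid:
            missing.append(port["ifname"])
    return missing


if __name__ == "__main__":
    # Made-up sample in the assumed layout; on the device, feed it the
    # output of `bridge -j vlan show` instead.
    sample = json.dumps([
        {"ifname": "lan1", "vlans": [{"vlan": 1, "flags": ["PVID", "Egress Untagged"]}]},
        {"ifname": "lan2", "vlans": [{"vlan": 1, "flags": ["Egress Untagged"]}]},
    ])
    print(ports_without_pvid(sample))
```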
Interesting. I checked in LuCI, where no PVID was indicated, and thought perhaps that patch would be required to have the PVID updated correctly. I will check with bridge vlan. I am seeing really weird behavior where packets successfully egress through the switch to the network, but responses aren't coming back correctly. I may need to set up something simpler to isolate the problem. I also need to stare at some packet dumps for a while; whatever is going on is weird.
I found another unmanaged device that has an RTL9300: TrendNet TEG-S350.
By holding a pin header against an unmarked 4-pin header (the one that doesn't look like a fan header), I was able to get to a diag shell, and then exit to a shell.
From trying some random RTK commands (before my hand got tired, need to solder the header):
RTK.0> switch get probe-information
Unit ID: 0
Chip ID: 9303 (RTL9303)
Family ID: 9300
Port Number: 6
All Port Number: 6, Minimum: 0, Maximum: 28, Ports: 0,8,16,20,24,28
Ether Port Number: 5, Minimum: 0, Maximum: 24, Ports: 0,8,16,20,24
Copper Port Number: 5, Minimum: 0, Maximum: 24, Ports: 0,8,16,20,24
CPU Port : 28
Port PHY chip
======================
0 RTL8226B
8 RTL8226B
16 RTL8226B
20 RTL8226B
24 RTL8226B
/proc/kmsg says:
<5>Linux version 3.18.24 (chris_zhao@ubuntu37) (gcc version 4.8.5 20150209 (prerelease) (Realtek MSDK-4.8.5p1 Build 2536) ) #61 Mon Jul 27 14:23:41 CST 2020
<6>MIPS: machine is RTL9300
<6>bootconsole [early0] enabled
<6>CPU0 revision is: 00019555 (MIPS 34Kc)
<0>[cpu0, rtl9300_auto_probe_memsize:138]: AUTO byte_size = 0x10000000 Byte
<0>[cpu0, prom_memory_size_get:242]: Get total memory size by auto probe result
<0>[cpu0, prom_memory_size_get:245]: Get dma size from kernel commnad line
<0>[cpu0, prom_meminit:293]: DMA size=0x0(B)
<0>[cpu0, prom_meminit:321]: mem zone0: Base=0x0, size=0x10000000(B)
<6>Determined physical RAM map:
<6> memory: 10000000 @ 00000000 (usable)
<0>[cpu0, plat_remove_mem_parameter:112]: cmdline= console=ttyS0,115200, 803c86d8, (null)
<0>[cpu0, plat_remove_mem_parameter:117]: cmdline= console=ttyS0,115200
<0>[cpu0, plat_remove_mem_parameter:112]: cmdline= console=ttyS0,115200, 803c450c, (null)
<0>[cpu0, plat_remove_mem_parameter:117]: cmdline= console=ttyS0,115200
<6>Initrd not found or empty - disabling initrd
<4>Zone ranges:
<4> Normal [mem 0x00000000-0x0fffffff]
<4> HighMem empty
<4>Movable zone start for each node
<4>Early memory node ranges
<4> node 0: [mem 0x00000000-0x0fffffff]
<6>Initmem setup node 0 [mem 0x00000000-0x0fffffff]
<7>On node 0 totalpages: 65536
<7>free_area_init_node: node 0, pgdat 803a99e0, node_mem_map 81000000
<7> Normal zone: 512 pages used for memmap
<7> Normal zone: 0 pages reserved
<7> Normal zone: 65536 pages, LIFO batch:15
<4>Primary instruction cache 32kB, VIPT, 4-way, linesize 32 bytes.
<4>Primary data cache 32kB, 4-way, PIPT, no aliases, linesize 32 bytes
<7>pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
<7>pcpu-alloc: [0] 0
<4>Built 1 zonelists in Zone order, mobility grouping on. Total pages: 65024
<5>Kernel command line: console=ttyS0,115200
<6>PID hash table entries: 1024 (order: 0, 4096 bytes)
<6>Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
<6>Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
<6>Writing ErrCtl register=0001188c
<6>Readback ErrCtl register=0001188c
<4>Memory: 252292K/262144K available (2639K kernel code, 153K rwdata, 952K rodata, 3604K init, 164K bss, 9852K reserved, 0K highmem)
<6>NR_IRQS:128
<6>console [ttyS0] enabled
<6>bootconsole [early0] disabled
<6>Calibrating delay loop... 531.66 BogoMIPS (lpj=2658304)
<6>pid_max: default: 32768 minimum: 301
<6>Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
<6>Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
<6>NET: Registered protocol family 16
<5>SCSI subsystem initialized
<6>usbcore: registered new interface driver usbfs
<6>usbcore: registered new interface driver hub
<6>usbcore: registered new device driver usb
<6>NET: Registered protocol family 2
<6>TCP established hash table entries: 2048 (order: 1, 8192 bytes)
<6>TCP bind hash table entries: 2048 (order: 3, 40960 bytes)
<6>TCP: Hash tables configured (established 2048 bind 2048)
<6>TCP: reno registered
<6>UDP hash table entries: 256 (order: 1, 12288 bytes)
<6>UDP-Lite hash table entries: 256 (order: 1, 12288 bytes)
<6>NET: Registered protocol family 1
<6>futex hash table entries: 256 (order: 0, 7168 bytes)
<6>msgmni has been set to 492
<5>random: modprobe urandom read with 1 bits of entropy available
<6>io scheduler noop registered
<6>io scheduler deadline registered
<6>io scheduler cfq registered (default)
<7>start plist test
<7>end plist test
<6>Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
<6>serial8250: ttyS0 at MMIO 0x0 (irq = 47, base_baud = 10764700) is a 16550A
<4>RTK_SPI_FLASH_MIO driver is bypassed
<4>RTK_NORSFG3 driver is used
<5>=================================================================
<5>init_luna_nor_spi_map: flash map at 0xb4000000
<4>SPI NOR driver probe...
<4>MXIC/C22018/MMIO16-1/ModeC add SPI NOR partition
<4>MTD partitions obtained from built-in array
<5>Creating 7 MTD partitions on "rtk_norsf_g3":
<5>0x000000000000-0x0000000e0000 : "LOADER"
<5>0x0000000e0000-0x0000000f0000 : "BDINFO"
<5>0x0000000f0000-0x000000100000 : "SYSINFO"
<5>0x000000100000-0x000000200000 : "JFFS2 CFG"
<5>0x000000200000-0x000000300000 : "JFFS2 LOG"
<5>0x000000300000-0x000000980000 : "RUNTIME"
<5>0x000000980000-0x000001000000 : "RUNTIME2"
<5>=================================================================
<6>usbcore: registered new interface driver r8152
<6>ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
<4>rtk_gen1_hcd_cs_init: rtk_gen1_hcd_cs_init()!
<4>rtk_gen1_hcd_cs_init: register rtk_gen1_ehci ok!
<4>usb_phy_configure_process: usb_phy_configure_process()!
<6>rtk_gen1-ehci rtk_gen1-ehci: Realtek On-Chip EHCI Host Controller
<6>rtk_gen1-ehci rtk_gen1-ehci: new USB bus registered, assigned bus number 1
<6>rtk_gen1-ehci rtk_gen1-ehci: irq 28, io mem 0x18021000
<6>rtk_gen1-ehci rtk_gen1-ehci: USB 2.0 started, EHCI 1.00
<6>hub 1-0:1.0: USB hub found
<6>hub 1-0:1.0: 1 port detected
<6>usbcore: registered new interface driver usbtmc
<6>usbcore: registered new interface driver usb-storage
<6>TCP: cubic registered
<6>NET: Registered protocol family 17
<6>Freeing unused kernel memory: 3604K (803ab000 - 80730000)
<4>RTCORE LKM Insert...
<4>RTCORE Driver Module Initialize
<4> IOAL init
<4> Log init
<4> Hardware-profile probe (Forced profile: RTL9303_5X8226)
<4> (RTL9303_5X8226)
<4> Hardware-profile init
<4> Intr Probe (unit 0)
<4> GPIO probe (unit 0): (found)
<4> GPIO Init
<4> SPI init (unit 0) (type3)
<4> SPI Master init
<4> TC probe (unit 0): (found)
<4> TC init (unit 0)
<4> TC util init (unit 0)
<4> TC util init (isr)
<4> Watchdog probe (unit 0): (found)
<4> Watchdog init (unit 0)
<4> I2C probe (unit 0)
<4> I2C init (unit 0)
<4> RTL8231 probe (unit 0): (found)
<4> RTL8231 init (unit 0)
<4> NIC probe (unit 0)
<4> IOAL init
<4> L2Ntfy probe (unit 0): (found)
<4>RTK Driver Module Initialize
<4> MAC probe (unit 0)
<4> Chip 9303 (found)
<4> MAC init (unit 0)
<4> SMI protocol probe (unit 0)
<4> PHY probe (unit 0)
<4> Chip Construct (unit 0)
<4> Chip Construct
<4> Disable PHY Polling
<4> PHY Reset
<4> MAC Construct
<4> Turn Off Serdes
<4> Serdes Construct
<4> PHY Construct
<4> Turn On Serdes
<4> Enable PHY Polling
<4>ttyS0: 78 input overrun(s)
<4> Misc
<4> PHY init (unit 0)
<4> Mgmt_dev init (unit 0)
<4>RTNIC Driver Module Initialize
<4>RTDRV Driver Module Initialize
<5>random: nonblocking pool is initialized
How hard would this be to transform into a managed switch?
The one on the right is the UART (baud rate 115200), and the pinout seems to be, starting from the arrow, VCC, TX, RX, GND. The one on the left seems to be either a fan header or a power input, since it has two thick traces (each covering two pins), and one of them leads to the unpopulated capacitor.
It is uboot, and it can be interrupted. There is an "rtk network" command that outputs the same "RTCORE Driver Module Initialize" block; any other commands you want me to try?
U-Boot 2011.12.(TRUNK_CURRENT)-svn102528 (Jul 27 2020 - 13:49:38)
Board: RTL9300 CPU:800MHz LX:175MHz DDR:600MHz
DRAM: 256 MB
SPI-F: MXIC/C22018/MMIO16-1/ModeC 1x16 MB (plr_flash_info @ 83f8c548)
Loading 65536B env. variables from offset 0xe0000
*** Warning - bad CRC, using default environment
Net: Net Initialization Skipped
No ethernet found.
Hit Esc key to stop autoboot: 0
RTL9300# # help
? - alias for 'help'
base - print or set address offset
boota - boota - boot application image from one of dual images partition automatically
bootm - boot application image from memory
bootp - boot image via network using BOOTP/TFTP protocol
cmp - memory compare
cp - memory copy
crc32 - checksum calculation
env - environment handling commands
erase - erase FLASH memory
flerase - Erase flash partition
flinfo - print FLASH memory information
flshow - Show flash partition layout
go - start application at address 'addr'
help - print command description/usage
iminfo - print header information for application image
loadb - load binary file over serial line (kermit mode)
loads - load S-Record file over serial line
loady - load binary file over serial line (ymodem mode)
loop - infinite loop on address range
md - memory display
mm - memory modify (auto-incrementing address)
mtest - simple RAM read/write test
mw - memory write (fill)
nm - memory modify (constant address)
ping - send ICMP ECHO_REQUEST to network host
printenv- print environment variables
printsys- printsys - print system information variables
protect - enable or disable FLASH write protection
reset - Perform RESET of the CPU
reset_all- Perform whole chip RESET of the CPU
rtk - rtk - Realtek commands
run - run commands in an environment variable
saveenv - save environment variables to persistent storage
savesys - savesys - save system information variables to persistent storage
setenv - set environment variables
setsys - setsys - set system information variables
sf - SPI flash sub-system
sleep - delay execution for some time
tftpboot- boot image via network using TFTP protocol
upgrade - Upgrade loader or runtime image
version - print monitor, compiler and linker version
RTL9300# # printenv
baudrate=115200
boardmodel=RTL8393M_DEMO
bootcmd=boota
bootdelay=1
ethaddr=00:E0:4C:00:00:00
ipaddr=192.168.1.1
ledModeInitSkip=0
serverip=192.168.1.111
stderr=serial
stdin=serial
stdout=serial
Environment size: 217/65532 bytes
EDIT: I tried connecting a network cable and running "rtk network" -- no link. And trying to tftpboot doesn't send any traffic with the macaddr from the env var.
And from the booted kernel, if I try to ping something, the ping target can see ARP requests, but the switch's kernel never seems to see the reply.
Really, the biggest thing I would want out of the switch is for it to pass all VLAN tags unaltered. Currently, it seems to discard all tagged packets.
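One way to verify this from captures taken on both sides of the switch is to check each raw Ethernet frame for an 802.1Q tag. Below is a minimal sketch that does so; it only handles a single standard tag with TPID 0x8100 and would not understand any switch-internal CPU tag, so it is meant for frames captured on the devices attached to the switch, not on the switch's own eth0.

```python
import struct


def vlan_tag(frame: bytes):
    """Return (pcp, dei, vid) if the frame carries an 802.1Q tag, else None."""
    if len(frame) < 16:
        return None  # too short to hold both MACs plus a tag
    # Bytes 12-13 hold the EtherType/TPID, bytes 14-15 the tag control info.
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != 0x8100:
        return None
    return tci >> 13, (tci >> 12) & 0x1, tci & 0x0FFF


if __name__ == "__main__":
    dst = bytes.fromhex("ffffffffffff")
    src = bytes.fromhex("02e04c000000")
    tagged = dst + src + struct.pack("!HH", 0x8100, 100) + bytes.fromhex("0800")
    untagged = dst + src + bytes.fromhex("0800") + bytes(4)
    print(vlan_tag(tagged))
    print(vlan_tag(untagged))
```

If frames that carry a tag on the sending side never show up at all on the receiving side, that would match the "discards all tagged packets" observation; if they arrive without the tag, the switch is stripping it instead.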