I've just pushed something to my realtek-next branch.
I've resolved this, which was related to SMP being enabled. This should of course work, but atm doesn't, so I've disabled all SMP stuff in the kernel config and will experiment with that next. What works? I don't know yet: I can ping and the SFP detects stuff, but I haven't tested much, as I was just too happy that it actually boots again and that I pinpointed the booting issue.
My next branch includes the latest versions of @plappermaul's clock and timer patches (I hope :D), but I haven't even checked whether they actually work.
Btw, I'm super interested in seeing how it behaves on rtl83xx, as this branch is an attempt at unifying the realtek target as much as possible.
P.S. On the initramfs image, I'm occasionally seeing watchdog timeouts. I usually let it boot just far enough that I can run uci set network.lan.proto=dhcp so that, when the dropbear keygen finishes and boot continues with network bring-up, it tries to get a DHCP lease. I doubt it's related, but I'm not seeing where this is coming from yet either (the logs are quite quiet ...), and it also happens with the static config (the out-of-the-box config).
To give a little more detail:
[ 40.712032] rtl83xx-switch switch@1b000000 lan4: configuring for phy/xgmii link mode
[ 40.720662] rtl93xx_phylink_mac_config port 3, mode 0, phy-mode: xgmii, speed -1, link 0
[ 40.729708] rtl93xx_phylink_mac_config SDS is -1
[ 40.735064] 8021q: adding VLAN 0 to HW filter on device lan4
[ 40.752387] switch: port 4(lan4) entered blocking state
[ 40.758206] switch: port 4(lan4) entered disabled state
[ 40.764881] device lan4 entered promiscuous mode
[ 40.780816] rtl83xx-switch switch@1b000000 lan5: configuring for phy/xgmii link mode
[ 40.789499] rtl93xx_phylink_mac_config port 4, mode 0, phy-mode: xgmii, speed -1, link 0
[ 40.798524] rtl93xx_phylink_mac_config SDS is -1
[ 40.804033] 8021q: adding VLAN 0 to HW filter on device lan5
[ 40.821403] switch: port 5(lan5) entered blocking state
[ 40.827220] switch: port 5(lan5) entered disabled state
[ 40.833896] device lan5 entered promiscuous mode
[ 69.387944] realtek-otto-watchdog 18003260.watchdog: phase 1 timeout
The port after which it times out varies, though: I've seen it time out after port 2, but also after port 7. For some reason, we stop enabling/setting up ports. I'm not even sure if this is the 'adding VLAN' step or something else. I can't even say whether this is related to master, birger's patches, plappermaul's patches or mine. I do know that 'before', with an older master and just birger's patches, everything seemed fine, so I'll go bisect a few things again first.
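Since the timeout lands after a different port on each boot, a small helper for comparing captured logs might be handy. This is only a rough sketch, assuming the log format shown above (a "[ seconds ]" timestamp, one "lanN: configuring for" line per port, and a watchdog line containing both "watchdog" and "timeout"). The function name summarize_boot_log is made up for illustration and is not part of any existing tool.

```python
import re

# Matches kernel log lines such as "[ 40.712032] some message"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")
# Matches the per-port line, e.g. "... lan4: configuring for phy/xgmii link mode"
PORT_RE = re.compile(r"\b(lan\d+): configuring for ")


def summarize_boot_log(text):
    """Return a summary of the first watchdog timeout, or None if there is none.

    Assumption: the log looks like the excerpt in this post. Other message
    formats are simply ignored.
    """
    ports = []      # ports seen in "configuring for" lines, in order
    last_ts = None  # timestamp of the last line before the timeout
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        ts, msg = float(match.group(1)), match.group(2)
        if "watchdog" in msg and "timeout" in msg:
            return {
                "ports": ports,
                "last_port": ports[-1] if ports else None,
                "last_activity": last_ts,
                "timeout_at": ts,
                # Quiet period between the last log line and the timeout
                "gap": ts - last_ts if last_ts is not None else None,
            }
        port = PORT_RE.search(msg)
        if port:
            ports.append(port.group(1))
        last_ts = ts
    return None
```

Feeding it a serial capture or dmesg output from a few failing boots would show which port was configured last and how long the log stayed quiet before the watchdog fired, which may make the bisect results easier to compare.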
In hindsight, while this should be fixed, it might be related to me disabling SMP, VPE and the CEVT timer stuff, so maybe that makes for an interesting 'bug-catcher'?
edit: So I've changed some things, but nothing relevant afaik, yet I've not been able to reproduce the issue today. If it crops up again later, I'll see if I can nail it down, but with it being so intermittent, that'll be fun.
So I narrowed it down a little. As reported (somewhere) earlier, we get a few sh: out of range errors during boot. These come from the /etc/board.d/02_network script when adding a network MAC address to the JSON in /etc/board.json, but after the board_config_dump. For some reason we get this error, though I have not yet pinpointed why. Not adding the MAC address for the LAN ports removes this error, but causes the board to time out (pretty much guaranteed), maybe due to duplicate MAC addresses? However, adding the LAN MAC addresses to the JSON, even manually instead of in that loop, causes that sh error. Not adding all of them again causes the watchdog timeout ...
Still no idea how it all connects together though.
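To at least check the duplicate-MAC guess, one could dump /etc/board.json from an affected boot and look for repeated addresses. Below is a minimal sketch of that check. It makes no assumption about where in the file the addresses live: it walks the whole JSON tree and collects every key named "macaddr", which I believe is the key name used there, but treat that as an assumption. The function names are hypothetical.

```python
import json
from collections import defaultdict


def find_macaddrs(node, path=""):
    """Yield (path, mac) for every "macaddr" string found anywhere in the tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            if key == "macaddr" and isinstance(value, str):
                # Normalise case so AA:BB.. and aa:bb.. compare equal
                yield child, value.lower()
            else:
                yield from find_macaddrs(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_macaddrs(item, f"{path}/{index}")


def duplicate_macs(board):
    """Map each MAC that occurs more than once to the JSON paths using it."""
    seen = defaultdict(list)
    for path, mac in find_macaddrs(board):
        seen[mac].append(path)
    return {mac: paths for mac, paths in seen.items() if len(paths) > 1}


def duplicate_macs_in_file(filename):
    """Convenience wrapper: load a JSON file and report duplicate MACs."""
    with open(filename) as handle:
        return duplicate_macs(json.load(handle))
```

Running duplicate_macs_in_file on a copy of the board.json taken from a boot that timed out would return an empty dict if every address is unique, which would rule that theory out rather than explain the sh error.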