Adding Support for Verizon CR1000A

That print is from the SSDK SFP driver, not the kernel one.

The SSDK one should already be enabled, but as far as I know it requires the SFP PHY ID and the PHY media type to be set in the DTS.

Yes, I found it there. But I only started getting it after I enabled kmod-sfp?

I'm looking at how exactly the DTS should look for SFP. I'd appreciate any hints to speed it up.

Update:

found this

and tried the dts

&switch {
	...
	qcom,port_phyinfo {
		port@4 {
			port_id = <5>;
			phy_address = <0x0f>;
			media-type = "sfp";
		};
	};
};

which gets parsed properly but still no probing

root@OpenWrt:/# dmesg | grep sfp
[    3.335894] ssdk_dt_parse_phy_info[728]:INFO:[PORT 5] media type is sfp
[    3.485465] sfp_phy_init[229]:INFO:sfp phy init for port 0x20!

same happens for media-type = "sfp_sgmii";

[    3.327359] ssdk_dt_parse_phy_info[732]:INFO:[PORT 5] media type sfp support sfp_sgmii
[    3.465766] sfp_phy_init[229]:INFO:sfp phy init for port 0x20!

can't find anything relevant to this in the code link above

That's the phy_address property.

I honestly can't tell you what to try; this is entirely untested territory.

understood, however the QCA SFP driver looks to be a standard Linux PHY driver, registered
here: https://git.codelinaro.org/clo/qsdk/oss/lklm/qca-ssdk/-/blob/NHSS.QSDK.12.1.5.r3/src/hsl/phy/sfp_phy.c#L126

but for some reason it is registered with phy_id = 0xaaaabbbb (https://git.codelinaro.org/clo/qsdk/oss/lklm/qca-ssdk/-/blob/NHSS.QSDK.12.1.5.r3/include/hsl/phy/hsl_phy.h#L603)

changing the phy_id to match one on MDIO

root@OpenWrt:/# mdio 9*
 DEV      PHY-ID  LINK
...
0x0f  0x02434771  down

actually makes it match and moves us one step further... and now it crashes in probe :)
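For anyone following along, here is a rough sketch of why the placeholder ID never binds. As far as I understand it, the kernel's generic PHY matching compares the driver's phy_id with the ID read from the device, after applying the driver's mask to both. The mask value below is an assumption (exact match); I have not checked what mask the SSDK driver actually registers with.

```python
def phy_id_matches(driver_id: int, driver_mask: int, device_id: int) -> bool:
    """Compare only the ID bits selected by the driver's mask."""
    return (driver_id & driver_mask) == (device_id & driver_mask)


# Assumption: an exact-match mask; the real SSDK driver mask may differ.
FULL_MASK = 0xFFFFFFFF

SSDK_SFP_PHY_ID = 0xAAAABBBB   # placeholder ID the SSDK SFP driver registers with
MDIO_REPORTED_ID = 0x02434771  # ID the mdio tool reports at address 0x0f

# Placeholder ID vs. what is actually on the bus: no match, so no probe.
print(phy_id_matches(SSDK_SFP_PHY_ID, FULL_MASK, MDIO_REPORTED_ID))
# Driver ID changed to the ID on the bus: match, so the probe callback runs.
print(phy_id_matches(MDIO_REPORTED_ID, FULL_MASK, MDIO_REPORTED_ID))
```

With a looser mask (say 0xfffffff0) the low revision bits would be ignored, which is how most in-tree PHY drivers cover several silicon revisions with one entry.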

the stack trace is rather confusing though. I'd expect it to contain sfp_phy_probe()
but it has sfp_data_set instead?

[    3.409859] aquantia_phy_api_ops_init[2241]:INFO:qca probe aquantia phy driver succeeded!
[    3.415620] sfp_phy_init[229]:INFO:sfp phy init for port 0x20!
[    3.429274] Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000448
[    3.429318] Mem abort info:
[    3.439507]   ESR = 0x0000000096000005
[    3.442184]   EC = 0x25: DABT (current EL), IL = 32 bits
[    3.446005]   SET = 0, FnV = 0
[    3.451481]   EA = 0, S1PTW = 0
[    3.454333]   FSC = 0x05: level 1 translation fault
[    3.457375] Data abort info:
[    3.462239]   ISV = 0, ISS = 0x00000005
[    3.465360]   CM = 0, WnR = 0
[    3.468927] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000043bf4000
[    3.472049] [0000000000000448] pgd=080000004391d003, p4d=080000004391d003, pud=080000004391d003, pmd=0000000000000000
[    3.478488] Internal error: Oops: 96000005 [#1] SMP
[    3.489061] Modules linked in: qca_ssdk(+) gpio_button_hotplug ext4 mbcache jbd2 aquantia hwmon crc32c_generic
[    3.493759] CPU: 2 PID: 465 Comm: kmodloader Not tainted 5.15.112 #0
[    3.503821] Hardware name: Verizon CR1000A (DT)
[    3.510327] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    3.514586] pc : sfp_data_set+0x9c/0x120 [qca_ssdk]
[    3.521525] lr : phy_probe+0x64/0x1b0
[    3.526384] sp : ffffffc009b935f0
[    3.530202] x29: ffffffc009b935f0 x28: ffffffc00092c010 x27: 0000000000000020
[    3.533508] x26: ffffffc0009304f0 x25: ffffffc00086d000 x24: ffffff80031b2968
[    3.540627] x23: 000000000000000d x22: ffffff80064d7800 x21: ffffffc000923b10
[    3.547745] x20: 0000000000000000 x19: ffffff80064d7800 x18: 0000000000000000
[    3.554863] x17: 0000000000000001 x16: 0000000000000001 x15: ffffffffffffffff
[    3.561981] x14: ffffff8002adc48a x13: 0066303a312d6f69 x12: 646d2e3030303039
[    3.569099] x11: 0000000000000003 x10: 0101010101010101 x9 : 0000000000000001
[    3.576217] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : 4240455144534d48
[    3.583335] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000004
[    3.590453] x2 : 0000000000000001 x1 : 000000000000000f x0 : ffffff80064d7800
[    3.597573] Call trace:
[    3.604681]  sfp_data_set+0x9c/0x120 [qca_ssdk]
[    3.606942]  phy_probe+0x64/0x1b0
[    3.611454]  really_probe.part.0+0x9c/0x30c
[    3.614927]  __driver_probe_device+0x98/0x170
[    3.618920]  driver_probe_device+0x44/0x120
[    3.623433]  __driver_attach+0x94/0x190
[    3.627425]  bus_for_each_dev+0x60/0xa0
[    3.631245]  driver_attach+0x24/0x30
[    3.635065]  bus_add_driver+0x108/0x1fc
[    3.638883]  driver_register+0x78/0x130
[    3.642443]  phy_driver_register+0x90/0xc4
[    3.646263]  sfp_phy_init+0xe8/0x130 [qca_ssdk]
[    3.650432]  ssdk_phy_driver_init+0xec/0x180 [qca_ssdk]
[    3.654860]  ssdk_init+0x5c/0xb0 [qca_ssdk]
[    3.660067]  init_module+0x318/0x1000 [qca_ssdk]
[    3.664234]  do_one_initcall+0x50/0x1c0
[    3.669093]  do_init_module+0x44/0x1d0
[    3.672652]  load_module+0x1ba4/0x23c0
[    3.676471]  __do_sys_init_module+0x190/0x250
[    3.680206]  __arm64_sys_init_module+0x1c/0x30
[    3.684633]  invoke_syscall.constprop.0+0x5c/0x104
[    3.688972]  do_el0_svc+0x6c/0x15c
[    3.693745]  el0_svc+0x18/0x54
[    3.697129]  el0t_64_sync_handler+0xe8/0x114
[    3.700171]  el0t_64_sync+0x184/0x188
[    3.704600] Code: a90153f3 aa0003f3 f9420814 b942e801 (39512280) 
[    3.708162] ---[ end trace 3a3263505cc4ab62 ]---
[    3.714235] Kernel panic - not syncing: Oops: Fatal exception
[    3.718924] SMP: stopping secondary CPUs
[    3.724568] Kernel Offset: disabled
[    3.728553] CPU features: 0x00000000,00000802
[    3.731768] Memory Limit: none
[    3.736281] Rebooting in 1 seconds..


Well, it actually does call sfp_phy_probe() but it crashed here because the priv pointer is null...
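The oops itself seems to back this up. Below is a small sketch that decodes the load instructions from the "Code:" line of the trace. It only handles the AArch64 "load, unsigned immediate offset" encoding, which is all that is needed here. Reading the 0x410 offset as the priv field of the PHY device is my inference from the context, not something I checked against the struct layout.

```python
def decode_load_unsigned_imm(insn: int):
    """Decode an AArch64 integer load with unsigned immediate offset.

    Returns (mnemonic, rt, rn, byte_offset), or None for any other encoding.
    Register 31 (sp) is not special-cased; it does not occur in this trace.
    """
    # Bits 29:24 == 0b111001 select the integer load/store unsigned-offset
    # class; opc (bits 23:22) == 0b01 means a zero-extending load.
    if (insn >> 24) & 0x3F != 0x39 or (insn >> 22) & 0x3 != 0x1:
        return None
    size = (insn >> 30) & 0x3          # log2 of the access width in bytes
    imm12 = (insn >> 10) & 0xFFF       # offset, scaled by the access width
    rn = (insn >> 5) & 0x1F            # base register
    rt = insn & 0x1F                   # destination register
    mnemonic = ("ldrb", "ldrh", "ldr", "ldr")[size]
    reg = "x" if size == 3 else "w"
    return mnemonic, f"{reg}{rt}", f"x{rn}", imm12 << size


# Words from the "Code:" line; the last one (in parentheses there) faulted.
CODE_WORDS = [0xA90153F3, 0xAA0003F3, 0xF9420814, 0xB942E801, 0x39512280]

for word in CODE_WORDS:
    decoded = decode_load_unsigned_imm(word)
    if decoded is None:
        print(f"{word:08x}: not a load with unsigned immediate offset")
    else:
        mnemonic, rt, rn, offset = decoded
        print(f"{word:08x}: {mnemonic} {rt}, [{rn}, #{offset:#x}]")
```

The third word loads x20 from x0 + 0x410, and the faulting word then reads a byte from x20 + 0x448. The register dump shows x20 = 0, so the access lands on address 0x448, exactly the faulting address in the oops. In other words, a pointer fetched from the device structure was NULL and got dereferenced with a field offset of 0x448.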

What's strange is that I do see it set here earlier (confirmed by debug logs)

Looks like the private state is not preserved, or gets lost?

It would probably be a good idea to check whether the vendor modified anything relating to the SFP PHY in the GPL dump.

Working on it; will take a while anyway

why are we not using the latest SSDK version, 12.4.r1? I see it's under active development.

Because clock parenting and rate setting broke with it.

I did experiment with it since it would allow dropping a decent number of patches

i guess i made some progress? (via a small hack for restoring priv pointer in probe method)

[    3.480704] aquantia_phy_api_ops_init[2241]:INFO:qca probe aquantia phy driver succeeded!
[    3.486288] sfp_phy_init[253]:INFO:sfp phy init for port 0x20!
...
[    3.597101] sfp_phy_probe[85]:INFO:sfp phy is probed!
...
[    5.317794] GMAC5(ffffff80030b98c0) Invalid MAC@ - using fe:7b:e4:81:54:97
[    5.321734] QCA SFP 90000.mdio-1:0f: attached PHY driver (mii_bus:phy_addr=90000.mdio-1:0f, irq=POLL)
^^^^ 
[    5.329183] nss-dp 3a003000.dp5-syn lan: Registered netdev lan(qcom-id:5)
[    5.337965] GMAC6(ffffff80030be8c0) Invalid MAC@ - using 8e:f1:c1:be:b5:cb
[    5.344723] Aquantia AQR113C 90000.mdio-1:08: FW 5.6, Build 7, Provisioning 1
[    5.356762] Aquantia AQR113C 90000.mdio-1:08: attached PHY driver (mii_bus:phy_addr=90000.mdio-1:08, irq=POLL)
[    5.359204] nss-dp 3a007000.dp6-syn wan: Registered netdev wan(qcom-id:6)
[    5.368563] **********************************************************

and ethtool reports

root@OpenWrt:/# ethtool lan
Settings for lan:
	Supported ports: [ FIBRE ]
	Supported link modes:   100baseT/Full
	                        1000baseT/Full
	                        10000baseT/Full
	                        2500baseT/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  100baseT/Full
	                        1000baseT/Full
	                        10000baseT/Full
	Advertised pause frame use: Symmetric Receive-only
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Unknown! (255)
	Auto-negotiation: on
	Port: Twisted Pair
	PHYAD: 15
	Transceiver: external
	MDI-X: Unknown
	Link detected: no

but it stopped registering RX packets...

also, why does mdio think the link is down? even for fully operational wan port?

root@OpenWrt:/# mdio 9*
 DEV      PHY-ID  LINK
0x08  0x00000000  down
0x0f  0x02434771  down

Mdio command is just interpreting the standard C22 registers.

Where does it take dev (0x8 and 0xf) from? 0x8 is correct for the AQR, but the SFP PHY sits on 0x04 according to the OEM DTS. Once I changed the address, the SSDK started to recognize the SFP correctly without any patches from my side.

But! The patch which uses phy-handle to connect to the PHY (instead of using Qualcomm's properties) fails to find the proper SFP PHY and crashes the kernel with a NULL pointer dereference when it tries to print the PHY device to the log.

If I comment the print statement it semi-works, actually:

It boots, it senses the link is up for that sfp PHY. It registers RX packet for each ping I send from remote host connected to any rtl9301 port. It even registers some TX? Not sure why and how.

However, the ethtool output is all messed up. Basically all it can report is that the link is up.

@robimarko how do I force max debugging level in ssdk and nss-dp?

I think that just increasing the kernel log level to 8 should work.

What do you mean where does the MDIO take the addr from?
It scans the MDIO bus from 0x0 to 0x20 and just reports if it gets a reply; it's doing the same C22 probe that the kernel does.
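To make that concrete, here is a minimal sketch of what such a C22 scan boils down to. The register numbers and the link bit are the standard Clause 22 ones; the "all ones means nobody answered" check mirrors what the kernel does when probing. The bus contents and the fake_read helper are made up for the demonstration, and I am assuming the mdio tool applies a similar filter. C22 has a 5-bit PHY address, so there are 32 possible addresses (0x00 through 0x1f).

```python
MII_BMSR = 0x01        # basic mode status register
MII_PHYSID1 = 0x02     # upper 16 bits of the PHY ID
MII_PHYSID2 = 0x03     # lower 16 bits of the PHY ID
BMSR_LSTATUS = 0x0004  # link status bit in BMSR


def scan_c22(read_reg, addresses=range(32)):
    """Probe each address; return {addr: (phy_id, link_up)} for responders."""
    found = {}
    for addr in addresses:
        phy_id = (read_reg(addr, MII_PHYSID1) << 16) | read_reg(addr, MII_PHYSID2)
        # An undriven bus reads back all ones, which means "no device here".
        if (phy_id & 0x1FFFFFFF) == 0x1FFFFFFF:
            continue
        link_up = bool(read_reg(addr, MII_BMSR) & BMSR_LSTATUS)
        found[addr] = (phy_id, link_up)
    return found


# Hypothetical register contents, for demonstration only.
FAKE_BUS = {
    0x0F: {MII_BMSR: 0x7949, MII_PHYSID1: 0x0243, MII_PHYSID2: 0x4771},
}


def fake_read(addr, reg):
    """Stand-in for a real MDIO read; silent addresses return 0xffff."""
    return FAKE_BUS.get(addr, {}).get(reg, 0xFFFF)


for addr, (phy_id, link_up) in scan_c22(fake_read).items():
    print(f"{addr:#04x}  {phy_id:#010x}  {'up' if link_up else 'down'}")
```

So the DEV column is simply every address that answered the ID read, and LINK is one bit of one C22 register. A device that does not implement those registers the standard way can show up with an odd ID or a "down" link even when it is passing traffic.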

So there is no reply from 0x4 (where the SFP sits, according to the OEM DTS), but there is a reply from 0xf, where nothing is supposed to be?

The SFP/RTL switch is probably not C22 compliant.

It's really hard to guess, especially since it's not an SFP at all; it seems the OEM is somehow abusing the SFP support for the RTL switch.
