Support for ClearFog GT 8K?

These patches are already in the mainline kernel, and their author sent a pull request to OpenWrt, but nobody seems to care about it.

These SoCs have a completely different PCIe driver, so that series is unrelated. From the looks of it, the only related thing is that Compex cards are not detected when the kernel does not trigger the reset signal.

Can't say much about Armada 8040/ClearFog GT-8K; maybe additional changes to the kernel DTS are needed, plus a U-Boot modification to leave the MPPs alone so the kernel can trigger the reset? I don't know much, since I don't own this device.

Thank you both for your inputs. I'll try to discuss this further with SolidRun and see what their input on this is. Not sure if they are willing to invest more time, if the issue only affects the Compex cards.

Reviving this thread, @Skeletor, your patches still work for the mainline. Are you planning a PR with them? I realize the pcie modification might not work as you desire, but I have your patches applied to the head of openwrt and they work. If you are not interested in a PR I am interested in trying.

Recently there has also been some activity in the SolidRun forum about this (https://community.solid-run.com/t/booting-openwrt-on-clearfog-gt8k/313/9). Not sure if you are the same guy. If yes, sorry for not responding earlier; if no, it might be worth combining the efforts.

Either way, glad to hear the patches work for you. Feel free to go ahead with the PR if you are up to it. I was intending to do it when I resumed work on the Clearfog this year, but I currently don't seem to be able to find the time to put everything together. If you can manage it would be great to get this merged.

Concerning the PCIe modifications: I managed to get this working as intended now (only device tree patches necessary). The issue was the configuration used for the Arm Trusted Firmware build. The Clearfog GT8K build instructions point to the macchiatobin configuration, but that one doesn't reset both PCIe slots properly.

I would assume you are not interested in using 2 PCIe slots, so I would suggest for the PR to use the standard (1xPCIe + 1xSATA) configuration. That should limit the changes to configuration files which would make for a nice and simple PR.
The information for how to get 2xPCIe working can be added to a wiki page or something once the PR is accepted by the community.

Hi, where can I find this info?

Currently on my hard drive at home, I didn't push them to any git repository :innocent:

I can provide the patches once I get back and have time to put the patchwork together.

Basically you need to:

  • Modify the DTS in U-Boot to switch the SERDES lane currently used for SATA to PCIe
  • Modify the ATF source code (add PCIe initialization code for the Clearfog GT8K for both PCIe slots)
  • Modify the DTS in the kernel to switch the SERDES lane currently used for SATA to PCIe

I've had some trouble with newer ATF / U-Boot versions (Marvell doesn't seem to properly maintain the ARMADA CPU range anymore), so I stuck to the ones from the SolidRun build instructions (https://solidrun.atlassian.net/wiki/spaces/developer/pages/287178828/A8040+U-Boot).
Note that the ATF build configuration a80x0_mcbin initializes 1 PCIe slot properly, so it serves as a good starting point.

Same guy, no worries on response times :slight_smile: I have had this board far too long for it not to be used. So I am motivated.

@mrbojangles: Saw you created the PR, good luck with getting it merged. Let me know here if you need some help - I get e-mail notifications for questions.

@adrian_dsl: I put together the patches required to get the SATA port reconfigured as PCIe: https://github.com/BenjKeiser/clearfog_gt8k_patches/tree/main .

Thanks,

I tried the patches but I'm running into some strange issues.

BootROM - 2.03
Starting CP-0 IOROM 1.07
Booting from SD 0 (0x29)
Found valid image at boot postion 0x000
lNOTICE:  Starting binary extension
NOTICE:  SVC: DEV ID: 8040, FREQ Mode: 0x1
NOTICE:  SVC: AVS work point changed from 0x29 to 0x29
mv_ddr: mv_ddr-devel-18.12.0-g618dadd-dirty (Jun 08 2022 - 22:54:31)
warning: mv_ddr4_dq_vref_calibration: subphy 7 vref tap 67 voltage noise
mv_ddr: completed successfully
NOTICE:  Cold boot
NOTICE:  Booting Trusted Firmware
NOTICE:  BL1: v1.5(release):1f8ca7e0-dirty (Marvell-devel-18.12.2)
NOTICE:  BL1: Built : 22:54:36, Jun  8 2022
NOTICE:  BL1: Booting BL2
NOTICE:  BL2: v1.5(release):1f8ca7e0-dirty (Marvell-devel-18.12.2)
NOTICE:  BL2: Built : 22:54:38, Jun  8 2022
BL2: Initiating SCP_BL2 transfer to SCP
NOTICE:  SCP_BL2 contains 5 concatenated images
NOTICE:  Skipping MSS CP3 related image
NOTICE:  Skipping MSS CP2 related image
NOTICE:  Load image to CP1 MSS AP0
NOTICE:  Loading MSS image from addr. 0x40269f4 Size 0x1cd8 to MSS at 0xf4280000
NOTICE:  Done
NOTICE:  Load image to CP0 MSS AP0
NOTICE:  Loading MSS image from addr. 0x40286cc Size 0x1cd8 to MSS at 0xf2280000
NOTICE:  Done
NOTICE:  Load image to AP0 MSS
NOTICE:  Loading MSS image from addr. 0x402a3a4 Size 0x5420 to MSS at 0xf0580000
NOTICE:  Done
NOTICE:  SCP Image doesn't contain PM firmware
NOTICE:  BL1: Booting BL31
lNOTICE:  MSS PM is not supported in this build
NOTICE:  BL31: v1.5(release):1f8ca7e0-dirty (Marvell-devel-18.12.2)
NOTICE:  BL31: Built : 22:54:40, Jun  8 2022
   
<debug_uart>


   
U-Boot 2022.04-dirty (Jun 08 2022 - 22:53:22 +0300)
   
DRAM:  4 GiB
Core:  55 devices, 20 uclasses, devicetree: separate
Comphy chip #0:
Comphy-0: PEX0
Comphy-1: UNCONNECTED
Comphy-2: SFI0
Comphy-3: UNCONNECTED
Comphy-4: USB3_HOST1
Comphy-5: UNCONNECTED
Comphy chip #1:
Comphy-0: PEX0
Comphy-1: UNCONNECTED
Comphy-2: USB3_HOST0
Comphy-3: SGMII1        1.25 Gbps
Comphy-4: UNCONNECTED
Comphy-5: SGMII2        3.125 Gbps
UTMI PHY 0 initialized to USB Host0
SATA link 0 timeout.
SATA link 1 timeout.
AHCI 0001.0000 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
flags: 64bit ncq led only pmp fbss pio slum part sxs
PCIE-0: Link up (Gen2-x1, Bus0)
PCI: Failed autoconfig bar 10
PCIE-2: Link up (Gen2-x1, Bus2)
PCI: Failed autoconfig bar 10
MMC:   sdhci@6e0000: 0, sdhci@780000: 1
Loading Environment from SPIFlash... SF: Detected w25q64cv with page size 256 Bytes, erase size 4 KiB, total 8 MiB
OK 
Model: ClearFog-GT-8K
Net:
Warning: mvpp2-0 (eth0) using random MAC address - 8a:60:a5:d6:21:93
eth0: mvpp2-0
Warning: mvpp2-4 (eth1) using random MAC address - 62:1c:49:ec:81:ce
, eth1: mvpp2-4
Warning: mvpp2-5 (eth2) using random MAC address - ee:e3:e0:08:68:72
, eth2: mvpp2-5
Hit any key to stop autoboot:  0
switch to partitions #0, OK
mmc1 is current device
Scanning mmc 1:1...
Found U-Boot script /boot.scr
"Synchronous Abort" handler, esr 0x96000046
elr: 000000000002b180 lr : 000000000002b15c (reloc)
elr: 000000007ff62180 lr : 000000007ff6215c
x0 : 0000000000000000 x1 : 0000000056190527
x2 : 00000000000f4215 x3 : 0000000004d00000
x4 : 0000000000000016 x5 : 000000007fb28aa8
x6 : 0000000000000016 x7 : 000000007ffcd3a8
x8 : 000000007fc424a0 x9 : 0000000000000008
x10: 0000000000000001 x11: 0000000000000006
x12: 0000000000007e29 x13: 0000000000000001
x14: 000000007fb21e80 x15: 000000007fb21e80
x16: 000000007ff61ccc x17: 000000007ffc7667
x18: 000000007fb26dc0 x19: 000000007fbab5c0
x20: 000000007fb1fd68 x21: 0000000000000000
x22: 00000000000f4215 x23: 0000000000000000
x24: 0000000000000020 x25: 0000000000000000
x26: 0000000000000008 x27: 0000000000000030
x28: 000000007ffc0000 x29: 000000007fb1fca0
   
Code: 540000e1 f9400661 b9402021 d5033fbf (b8206861)
Resetting CPU ...
   
resetting ...

Edit: solved by resetting env to default.
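
For anyone comparing their own capture: the Comphy lines in the log above are a quick check that the lane switch took effect, since both chips report lane 0 as PEX0. Below is a minimal shell sketch for pulling that value out of a saved UART capture. The function name comphy_lane_mode and the file name uart.log are made up for illustration, and the parsing assumes the U-Boot output format shown above.

```sh
# Print the mode U-Boot reported for one SERDES lane.
# Reads a boot log on stdin. $1 = Comphy chip number, $2 = lane number.
# (Hypothetical helper; assumes the "Comphy chip #N:" / "Comphy-N: MODE" layout above.)
comphy_lane_mode() {
    chip="$1"
    lane="$2"
    # UART captures often carry carriage returns, so strip them first.
    tr -d '\r' | awk -v chip="$chip" -v lane="$lane" '
        /^Comphy chip #/ {
            # Remember which chip the following lane lines belong to.
            sub(/^Comphy chip #/, "")
            sub(/:.*/, "")
            cur = $0
            next
        }
        cur == chip && $1 == ("Comphy-" lane ":") && !seen {
            print $2
            seen = 1
        }
    '
}
```

With the log above saved as uart.log, `comphy_lane_mode 1 0 < uart.log` prints PEX0. If the lane were still configured for SATA, a different mode name would show up there instead.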

Hey, I ran into the same issue with u-boot 2022.04. Did not have time to investigate so I reverted to the 2019 build. Good to hear resetting the env to default solves the issue!

This board is now supported upstream.

https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=36e46c3c131cb187e94df9bb4c1ef56e3376c268

However, I noticed this board is no longer listed for sale :frowning:

Hi, today I tried building from master, since this device was merged.
After adding the SATA-to-PCIe patch, the device hangs for me. In U-Boot all seems OK; I can see the device ID of the plugged-in card on both ports.

I'm trying the 5.15 kernel with this patch added to target/linux/mvebu/patches-5.15/

100-clearfog_gt-8k-dts-pcie1-linux.patch:

diff --git a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts
index 11b36c317511..d792a3b72718 100644
--- a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts
+++ b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts
@@ -222,11 +222,16 @@
 		marvell,function = "gpio";
 	};
 
-	cp0_wlan_disable_pins: wlan-disable-pins {
+	cp0_wlan0_disable_pins: wlan0-disable-pins {
 		marvell,pins = "mpp51";
 		marvell,function = "gpio";
 	};
 
+	cp0_wlan1_disable_pins: wlan1-disable-pins {
+		marvell,pins = "mpp52";
+		marvell,function = "gpio";
+	};
+
 	cp0_sdhci_pins: sdhci-pins {
 		marvell,pins = "mpp55", "mpp56", "mpp57", "mpp58", "mpp59",
 			       "mpp60", "mpp61";
@@ -236,7 +241,7 @@
 
 &cp0_pcie0 {
 	pinctrl-names = "default";
-	pinctrl-0 = <&cp0_pci0_reset_pins &cp0_wlan_disable_pins>;
+	pinctrl-0 = <&cp0_pci0_reset_pins &cp0_wlan0_disable_pins>;
 	reset-gpios = <&cp0_gpio2 0 GPIO_ACTIVE_LOW>;
 	phys = <&cp0_comphy0 0>;
 	phy-names = "cp0-pcie0-x1-phy";
@@ -342,14 +347,13 @@
 	};
 };
 
-&cp1_sata0 {
-	pinctrl-0 = <&cp0_pci1_reset_pins>;
+&cp1_pcie0 {
+	pinctrl-names = "default";
+	pinctrl-0 = <&cp0_pci1_reset_pins &cp0_wlan1_disable_pins>;
+	reset-gpios = <&cp0_gpio2 1 GPIO_ACTIVE_LOW>;
+	phys = <&cp1_comphy0 0>;
+	phy-names = "cp1-pcie0-x1-phy";
 	status = "okay";
-
-	sata-port@1 {
-		phys = <&cp1_comphy0 1>;
-		phy-names = "cp1-sata0-1-phy";
-	};
 };
 
 &cp1_mdio {

When building with the 5.10 kernel it works fine.

I've only done a build with the release tag, so kernel 5.10. Can you provide a boot log to see where it hangs?

Just to be sure, without the patch the device starts up ok with Kernel 5.15?

Yes, the device started ok on 5.15 without the patch. On 5.10 it works with the patch or without.

I don't have the exact log anymore, but it was hanging just as the wifi card on the second PCIe port was about to be initialized.

Thu Oct 13 13:38:06 2022 kern.info kernel: [    6.724690] mt7915e 0000:01:00.0: WM Firmware Version: ____000000, Build Time: 20211222184052
Thu Oct 13 13:38:06 2022 kern.info kernel: [    6.754574] mt7915e 0000:01:00.0: WA Firmware Version: DEV_000000, Build Time: 20211222184111
**here it would usually hang on 5.15 with the patch**
Thu Oct 13 13:38:06 2022 kern.info kernel: [    7.046316] mt7915e 0001:01:00.0: HW/SW Version: 0x8a108a10, Build Time: 20211222184017a
Thu Oct 13 13:38:06 2022 kern.info kernel: [    7.046316]
Thu Oct 13 13:38:06 2022 kern.info kernel: [    7.063911] mt7915e 0001:01:00.0: WM Firmware Version: ____000000, Build Time: 20211222184052
Thu Oct 13 13:38:06 2022 kern.info kernel: [    7.087087] mt7915e 0001:01:00.0: WA Firmware Version: DEV_000000, Build Time: 20211222184111
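
A quick way to confirm that both cards were probed is to list the distinct PCI addresses the driver logged. This is only a sketch: probed_pci_devices is a hypothetical helper name, and it assumes the log format shown above.

```sh
# Print the unique PCI addresses (domain:bus:device.function) that a driver
# mentioned in a kernel log read from stdin. $1 = driver name, e.g. mt7915e.
# (Hypothetical helper; assumes "<driver> dddd:bb:dd.f:" prefixes as in the log above.)
probed_pci_devices() {
    driver="$1"
    grep -o "${driver} [0-9a-f]\{4\}:[0-9a-f]\{2\}:[0-9a-f]\{2\}\.[0-7]" \
        | awk '{ print $2 }' \
        | sort -u
}
```

On a running system this could be fed from the system log, for example `logread | probed_pci_devices mt7915e`. For the working 5.10 log above it lists 0000:01:00.0 and 0001:01:00.0; on a boot that hangs at the marked point only the first address would be expected.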

Another issue I'm facing happens very often with two PCIe cards connected, and less often with only one card connected. When it happens, the device is unresponsive and the only way to recover is to cut power. I got this log on the UART.


[ 1613.607084] rcu:     1-...0: (3 ticks this GP) idle=7b2/1/0x4000000000000000 softirq=11591/11593 fqs=1050
[ 1613.616521] rcu:     3-...0: (3 ticks this GP) idle=602/1/0x4000000000000000 softirq=11114/11116 fqs=1050
[ 1613.625956]  (detected by 0, t=2102 jiffies, g=14753, q=42)
[ 1613.631554] Task dump for CPU 1:
[ 1613.634796] task:kworker/1:1     state:R  running task     stack:    0 pid: 5628 ppid:     2 flags:0x0000000a
[ 1613.644774] Workqueue: events dbs_work_handler
[ 1613.649239] Call trace:
[ 1613.651700]  __switch_to+0x9c/0xfc
[ 1613.655119]  process_one_work+0x1f0/0x380
[ 1613.659148]  worker_thread+0x70/0x4c4
[ 1613.662828]  kthread+0x120/0x124
[ 1613.666071]  ret_from_fork+0x10/0x20
[ 1613.669661] Task dump for CPU 3:
[ 1613.672903] task:kworker/3:6     state:R  running task     stack:    0 pid:  759 ppid:     2 flags:0x0000000a
[ 1613.682874] Workqueue: events_freezable_power_ thermal_zone_device_check
[ 1613.689604] Call trace:
[ 1613.692062]  __switch_to+0x9c/0xfc
[ 1613.695480]  0xffffff8100b46900
[ 1676.647648] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 1676.653606] rcu:     1-...0: (3 ticks this GP) idle=7b2/1/0x4000000000000000 softirq=11591/11593 fqs=4197
[ 1676.663043] rcu:     3-...0: (3 ticks this GP) idle=602/1/0x4000000000000000 softirq=11114/11116 fqs=4197
[ 1676.672478]  (detected by 0, t=8407 jiffies, g=14753, q=220)
[ 1676.678162] Task dump for CPU 1:
[ 1676.691382] task:kworker/1:1     state:R  running task     stack:    0 pid: 5628 ppid:     2 flags:0x0000000a
[ 1676.691382] Workqueue: events dbs_work_handler
[ 1676.695848] Call trace:
[ 1676.698309]  __switch_to+0x9c/0xfc
[ 1676.701728]  process_one_work+0x1f0/0x380
[ 1676.705756]  worker_thread+0x70/0x4c4
[ 1676.709435]  kthread+0x120/0x124
[ 1676.712678]  ret_from_fork+0x10/0x20
[ 1676.716267] Task dump for CPU 3:
[ 1676.719508] task:kworker/3:6     state:R  running task     stack:    0 pid:  759 ppid:     2 flags:0x0000000a
[ 1676.729479] Workqueue: events_freezable_power_ thermal_zone_device_check
[ 1676.736210] Call trace:
[ 1676.738668]  __switch_to+0x9c/0xfc
[ 1676.742086]  0xffffff8100b46900
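
In case it helps when comparing captures: the per-CPU lines ("1-...0:", "3-...0:") name the CPUs that RCU considers stalled, here CPUs 1 and 3 in both reports. A small shell sketch to extract them from a saved log is below. stalled_cpus is a made-up name, and the pattern assumes the per-CPU line layout seen in this log (CPU number, a dash, four flag characters, a colon).

```sh
# List the CPU numbers named in RCU stall reports found in a kernel log on stdin.
# (Hypothetical helper; matches lines such as "rcu:     1-...0: (3 ticks this GP) ...".)
stalled_cpus() {
    sed -n 's/.*rcu:[[:space:]]*\([0-9][0-9]*\)-....: (.*/\1/p' | sort -un
}
```

Run against the log above (for example `stalled_cpus < uart.log`, with uart.log being whatever file the capture was saved to), it prints 1 and 3, one per line.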

Hmm, I wonder if your issue stems from an incompatibility (e.g. power consumption) or an issue with your PCIe cards or their driver.

I'm using two Compex WLE600VX in my ClearFog and have no stability issues with them.
Yours seem to be very new and I didn't find much information about them from a quick google search...

Do you have any cards beside the MediaTek ones you could use for testing?

I can try a build on Testing with 5.15 kernel for my system, but it will take me some time... Currently very limited spare time for tinkering :frowning: .

The device does not seem stable for me.

I flashed OpenWrt 23.05.0 and ran some VPN software on it. But after a while the device hung, with no output on the TTL serial console, and I had to restart it.

I used echo "/root/core_dump" > /proc/sys/kernel/core_pattern to try to save a core_dump file, but it didn't work. I plan to use OpenLogger to save the kernel panic log, if there is one.
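
Note that core_pattern only controls where core dumps of crashing user-space processes are written, so it does not record a kernel hang. Capturing the serial console on a second machine is one alternative to a hardware logger. The sketch below timestamps each captured line so the time of the hang can be read off afterwards. stamp_lines is a hypothetical name, and /dev/ttyUSB0 in the usage note is an assumption about the USB serial adapter.

```sh
# Prefix every line read from stdin with the local time it was received.
# (Hypothetical helper for logging a UART capture on a host machine.)
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$line"
    done
}
```

Assuming the serial port has already been set to the console baud rate, something like `stamp_lines < /dev/ttyUSB0 >> uart.log` on the host keeps a running log that survives a hang of the board.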

Are there any tips on debugging this device?

You are probably having RCU stalls:
https://forum.openwrt.org/t/solidrun-clearfog-cn9130-pro/133271/17?u=a8040

I got the following error log with OpenLogger (nice tool):

[23615.271796] mmc1: Timeout waiting for hardware cmd interrupt.
[23615.277588] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[23615.284062] mmc1: sdhci: Sys addr:  0x00000010 | Version:  0x00000002
[23615.290540] mmc1: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
[23615.297016] mmc1: sdhci: Argument:  0xaaaa0000 | Trn mode: 0x00000033
[23615.303491] mmc1: sdhci: Present:   0x01ff0000 | Host ctl: 0x0000001f
[23615.309966] mmc1: sdhci: Power:     0x0000000f | Blk gap:  0x00000000
[23615.316441] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000407
[23615.322916] mmc1: sdhci: Timeout:   0x00000009 | Int stat: 0x00000001
[23615.329389] mmc1: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
[23615.335863] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[23615.342337] mmc1: sdhci: Caps:      0x35ee0099 | Caps_1:   0x0000af77
[23615.348812] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
[23615.355285] mmc1: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x003b377f
[23615.361759] mmc1: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x00000900
[23615.368233] mmc1: sdhci: Host ctl2: 0x00000000
[23615.372700] mmc1: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0000000100d30218
[23615.379873] mmc1: sdhci: ============================================
[23625.521051] mmc1: Timeout waiting for hardware cmd interrupt.
[23625.526834] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[23625.533308] mmc1: sdhci: Sys addr:  0x00000010 | Version:  0x00000002
[23625.539785] mmc1: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000000
[23625.546260] mmc1: sdhci: Argument:  0xaaaa0000 | Trn mode: 0x00000033
[23625.552735] mmc1: sdhci: Present:   0x01ff0000 | Host ctl: 0x0000001f
[23625.559210] mmc1: sdhci: Power:     0x0000000f | Blk gap:  0x00000000
[23625.565683] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000407
[23625.572157] mmc1: sdhci: Timeout:   0x00000009 | Int stat: 0x00000001
[23625.578631] mmc1: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
[23625.585105] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
[23625.591579] mmc1: sdhci: Caps:      0x35ee0099 | Caps_1:   0x0000af77
[23625.598053] mmc1: sdhci: Cmd:       0x00000d1a | Max curr: 0x00000000
[23625.604525] mmc1: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x003b377f
[23625.610999] mmc1: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x00000900
[23625.617472] mmc1: sdhci: Host ctl2: 0x00000000
[23625.621939] mmc1: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0000000100d30218
[23646.135023] mmc1: card aaaa removed

So I changed to another SD card and now it looks stable. (I can't use the onboard eMMC since there is no guide for it.)

Until now, I haven't seen a kernel error like "rcu: INFO: rcu_sched detected stalls on CPUs/tasks", so my issue is likely unrelated.
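
To tell the two failure modes in this thread apart in a saved log, a rough shell sketch like the one below may be enough. hang_signatures is a made-up name, and the two search strings are taken from the logs posted above; other kernels may word these messages differently.

```sh
# Report which of the two hang signatures from this thread appear in a log on stdin.
# Prints "rcu_stall" and/or "mmc_timeout"; prints nothing if neither is found.
# (Hypothetical helper; the patterns come from the logs quoted in this thread.)
hang_signatures() {
    log=$(cat)
    # Loose match so that "rcu_sched detected stalls" and similar variants are caught.
    printf '%s\n' "$log" | grep -q 'rcu.*detected stall' && echo rcu_stall
    printf '%s\n' "$log" | grep -q 'Timeout waiting for hardware cmd interrupt' && echo mmc_timeout
    return 0
}
```

For the SD card log above this prints only mmc_timeout, which matches the conclusion that the hang was not an RCU stall.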