Adding support for VRX518 (and maybe VRX320)

Maybe schedutil performs better than ondemand in your scenario?
I switched to it but didn't compare hard numbers; maybe give that one a try:
http://sprunge.us/7YMtq9
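
Independent of what the linked paste contains, switching governors usually comes down to writing the governor name into each CPU's cpufreq sysfs node. A minimal sketch, assuming the standard /sys/devices/system/cpu layout; set_governor is a hypothetical helper name, and its optional second argument exists only so it can be tried against a scratch directory:

```shell
# Write a governor name to every CPU's scaling_governor node.
# set_governor is a hypothetical helper, not an existing OpenWrt tool.
# $1: governor name (e.g. schedutil, ondemand, performance)
# $2: optional cpu sysfs root, defaults to the real one
set_governor() {
    gov=$1
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip CPUs without cpufreq support (or an unmatched glob)
        [ -e "$f" ] || continue
        echo "$gov" > "$f"
    done
}

# Usage on the router:
#   set_governor schedutil
```

The governor has to be available in the running kernel; cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors lists the ones that are.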

You were right. Switching to pppoe-wan works.

One more thing to add: I've noticed that "CPU packet steering" only seems to work sometimes (don't know exactly when). When I assign the SMP affinity manually I'm getting reliable results. Right now I'm doing it like this:

for i in /proc/irq/[0-9]*/edma_eth* ; do echo e > `dirname $i`/smp_affinity ; done

Performance is much better when I move the Ethernet interrupts away from CPU0 (the mask e selects CPUs 1 to 3).
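
For anyone wondering where the value e comes from: smp_affinity takes a hexadecimal bitmask with one bit per CPU. A small sketch for building such a mask from a list of CPU numbers; cpus_to_mask is a hypothetical helper name, not an existing tool:

```shell
# Build the hexadecimal smp_affinity mask for a list of CPU numbers.
# cpus_to_mask is a hypothetical helper for illustration only.
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        # Set the bit that corresponds to this CPU number
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

# Usage: everything except CPU0 on a four-core SoC
#   cpus_to_mask 1 2 3
```

Bits 1, 2 and 3 add up to 2 + 4 + 8 = 14, which is e in hexadecimal, so the call in the usage comment yields the same mask as the loop above.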

Mmmh, I wonder whether that will also work on xrx200. At bidirectionally saturating loads, SIRQ on CPU0 seems to max out that CPU, so a bit more leeway would be nice. I guess this is not the optimal thread to ask in, but there seem to be only two real candidates for moving here: either the DSL parts or the Ethernet parts.

root@OpenWrt:/proc/irq$ cat /proc/interrupts 
           CPU0       CPU1       
  7:  726954738  868310867      MIPS   7  timer
  8:   26687971   36637694      MIPS   0  IPI call
  9:    1638314    1429276      MIPS   1  IPI resched
 30:        762          0       icu  30  ath9k
 62:          0          0       icu  62  1e101000.usb, dwc2_hsotg:usb1
 63:  489776881          0       icu  63  mei_cpe
 72:  385694472          0       icu  72  xrx200_net_rx
 73:  678948949          0       icu  73  xrx200_net_tx
 96:  953141392          0       icu  96  ptm_mailbox_isr
112:       1381          0       icu 112  asc_tx
113:          0          0       icu 113  asc_rx
114:          0          0       icu 114  asc_err
126:          0          0       icu 126  gptu
127:          0          0       icu 127  gptu
128:          0          0       icu 128  gptu
129:          0          0       icu 129  gptu
130:          0          0       icu 130  gptu
131:          0          0       icu 131  gptu
144:   43175312          0       icu 144  ath10k_pci
161:          2          0       icu 161  ifx_pcie_rc0
ERR:          1

I guess I will try to move xrx200_net_rx and xrx200_net_tx. I bridge dsl0.7 with the WAN port on the switch, which I assume must be handled by xrx200_net_[r|t]x, so I will see whether that helps or not (and post the results in an xrx200 thread).
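
To avoid hard-coding IRQ numbers, they can be looked up by name from /proc/interrupts. A minimal sketch, assuming the handler name is the last column of the line as in the listing above; irq_by_name is a hypothetical helper, and it only reads, it does not change any affinity:

```shell
# Print the IRQ number(s) whose last column matches the given name exactly.
# Reads /proc/interrupts-style text on stdin.
# irq_by_name is a hypothetical helper for illustration only.
irq_by_name() {
    awk -v name="$1" '$NF == name { sub(/:$/, "", $1); print $1 }'
}

# Usage on the router:
#   irq_by_name xrx200_net_rx < /proc/interrupts
```

Shared interrupt lines that list several handlers separated by commas would only match on the last handler name with this simple comparison.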

I couldn't notice any performance increase after activating packet steering, so I installed irqbalance, with the following effect:

           CPU0       CPU1       CPU2       CPU3
 26:    2140793     786229     863447     667863     GIC-0  20 Level     arch_timer
 30:          3          0          0          0     GIC-0 270 Level     bam_dma
 31:          0          0          0          0     GIC-0 239 Level     bam_dma
 32:          2          0          0          0     GIC-0 139 Level     msm_serial0
 34:      45801          0          0          0     GIC-0 133 Level     bam_dma
 51:         20          0          0          0     GIC-0 200 Level     ath10k_ahb
 68:         21          0          0          0     GIC-0 201 Level     ath10k_ahb
 69:          6          0          0     466730     GIC-0  97 Edge      edma_eth_tx0
 70:         19          0          0          0     GIC-0  98 Edge      edma_eth_tx1
 71:          0          0          0          0     GIC-0  99 Edge      edma_eth_tx2
 72:          0          0          0          0     GIC-0 100 Edge      edma_eth_tx3
 73:          3          0          0     483618     GIC-0 101 Edge      edma_eth_tx4
 74:         20          0          0          0     GIC-0 102 Edge      edma_eth_tx5
 75:          0          0          0          0     GIC-0 103 Edge      edma_eth_tx6
 76:          0          0          0          0     GIC-0 104 Edge      edma_eth_tx7
 77:          8          0          0     854663     GIC-0 105 Edge      edma_eth_tx8
 78:         44          0          0          0     GIC-0 106 Edge      edma_eth_tx9
 79:          0          0          0          0     GIC-0 107 Edge      edma_eth_tx10
 80:          0          0          0          0     GIC-0 108 Edge      edma_eth_tx11
 81:          5          0          0     665160     GIC-0 109 Edge      edma_eth_tx12
 82:         36          0          0          0     GIC-0 110 Edge      edma_eth_tx13
 83:          0          0          0          0     GIC-0 111 Edge      edma_eth_tx14
 84:          0          0          0          0     GIC-0 112 Edge      edma_eth_tx15
 85:     393097          0          0          0     GIC-0 272 Edge      edma_eth_rx0
 87:         14          0     526154          0     GIC-0 274 Edge      edma_eth_rx2
 89:         38     284310          0          0     GIC-0 276 Edge      edma_eth_rx4
 91:         21          0          0     316021     GIC-0 278 Edge      edma_eth_rx6
102:          0          0          0          0   PCI-MSI   0 Edge      aerdrv
103:          0          0          0          0   msmgpio  42 Edge      keys
104:          0          0          0          0   msmgpio  41 Edge      keys
105:          0          0          0          0   msmgpio  43 Edge      keys
106:          0          0          0          0     GIC-0 164 Level     xhci-hcd:usb1
107:          1          0          0          0   PCI-MSI 524288 Edge      PTM SL
108:     125934          0          0          0   PCI-MSI 524289 Edge      mei_cpe
109:    1903630          0          0          0   PCI-MSI 524290 Edge      aca-txo
110:    2078391          0          0          0   PCI-MSI 524291 Edge      aca-rxo
IPI0:          0          0          0          0  CPU wakeup interrupts
IPI1:          0          0          0          0  Timer broadcast interrupts
IPI2:       1532       1607       1795       1621  Rescheduling interrupts
IPI3:     384021    2111880    1162563    1136134  Function call interrupts
IPI4:          0          0          0          0  CPU stop interrupts
IPI5:        576        175        255        236  IRQ work interrupts
IPI6:          0          0          0          0  completion interrupts

Could you post the output of cat /proc/interrupts for comparison, please?

Do you only set the scaling governor to performance, or do you set scaling_min_freq to 89600 too?

btw.:
Can I merge the vrx518 branch from Jan with the master branch of OpenWrt? If yes, can somebody tell me how?
Thanks!

Mmmh, trying to set the affinity for the Ethernet interrupt 72 to 2 (i.e. CPU1) resulted in an immediate lockup of my xrx200 and loss of access. This does not seem to be a way to spread out the load. End of the off-topic side thread.

I have seen similar behavior: my device locks up when I change the SMP affinity. @abajk suggested trying the following patch.

I guess I should build a version with that change and see how that works, and whether it actually helps spread the load between the two cores more equitably.

Do you know why? I thought the 7530 and 7520 were identical...?

They are, there was just an alternative and simpler way, see

I updated the branch for kernel 5.15, so it works with CONFIG_TESTING_KERNEL=y now.
Only minor testing done so far, but DSL is up and running.

OK, I will try it this weekend with a 7520.
Is there some info or even a quick step-by-step guide on how to use the latest development code from this thread?

I just pushed another update so the 7520 variant for u-boot is built.

With that, the 7530 install steps still apply; just use the 7520 u-boot binary instead (it will work with the 7530 build too, but you'll get a random MAC):
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=95b0c07a618fe5fd93a26931152ced483bba143b

Hello also down

Does the driver here for vrx518 (vr11) also potentially include support for vrx320 (vr10)?

As noted in the first post, a previous thread on a Netgear D7800 build stalled on DSL support. I would be curious whether the changes for vr11 could be ported to the vr10 drivers.

I'm interested in VDSL support for the Netgear D7800 so I can retire my HH5.

Here is another update including fixes and cleanup. (I already started working on this some time ago, so the latest changes by @dhewg are not included. But I think the only missing thing is the 7520 u-boot image, which shouldn't matter much as that only needs to be built once anyway.)

So, here is the list of changes:

  • Support for kernel 5.15

  • Info LED is now used for DSL status

  • MAC address for vectoring error reports is actually set (commands in the autoboot script require an nLine parameter with current drivers)

  • DSL connection can be started and stopped via /etc/init.d/dsl_control (Previously, there was no way to get the connection running again after the service was stopped unless the kernel modules were reloaded. This is because the drv_dsl_cpe_api driver is broken and fails to restart the autoboot thread once it has been stopped. This thread is needed to start the DSL connection. The daemon now avoids stopping the autoboot thread when it exits.)

  • Reported connection uptime is now accurate (this is a long-standing bug in the driver which also affected older modem generations, caused by cumulative rounding error)

  • Cleaned up data path code (includes removing debug code and a few other changes)

  • Fixed buffer management in transmit path (The previous code in sw_plat.c didn't consider that both the buffers allocated for internal transmit descriptors in ptm_tc.c/atm_tc.c and those allocated in sw_plat.c for sending move between the respective descriptors during operation. This resulted in a segmentation fault when unloading the vrx518_tc module, as sw_plat.c freed all buffers which it had previously allocated itself, but at this point some of these would be in the internal descriptors and already freed by ptm_tc.c/atm_tc.c.)

  • Moved all datapath patches from the WIP commit into the respective driver commits

  • Experimental NAPI support for data path (Seems to work so far, but hasn't seen much testing. Optionally it is possible to switch to threaded NAPI via /sys/class/net/dsl0/threaded. If you want to try a version without NAPI support, you can remove the two uppermost patches 202-napi.patch and 203-dbg.patch from the vrx518_tc driver.)
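
Switching between the two NAPI modes mentioned in the last item is a matter of writing 1 or 0 to that sysfs attribute. A minimal sketch, assuming the interface is called dsl0 as above; set_threaded_napi is a hypothetical helper name, and its optional third argument (an alternative sysfs root) exists only so it can be tried off-device:

```shell
# Enable (1) or disable (0) threaded NAPI for a network device.
# set_threaded_napi is a hypothetical helper for illustration only.
# $1: device name, $2: 0 or 1, $3: optional sysfs root (default /sys)
set_threaded_napi() {
    dev=$1
    value=$2
    root=${3:-/sys}
    attr="$root/class/net/$dev/threaded"
    [ -e "$attr" ] || { echo "no threaded attribute for $dev" >&2; return 1; }
    # Write the new mode, then read it back to confirm
    echo "$value" > "$attr" && cat "$attr"
}

# Usage on the router:
#   set_threaded_napi dsl0 1   # threaded NAPI
#   set_threaded_napi dsl0 0   # back to softirq-based NAPI
```

The setting is not persistent, so it would have to be reapplied after a reboot or a driver reload.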

The updated code is in my vrx518 branch.

Some of the remaining issues in functionality that I know of:

  • ADSL/ATM support. Not sure if anyone has already tested this, I don't have any way to do that myself.

  • Reports of broken transmit path on some devices (message dc_ep_clk_on failed in kernel log)

  • Switching between different firmware requires driver reload, unlike VR9 where /etc/init.d/dsl_control restart works (but this is a relatively minor issue)

And if this is ever going to be upstreamed in the future, there is also the issue of firmware licensing. For the DSL firmware, there is at least a license file (it looks like a standard license for device manufacturers, which may not be sufficient for OpenWrt). For the ACA and PPE firmware the situation is even worse, as there doesn't seem to be any license information at all. Of course, one could add a firmware downloader package similar to the vectoring firmware installer for VR9, but for VR11 that would mean an additional download is required even for basic operation of the modem.

Sounds like nice progress, I'll test drive it!

At first glance our 5.15 patches differ a little too. The only small thing worth mentioning is about ltq-vdsl-vr11-mei: There's another MODULE_SUPPORTED_DEVICE in src/test_internal/drv_test_mei_cpe_linux.c. I assume that gets compiled with the additional test package, but I didn't even try that.

I have to admit I only fixed whatever errors came up when building, so I didn't notice that. But I just did a build of the ltq-vdsl-vr11-mei-test package and it worked. It looks like that only builds a test application (src/mei_cpe_drv_test.c), and the kernel module within test_internal isn't built at all.

Ah okay, so that doesn't matter after all :wink: I also just looked at the errors and grep'ed around a little.

As for your latest update:
There wasn't any noticeable regression with non-threaded NAPI; I tested this for 2+ days. No issues, no dmesg entries afaict.
I enabled threaded NAPI yesterday, seems to work just fine too, no regressions either :wink:

I just pushed another small update to my branch. This includes just 2 changes:

  • Moved locking for interrupt mask changes to EP driver (cleans up the code a bit).

  • Fixed reported upstream MINEFTR value (Removed the multiplication by 1000 in the driver, which caused an overflow, as the EFTR_min value from the device seems to already be in bits/second. The same issue exists for VR9, so a separate patch for these devices is also needed. I'm not entirely sure if this is actually an issue in the Lantiq modem. In theory, this could also be caused by the other end sending data in the wrong format. But the only G.INP line I have for testing is my own DTAG line (Broadcom 194.26). So, if anyone tests this on a line with a different vendor ID, please report back with results. The MINEFTR value is reported by dsl_cpe_pipe.sh pmrtctg 1.)
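
The overflow from the last item can be reproduced with plain integer arithmetic. The sketch below rests on several assumptions: that the driver kept the product in an unsigned 32-bit variable (the post only says "overflow", not which width), that the shell does 64-bit arithmetic, and a made-up EFTR_min of 40 Mbit/s. times1000_u32 is a hypothetical helper name:

```shell
# Multiply a rate by 1000 and keep only the low 32 bits, mimicking what an
# unsigned 32-bit variable would hold (the width is an assumption here).
# times1000_u32 is a hypothetical helper for illustration only.
times1000_u32() {
    echo $(( ($1 * 1000) & 0xFFFFFFFF ))
}

# Usage with a hypothetical EFTR_min of 40 Mbit/s, already in bits/second:
#   times1000_u32 40000000
```

With that input the true product is 40000000000, but the call prints 1345294336, so the reported value no longer has any obvious relation to the real rate. Anything above roughly 4.29 Mbit/s would wrap under these assumptions.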

@dhewg It would be great if you could rebase your branch on the current master the next time you work on it: Support for the Fritz!7530 with the new NAND chip was just merged.

Thanks for all the hard work!
