Adding support for VRX518 (and maybe VRX320)

Hey mate,
I'm trying to overclock my 7530 too, but your link to the patch is not working anymore.

https://github.com/usakicha/openwrt/blob/main/999-ipq40xx-unlock-cpu-frequency.patch

thx man, and how do I load this file onto the router?

That is a patch, so you need to build your own firmware images and add that patch to the build setup.
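For anyone unfamiliar with the build process, here is a rough sketch of applying an extra kernel patch to a from-source build. The branch, patch path, and profile name are assumptions for a 7530 on 23.05 — adjust them for your tree (e.g. `patches-6.6` on main):

```shell
# clone the 23.05 branch and drop the patch into the target's kernel patch dir
git clone -b openwrt-23.05 https://github.com/openwrt/openwrt.git
cd openwrt
cp /path/to/999-ipq40xx-unlock-cpu-frequency.patch target/linux/ipq40xx/patches-5.15/

# pull in package feeds, pick the device profile, then build
./scripts/feeds update -a && ./scripts/feeds install -a
make menuconfig        # select the AVM FRITZ!Box 7530 profile
make -j"$(nproc)"      # images land in bin/targets/ipq40xx/generic/
```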

Ahh okay :thinking:

decided to try a new build; looks like VLANs work fine now with the 6.6 kernel, so that's cool

unrelated observations:
My build did have btrfs support included, but I pulled it when I noticed that for some reason there was a file size limit (2G+ files couldn't be accessed?). I included f2fs with compression support instead, just in case.
Also decided to try the Paragon NTFS driver instead of FUSE/ntfs-3g. The Paragon driver definitely wasn't good for a while; if you wrote more than one file at a time... ehhh.
I personally don't use the wireless on mine and do enable a fair amount of stuff, so I switched to the smallbuffers driver, though it's probably not necessary.
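As an aside, setting up an f2fs volume with compression takes two steps, since compression is opt-in per file or directory. A sketch, assuming the kernel was built with F2FS compression support and using `/dev/sda1` as a placeholder device:

```shell
# create the filesystem with the compression feature enabled
mkfs.f2fs -f -O extra_attr,compression /dev/sda1

# mount with a compression algorithm selected
mount -t f2fs -o compress_algorithm=lzo /dev/sda1 /mnt

# compression only applies to files/dirs flagged with chattr +c
mkdir -p /mnt/data && chattr +c /mnt/data
```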

Did you build with the "PCIe magic hack" patch (997-pcie-qcom-host-magic.patch) to fix the PPPoE bug?
If so, do you mind sharing your modified patch file? I'm struggling a bit to apply it to my build with OpenWrt 24.10.

yeah that one is an easy 'git merge' with https://github.com/Ansuel/openwrt/tree/ipq4019-hack-lantiq

I guess this is the patch tho https://github.com/Ansuel/openwrt/commit/d8fabd44aacb34669475ff841e623f753a52f9d2.patch

Separate note: it seems like using bridge VLAN filtering just isn't super reliable still, but making VLAN interfaces off individual ports and then putting them in separate bridges works fine
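For reference, the per-port approach could look something like this in `/etc/config/network`. This is a hypothetical fragment — bridge names, port names, and VLAN IDs are placeholders; with DSA, referencing `lanX.N` creates the 802.1q sub-interface on that port:

```
# each port gets its own VLAN sub-interface, and each bridge gets its own
# sub-interfaces instead of using one filtering bridge for everything
config device
        option name 'br-lan'
        option type 'bridge'
        list ports 'lan1.10'
        list ports 'lan2.10'

config device
        option name 'br-guest'
        option type 'bridge'
        list ports 'lan3.20'
        list ports 'lan4.20'
```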

Unfortunately, the code in that branch does not work. See the comments in the pull request: https://github.com/openwrt/openwrt/pull/15421

Ah well, doesn't affect any device I have, but I assume it pretty much keeps people with those devices on 5.15.x for now

I went back to 23.05 branch and 5.15 kernel. link
Added the 1508 MTU DSL patch (so you can set the MTU of the DSL interface to 1508 and the MTU of the PPPoE interface to 1500, and then not have to use the MSS fix, if the other side supports it) and OpenVPN with DCO.
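With that patch in place, the config could look roughly like this. A hypothetical `/etc/config/network` fragment — `dsl0` and the PPPoE credentials are placeholders, and whether your provider actually supports 1508-byte frames on the line is the part you have to verify:

```
# baby-jumbo frames: 1508 on the DSL device leaves room for the 8-byte
# PPPoE header, so the PPP payload keeps a full 1500-byte MTU
config device
        option name 'dsl0'
        option mtu '1508'

config interface 'wan'
        option device 'dsl0'
        option proto 'pppoe'
        option username 'user@provider.example'
        option password 'secret'
        option mtu '1500'
```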

Do you mind sharing a link to this patch?
Based on janh's info (I might have overlooked this before), I'm also sticking to 23.05.5.

But another question here: does anyone have an idea how to lower the high CPU load with IPv4 traffic? I guess it's due to the DS-Lite tunnel?

root@AVM7520:~# speedtest-netperf.sh -H "netperf-EU.bufferbloat.net"
2024-11-25 17:43:44 Starting speedtest for 60 seconds per transfer session.
Measure speed to netperf-EU.bufferbloat.net (IPv4) while pinging gstatic.com.
Download and upload sessions are sequential, each with 5 simultaneous streams.
............................................................
 Download:  98.58 Mbps
  Latency: [in msec, 61 pings, 0.00% packet loss]
      Min:   7.971
    10pct:  55.430
   Median:  69.711
      Avg:  70.629
    90pct:  87.245
      Max: 102.222
 CPU Load: [in % busy (avg +/- std dev) @ avg frequency, 55 samples]
     **cpu0:  96.6 +/-  0.0  @  896 MHz**
     cpu1:  23.6 +/-  5.2  @  896 MHz
     cpu2:  25.2 +/-  4.8  @  896 MHz
     cpu3:  24.2 +/-  5.2  @  896 MHz
 Overhead: [in % used of total CPU available]
  netperf:  20.1

Just take
https://git.openwrt.org/?p=openwrt/openwrt.git;a=commit;h=ca53f2d430ce1f9ff5a560291de1e93380963417
from main and put it in the 23.05 tree.
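One way to do that backport (a sketch — gitweb serves the commit as a patch when you swap `a=commit` for `a=patch`, and `git am` may need `-3` or a manual `patch -p1` if the context has drifted on 23.05):

```shell
cd openwrt
git checkout openwrt-23.05

# fetch the commit in patch form straight from gitweb
wget -O dsl-mtu.patch 'https://git.openwrt.org/?p=openwrt/openwrt.git;a=patch;h=ca53f2d430ce1f9ff5a560291de1e93380963417'

# apply it, keeping authorship; fall back to plain patch if it refuses
git am -3 dsl-mtu.patch || patch -p1 < dsl-mtu.patch
```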

Here is an updated magic patch for the current main branch: https://github.com/janh/openwrt/commit/59f52125178146a9dc44290c11c63e5e029e8044

I have confirmed that the PCIe vendor ID is changed and that the memory region containing the magic value is visible to the modem. So it should most likely fix the issue on affected devices, but I can't test that myself.

From your displayed statistics, you need to find a way to move load off cpu0 onto the other CPUs. The first things that come to mind are managing interrupt assignment (either by manual pinning or using irqbalance) and testing packet steering. Software flow offload probably won't help, but you could try it too.
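Packet steering is the cheapest one to test, since recent OpenWrt has a global toggle for it (it configures RPS/XPS under the hood). A sketch to run on the router:

```shell
# enable OpenWrt's built-in packet steering and reload the network config
uci set network.globals.packet_steering='1'
uci commit network
/etc/init.d/network restart
```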

irq affinity script (for any branch using DSA):

diff --git a/target/linux/ipq40xx/base-files/etc/init.d/set-irq-affinity b/target/linux/ipq40xx/base-files/etc/init.d/set-irq-affinity
new file mode 100755
index 0000000000..1e1a22cebf
--- /dev/null
+++ b/target/linux/ipq40xx/base-files/etc/init.d/set-irq-affinity
@@ -0,0 +1,20 @@
+#!/bin/sh /etc/rc.common
+
+START=99
+
+start() {
+        local mask=4
+        for irq in $(grep -F ath10k_ahb /proc/interrupts | cut -d: -f1 | sed 's, *,,')
+        do
+                echo "$mask" > "/proc/irq/$irq/smp_affinity"
+                [ $mask = 4 ] && mask=8
+        done
+
+        mask=1
+        for irq in $(grep -F c080000.ethernet /proc/interrupts | cut -d: -f1 | sed 's, *,,')
+        do
+                echo "$mask" > "/proc/irq/$irq/smp_affinity"
+                mask=$((mask << 1))
+                [ $mask = 16 ] && mask=1
+        done
+}
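If you don't want to bake that script into an image, you can also drop it onto a running router by hand. A sketch, assuming it was saved as `/etc/init.d/set-irq-affinity`:

```shell
# make it executable, register it for boot (START=99), and run it now
chmod +x /etc/init.d/set-irq-affinity
/etc/init.d/set-irq-affinity enable
/etc/init.d/set-irq-affinity start

# verify: interrupt counts should now grow in different CPU columns
grep -E 'ath10k_ahb|c080000\.ethernet' /proc/interrupts
```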

boost cpu clock patch (for 5.15 kernel / 23.05):

diff --git a/target/linux/ipq40xx/patches-5.15/9991-ipq40xx-improve_cpu_and_nand_clock.patch b/target/linux/ipq40xx/patches-5.15/9991-ipq40xx-improve_cpu_and_nand_clock.patch
new file mode 100644
index 0000000000..cb6d1a9fcd
--- /dev/null
+++ b/target/linux/ipq40xx/patches-5.15/9991-ipq40xx-improve_cpu_and_nand_clock.patch
@@ -0,0 +1,152 @@
+From: Oever González <software@notengobattery.com>
+Subject: [PATCH] ipq40xx: improve CPU and NAND clock
+Date: Fri, 6 Mar 2020 21:22:44 -0600
+
+This patch will match the values in the device tree for those found inside the
+OEM device tree and kernel source code and unlock all of the CPU operating
+points.
+
+Also, it will set the SPI NAND (the firmware memory) clock to 48MHz, which is
+the maximum for the SPI clock. The NAND chip itself supports up to 104MHz.
+
+Signed-off-by: Oever González <software@notengobattery.com>
+---
+--- a/arch/arm/boot/dts/qcom-ipq4018-ea6350v3.dts
++++ b/arch/arm/boot/dts/qcom-ipq4018-ea6350v3.dts
+@@ -244,7 +244,7 @@
+               status = "okay";
+               compatible = "spi-nand";
+               reg = <1>;
+-              spi-max-frequency = <24000000>;
++              spi-max-frequency = <48000000>;
+ 
+               partitions {
+                       compatible = "fixed-partitions";
+--- a/arch/arm/boot/dts/qcom-ipq4019.dtsi
++++ b/arch/arm/boot/dts/qcom-ipq4019.dtsi
+@@ -55,7 +55,7 @@
+                       reg = <0x0>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -69,7 +69,7 @@
+                       reg = <0x1>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -83,7 +83,7 @@
+                       reg = <0x2>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -97,7 +97,7 @@
+                       reg = <0x3>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -114,20 +114,72 @@
+ 
+               opp-48000000 {
+                       opp-hz = /bits/ 64 <48000000>;
+-                      clock-latency-ns = <256000>;
++                      clock-latency-ns = <100000>;
+               };
+               opp-200000000 {
+                       opp-hz = /bits/ 64 <200000000>;
+-                      clock-latency-ns = <256000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-384000000 {
++                      opp-hz = /bits/ 64 <384000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-413000000 {
++                      opp-hz = /bits/ 64 <413000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-448000000 {
++                      opp-hz = /bits/ 64 <448000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-488000000 {
++                      opp-hz = /bits/ 64 <488000000>;
++                      clock-latency-ns = <100000>;
+               };
+               opp-500000000 {
+                       opp-hz = /bits/ 64 <500000000>;
+-                      clock-latency-ns = <256000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-512000000 {
++                      opp-hz = /bits/ 64 <512000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-537000000 {
++                      opp-hz = /bits/ 64 <537000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-565000000 {
++                      opp-hz = /bits/ 64 <565000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-597000000 {
++                      opp-hz = /bits/ 64 <597000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-632000000 {
++                      opp-hz = /bits/ 64 <632000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-672000000 {
++                      opp-hz = /bits/ 64 <672000000>;
++                      clock-latency-ns = <100000>;
+               };
+               opp-716000000 {
+                       opp-hz = /bits/ 64 <716000000>;
+-                      clock-latency-ns = <256000>;
+-              };
++                      clock-latency-ns = <100000>;
++              };
++              opp-768000000 {
++                      opp-hz = /bits/ 64 <768000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-823000000 {
++                      opp-hz = /bits/ 64 <823000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-896000000 {
++                      opp-hz = /bits/ 64 <896000000>;
++                      clock-latency-ns = <100000>;
++              };
+       };
+ 
+       memory {
+--- a/drivers/clk/qcom/gcc-ipq4019.c
++++ b/drivers/clk/qcom/gcc-ipq4019.c
+@@ -579,6 +579,9 @@ static const struct freq_tbl ftbl_gcc_apps_clk[] = {
+       F(632000000, P_DDRPLLAPSS, 1, 0, 0),
+       F(672000000, P_DDRPLLAPSS, 1, 0, 0),
+       F(716000000, P_DDRPLLAPSS, 1, 0, 0),
++      F(768000000, P_DDRPLLAPSS, 1, 0, 0),
++      F(823000000, P_DDRPLLAPSS, 1, 0, 0),
++      F(896000000, P_DDRPLLAPSS, 1, 0, 0),
+       { }
+ };
+ 

cpu boost patch (for 6.6 kernel / 24 / mainline):

diff --git a/target/linux/ipq40xx/patches-6.6/9991-ipq40xx-improve_cpu_and_nand_clock.patch b/target/linux/ipq40xx/patches-6.6/9991-ipq40xx-improve_cpu_and_nand_clock.patch
new file mode 100644
index 0000000000..1d12013a4c
--- /dev/null
+++ b/target/linux/ipq40xx/patches-6.6/9991-ipq40xx-improve_cpu_and_nand_clock.patch
@@ -0,0 +1,152 @@
+From: Oever González <software@notengobattery.com>
+Subject: [PATCH] ipq40xx: improve CPU and NAND clock
+Date: Fri, 6 Mar 2020 21:22:44 -0600
+
+This patch will match the values in the device tree for those found inside the
+OEM device tree and kernel source code and unlock all of the CPU operating
+points.
+
+Also, it will set the SPI NAND (the firmware memory) clock to 48MHz, which is
+the maximum for the SPI clock. The NAND chip itself supports up to 104MHz.
+
+Signed-off-by: Oever González <software@notengobattery.com>
+---
+--- a/arch/arm/boot/dts/qcom/qcom-ipq4018-ea6350v3.dts
++++ b/arch/arm/boot/dts/qcom/qcom-ipq4018-ea6350v3.dts
+@@ -244,7 +244,7 @@
+               status = "okay";
+               compatible = "spi-nand";
+               reg = <1>;
+-              spi-max-frequency = <24000000>;
++              spi-max-frequency = <48000000>;
+ 
+               partitions {
+                       compatible = "fixed-partitions";
+--- a/arch/arm/boot/dts/qcom/qcom-ipq4019.dtsi
++++ b/arch/arm/boot/dts/qcom/qcom-ipq4019.dtsi
+@@ -55,7 +55,7 @@
+                       reg = <0x0>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -69,7 +69,7 @@
+                       reg = <0x1>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -83,7 +83,7 @@
+                       reg = <0x2>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -97,7 +97,7 @@
+                       reg = <0x3>;
+                       clocks = <&gcc GCC_APPS_CLK_SRC>;
+                       clock-frequency = <0>;
+-                      clock-latency = <256000>;
++                      clock-latency = <100000>;
+                       operating-points-v2 = <&cpu0_opp_table>;
+               };
+ 
+@@ -114,20 +114,72 @@
+ 
+               opp-48000000 {
+                       opp-hz = /bits/ 64 <48000000>;
+-                      clock-latency-ns = <256000>;
++                      clock-latency-ns = <100000>;
+               };
+               opp-200000000 {
+                       opp-hz = /bits/ 64 <200000000>;
+-                      clock-latency-ns = <256000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-384000000 {
++                      opp-hz = /bits/ 64 <384000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-413000000 {
++                      opp-hz = /bits/ 64 <413000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-448000000 {
++                      opp-hz = /bits/ 64 <448000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-488000000 {
++                      opp-hz = /bits/ 64 <488000000>;
++                      clock-latency-ns = <100000>;
+               };
+               opp-500000000 {
+                       opp-hz = /bits/ 64 <500000000>;
+-                      clock-latency-ns = <256000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-512000000 {
++                      opp-hz = /bits/ 64 <512000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-537000000 {
++                      opp-hz = /bits/ 64 <537000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-565000000 {
++                      opp-hz = /bits/ 64 <565000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-597000000 {
++                      opp-hz = /bits/ 64 <597000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-632000000 {
++                      opp-hz = /bits/ 64 <632000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-672000000 {
++                      opp-hz = /bits/ 64 <672000000>;
++                      clock-latency-ns = <100000>;
+               };
+               opp-716000000 {
+                       opp-hz = /bits/ 64 <716000000>;
+-                      clock-latency-ns = <256000>;
+-              };
++                      clock-latency-ns = <100000>;
++              };
++              opp-768000000 {
++                      opp-hz = /bits/ 64 <768000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-823000000 {
++                      opp-hz = /bits/ 64 <823000000>;
++                      clock-latency-ns = <100000>;
++              };
++              opp-896000000 {
++                      opp-hz = /bits/ 64 <896000000>;
++                      clock-latency-ns = <100000>;
++              };
+       };
+ 
+       memory {
+--- a/drivers/clk/qcom/gcc-ipq4019.c
++++ b/drivers/clk/qcom/gcc-ipq4019.c
+@@ -579,6 +579,9 @@ static const struct freq_tbl ftbl_gcc_apps_clk[] = {
+       F(632000000, P_DDRPLLAPSS, 1, 0, 0),
+       F(672000000, P_DDRPLLAPSS, 1, 0, 0),
+       F(716000000, P_DDRPLLAPSS, 1, 0, 0),
++      F(768000000, P_DDRPLLAPSS, 1, 0, 0),
++      F(823000000, P_DDRPLLAPSS, 1, 0, 0),
++      F(896000000, P_DDRPLLAPSS, 1, 0, 0),
+       { }
+ };
+ 

Thanks to you both for the input.
Unfortunately I already added the CPU clock patch and I'm using irqbalance.
I guess there is no really good way to speed up DS-Lite, but I'll have to live with it for another year.

You will need to understand what exactly is generating each CPU's load - installing htop might help with that.

While irqbalance tries to balance IRQ load across CPUs, I don't think it tries to actually balance CPU load, so if cpu0 is carrying IRQ load as well as other process load, it may require manual IRQ pinning (though not all IRQs can be shifted off cpu0 :frowning_face:). At the very least I'd try to shift all movable IRQs to CPUs 1-3, which is a bit different from what wilsonyan's script does.
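A minimal sketch of that "everything movable goes to CPUs 1-3" approach — the `cpus_to_mask` helper is my own illustration, not an existing tool, and IRQs that can't be moved just fail the write silently:

```shell
# helper: turn a list of CPU numbers into the hex mask smp_affinity expects
cpus_to_mask() {
    mask=0
    for cpu in "$@"; do
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
}

# pin every movable IRQ to CPUs 1-3 (mask 0xe), keeping cpu0 free for the
# IRQs that cannot be moved; unmovable ones reject the write, which we ignore
mask=$(cpus_to_mask 1 2 3)
for irq in /proc/irq/[0-9]*; do
    { echo "$mask" > "$irq/smp_affinity"; } 2>/dev/null || true
done
```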

maybe WireGuard is even faster; it will spread over the CPUs, and it's not like you can't use it to tunnel IPv4 over IPv6
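That setup could look roughly like this in `/etc/config/network` — a hypothetical fragment where the keys, addresses, and endpoint are all placeholders, and the peer's IPv6 address is what makes the IPv4 traffic (`allowed_ips 0.0.0.0/0`) ride over the v6 tunnel:

```
config interface 'wg0'
        option proto 'wireguard'
        option private_key 'YOUR_PRIVATE_KEY'
        list addresses '10.0.0.2/24'

config wireguard_wg0
        option public_key 'PEER_PUBLIC_KEY'
        option endpoint_host '2001:db8::1'
        option endpoint_port '51820'
        list allowed_ips '0.0.0.0/0'
        option route_allowed_ips '1'
        option persistent_keepalive '25'
```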

It's not just faster, it's way faster than the ip6tnl/DS-Lite. No core goes over 20% load at 100 Mbit.

I'll look into this as well, but I think it might only get me from 99% to 90% :-).
So that is the way I'm heading, and I've already tried to tunnel IPv4 through a VPS. But there I get a higher ping, and some homepages block traffic from all my favourite hosters (netcup, Hetzner, OCI) :neutral_face:.
Maybe I just need to consider ditching the concept of a DSL router and switching to a modem + router.