[PR] Ipq806x: proposed code for the kernel 5.4 bump

If someone can review this and leave some comments, here is the PR.


̶H̶e̶l̶l̶o̶,̶ ̶i̶ ̶h̶a̶v̶e̶ ̶s̶t̶a̶r̶t̶e̶d̶ ̶p̶o̶r̶t̶i̶n̶g̶ ̶k̶e̶r̶n̶e̶l̶ ̶5̶.̶4̶ ̶t̶o̶ ̶i̶p̶q̶8̶0̶6̶x̶ ̶t̶a̶r̶g̶e̶t̶ ̶b̶u̶t̶ ̶i̶ ̶c̶a̶m̶e̶ ̶a̶c̶r̶o̶s̶s̶ ̶a̶ ̶v̶e̶r̶y̶ ̶b̶i̶g̶ ̶p̶r̶o̶b̶l̶e̶m̶ ̶t̶h̶a̶t̶ ̶i̶ ̶c̶a̶n̶'̶t̶ ̶r̶e̶a̶l̶l̶y̶ ̶m̶a̶n̶a̶g̶e̶ ̶t̶o̶ ̶s̶o̶l̶v̶e̶.̶ ̶F̶o̶r̶ ̶s̶o̶m̶e̶ ̶r̶e̶a̶s̶o̶n̶ ̶o̶n̶ ̶w̶i̶t̶h̶ ̶m̶y̶ ̶t̶e̶s̶t̶ ̶i̶m̶a̶g̶e̶,̶ ̶t̶h̶e̶r̶e̶ ̶i̶s̶ ̶n̶o̶t̶ ̶w̶i̶r̶e̶d̶ ̶c̶o̶n̶n̶e̶c̶t̶i̶o̶n̶.̶ ̶......

I'll need some time to get set up and oriented but here are a few initial thoughts.

You've mentioned that you've tried both the "dsa" and the "normal" (switchdev?) driver. Unless dsa is going to be the new default in 5.4 for these devices, I'd like to stay with switchdev - at least for troubleshooting this issue. I'm willing to experiment with dsa but it will take time. I'm not sure how your repo is set up in this regard, but I'm sure I'll find out.

Also you mention some kernel flags need to be added... is this done? I bring it up as I saw this which looks like a bunch of kernel flags related to dsa.

Assuming I can get a functioning build on the r7500v2 and reproduce the symptoms, my thought is to look for changes around netfilter (or other firewall-related code in the kernel).

HTH

I asked other devs whether they have any problems with 5.4 and they said everything works fine, so I think it is something target specific.
I tried dsa since it should be the upstream (and more compatible) version of the driver. To use the dsa driver, some changes need to be made to the dts of the device (if you want I can post them here...)
The dsa flags are already enabled, but since there is no support in the dts they don't get compiled and included in the final kernel.

About the kernel flags, I added some of them but left some out, as I think they should be included in the generic config file and not in the ipq one... (at compile time just press enter on all of them, as they are N by default; they also should not be related to this problem, at most they would cause a larger kernel)

Hm, on IPQ40xx there is also a problem with networking.
It works for a couple of seconds and then the kernel panics with a dst_release underflow.

So, I wouldn't say that 5.4 is perfect in its current state.

mhh yesterday I noticed a kernel panic with the switch driver... do ipq40xx and ipq806x share the same driver? (I think yes)

No, IPQ40xx uses a modified version of the QCA8337N whose register space is actually memory-mapped into the SoC.
So it acts differently: there is a modified qca8k for IPQ40xx as DSA, and a completely custom driver is what is currently used.

Do you maybe have a stack trace for the panic?
I think the dst_release underflow error is generic to OpenWrt under 5.4, as it can also be found by googling.
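
For anyone not familiar with the message, here is a rough userspace sketch of what a "dst_release underflow" means: a route cache entry is reference counted, and the warning fires when it is released more times than it was held. This is only a toy model in Python, not kernel code; the class and helper names (`DstEntry`, `simulate`) are made up for illustration and only loosely mirror the kernel's `dst_hold`/`dst_release`.

```python
class DstEntry:
    """Toy stand-in for a route cache entry: only the refcount is modelled."""

    def __init__(self):
        self.refcnt = 1         # the creator holds the first reference
        self.destroyed = False  # set once the last reference is dropped


def dst_hold(dst):
    """Take one extra reference."""
    dst.refcnt += 1


def dst_release(dst):
    """Drop one reference; return True if this release underflowed."""
    dst.refcnt -= 1
    if dst.refcnt < 0:
        # More releases than holds: somebody dropped a reference they never took.
        print(f"dst_release underflow: refcnt={dst.refcnt}")
        return True
    if dst.refcnt == 0:
        dst.destroyed = True
    return False


def simulate(holds, releases):
    """Hypothetical helper: apply N holds and M releases, report the outcome."""
    dst = DstEntry()
    for _ in range(holds):
        dst_hold(dst)
    underflowed = [dst_release(dst) for _ in range(releases)]
    return dst.refcnt, any(underflowed)
```

With one extra hold, two releases are balanced (`simulate(1, 2)`), while a third release (`simulate(1, 3)`) goes negative. A code path that releases an entry it never held would show up this way only once packets actually flow through it.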

I had this exact panic some time ago, but with a full build. Testing with fewer things included doesn't trigger it, so I think the cause is not the switch driver.

It's not the switch driver, as it's triggered by core networking drivers; it's gotta be an OpenWrt-related patch.
It only appears once traffic starts flowing.

Now that I think about it, it should be triggered on wifi initialization (as the switch driver is broken, wifi is the one that starts sending data). The strange thing is that other devs working on other targets don't have this...

@robimarko Try removing this

650-netfilter-add-xt_OFFLOAD-target.patch

Unfortunately, it's the same

it's the only change that the patches make to ip6 code o.O (checked with a quick grep)
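
If someone wants to repeat that check in a more structured way than a plain grep, something like the following could work. It is only a sketch: it assumes the patches are ordinary unified diffs with `+++ b/<path>` headers, the function names are made up, and the directory to scan is whatever patch directory your tree uses.

```python
import re
from pathlib import Path

# Matches the "+++ b/<path>" header line of a unified diff.
DIFF_TARGET = re.compile(r"^\+\+\+ b/(\S+)", re.MULTILINE)


def touched_files(patch_text):
    """Return the file paths a patch modifies, taken from its '+++ b/' headers."""
    return DIFF_TARGET.findall(patch_text)


def patches_touching(patch_dir, needles=("ipv6", "ip6")):
    """Map patch file name -> touched paths that contain any of the needles."""
    hits = {}
    for patch in sorted(Path(patch_dir).glob("*.patch")):
        text = patch.read_text(errors="replace")
        files = [f for f in touched_files(text) if any(n in f for n in needles)]
        if files:
            hits[patch.name] = files
    return hits
```

Calling `patches_touching()` on a patch directory lists each patch next to the IPv6-related files it modifies. Note that it looks at file paths only, so a patch that changes IPv6 behaviour from a file outside such paths would not be caught.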

I am sure that the true issue will be found as soon as 5.4 support is in relatively good shape.
It's weird that we are almost at v5.5 and OpenWrt development for v5.4 support is in this state.

actually... it's strange that we are using a really recent kernel in openwrt...

Not really, I did the migration to v4.19 for IPQ40xx and generic support for it was already done in RC releases.
I had the v4.19 PR as soon as generic support was merged, so this is rather slow compared to it.

also 647 looks VERY suspicious... If you have some time to waste, try removing both...

No help, still crashes as soon as traffic starts.
Also, I will finally get a VR2600V next week, so that I finally have some IPQ806x HW.
So far I have had a lot of IPQ40xx devices.

I think that to really troubleshoot your problem you should revert/remove ALL the netfilter patches and check whether it still crashes.

Will try it when I have time for it.
Thanks for your effort

@Ansuel It looks like others have faced this issue; Koen has now updated his v5.4 WIP commit to supposedly include the fix for the DST error.
https://git.openwrt.org/?p=openwrt/staging/xback.git;a=commit;h=184234529ed862f77a7016183d96d14fa54ce05e

  • Fix included from nbd regarding "dst leak in Flow Offload"

Unfortunately, it does not resolve the underflow panic