Last question: have you tried changing the policer from wrr to sp?
Did you by any chance enable HQoS?
HW flow offload should send packets in the order they’re received. It’s odd that traffic from a single UDP stream is sent out of order by the PPE.
My GL-MT6000 is running vanilla OpenWrt master builds and I have HW offload and WED enabled. I have to say that I’m not affected by HW offload at all, or at least I’m not aware of it happening. I have been using WiFi Calling on my iPhones without issues.
You mean for HQoS? Same result with or without it.
The out-of-order packets happen on a standard snapshot without any modification. This issue has been around for months, and I suspect it has been there ever since mtk_ppe came into use.
At least now I have found a workaround, which is to disable UDP HW offload.
The UDP out-of-order packet problem does not affect most people.
If you want to check whether you have the issue, just run:
iperf3 -c "host-on-different-subnet" -R -u -b 500M
I am 99.99% sure you have the same UDP out-of-order packet issue. It's just that you don't have a VoIP system or games that are affected by it. As such, most people don't even realize the problem exists.
link to udp out-of-order issues and daniel's reply. No solution was ever provided then. It was flagged on the bpi-r4 thread, but I have also tested this on the mt6000, and I believe it affects all mt7986 and mt7988 devices and potentially mt7981 as well.
7621 too, RT-AX54
Hi all! It's been a while since I've posted. I hope you're all having a good time.
I tried to catch up and I see there are a lot of new updates, especially about the Wi-Fi! (Which has been great so far, to be honest.)
I've been running r4.3.6 for some time. It's solid as a rock and I couldn't be happier.
What's the next "stable" build you recommend installing?
These builds are targeting main
So I guess I will get the new APK package manager. Is there anything I should do about this change? Is the migration seamless?
Thanks to pesa again for the amazing work you are doing!
Yup, I see it now:
% iperf3-darwin -c iperf3.moji.fr -R -u -b 500M
Connecting to host iperf3.moji.fr, port 5201
Reverse mode, remote host iperf3.moji.fr is sending
[ 7] local 2001:e68:5433:231f:b171:f2cc:33a2:cebe port 55729 connected to 2a06:c484:6::3:1 port 5201
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 7] 0.00-1.00 sec 54.3 MBytes 455 Mbits/sec 0.007 ms 3877/43977 (8.8%)
[ 7] 1.00-2.00 sec 54.4 MBytes 456 Mbits/sec 0.007 ms 3981/44151 (9%)
[ 7] 2.00-3.00 sec 55.9 MBytes 468 Mbits/sec 0.012 ms 2895/44140 (6.6%)
[ 7] 3.00-4.00 sec 55.1 MBytes 462 Mbits/sec 0.020 ms 3157/43872 (7.2%)
[ 7] 4.00-5.00 sec 53.6 MBytes 450 Mbits/sec 0.008 ms 4558/44161 (10%)
[ 7] 5.00-6.00 sec 55.4 MBytes 465 Mbits/sec 0.073 ms 2798/43702 (6.4%)
[ 7] 6.00-7.00 sec 53.5 MBytes 449 Mbits/sec 0.104 ms 4580/44112 (10%)
[ 7] 7.00-8.00 sec 51.3 MBytes 431 Mbits/sec 0.045 ms 6083/43998 (14%)
[ 7] 8.00-9.00 sec 54.5 MBytes 457 Mbits/sec 0.008 ms 3959/44203 (9%)
[ 7] 9.00-10.00 sec 54.7 MBytes 459 Mbits/sec 0.007 ms 3594/44008 (8.2%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 7] 0.00-10.21 sec 609 MBytes 500 Mbits/sec 0.000 ms 0/449441 (0%) sender
[SUM] 0.0-10.2 sec 10 datagrams received out-of-order
[ 7] 0.00-10.00 sec 543 MBytes 455 Mbits/sec 0.007 ms 39482/440324 (9%) receiver
iperf Done.
Above is done with a Mac connected via WiFi.
I have to say it doesn't affect me tho.
Interesting.
Edit:
In this case, though, I'm not sure whether it is due to the route from the source to my MT6000.
I have reproduced the issue by testing UDP forwarding between different subnets on my own internal network, which is the simplest way to flag it. As long as HW offload is enabled, UDP forwarding will result in out-of-order packets, plain and simple.
You can test against a host on a different subnet in your network.
Alternatively, switch to SW offload or disable offload and test with the same host that you just tested.
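For anyone wondering what "out-of-order" means in these counters, here is a minimal Python sketch of how a UDP receiver can classify datagrams by sequence number. It is loosely modelled on my understanding of how iperf3 does its bookkeeping, so treat the exact rules as an assumption rather than a description of iperf3 itself. The function name is made up for illustration.

```python
def count_udp_anomalies(seqs):
    """Return (out_of_order, lost) for received sequence numbers starting at 1."""
    highest = 0        # highest sequence number seen so far
    lost = 0           # datagrams currently presumed lost
    out_of_order = 0   # datagrams that arrived after a later one

    for seq in seqs:
        if seq > highest:
            # Every skipped number is provisionally counted as lost.
            lost += seq - highest - 1
            highest = seq
        else:
            # Late arrival: it was already counted as lost, so undo that.
            # Note that a duplicate would also land here in this simple model.
            out_of_order += 1
            if lost > 0:
                lost -= 1
    return out_of_order, lost


# One datagram (3) overtaken by its successor (4), nothing actually lost.
print(count_udp_anomalies([1, 2, 4, 3, 5]))
```

The point is that a reordered datagram is first seen as a gap and only reclassified once it shows up late, which is why reordering and loss figures tend to move together in short interval reports.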
Next stable will be 4.5.6.rss.mtk; it's in testing...
Can you open a bug on the OpenWrt GitHub? --> https://github.com/openwrt/openwrt/issues
Thanks
I'd rather not. IMHO the key dev has been made aware of the problem. Over the last few months I have gone through the loop of "are you using unmodified snapshot", "it does not affect me" and "udp delivery and sequence is not guaranteed" so many times. And when the problem is truly identified, everyone keeps quiet.
But thanks to @romanovj and @brada4, we have a way to disable UDP HW offload to circumvent the issue.
Anyone else, do feel free to do so.
Edit: I think I will leave this issue as it is. If anyone comes up with ideas to test and troubleshoot, I am all in.
cheers
I switched to software offload and rebooted my MT6000. I still see out-of-order packets when testing with the same Internet iperf3 server.
So in my case, out-of-order packets should be expected for Internet traffic, as packets may take different routes and some may be sent via another path if intermediate routers detect congestion.
I turned off software offload as well, but this time without rebooting my router, and tried again. I still see OoO packets even without any form of offload.
All these tests are done with WED enabled tho.
I don't think my data point will be useful in this case. I really have to test the router on a local network to remove all variability.
Not sure if we're encountering a Linux network bug tho.
Interesting nonetheless.
Test on a wired connection. WED adds tons of complication to the process.
On another note, you may have uncovered additional issues. I am not well informed on WED. If you really want to be sure, disable WED and do another test.
Ok. So I took out my Linksys E8450 and set it up as closely as I could to my MT6000. MT6000 is my main Internet router now so I'll leave it alone for now
Tested iperf3 via LAN to WAN ethernet link, with the WAN port connected to my Mac Mini M1, and LAN client using a Dell Latitude notebook. Dell notebook acting as iperf3 client and Mac Mini as iperf3 server.
I'm still seeing OoO packets even with software offload and with no offload, rebooting the E8450 between each change. I even turned off packet steering.
What I saw was that OoO is observed when the server is pushing out packets at 400Mbps or greater. And without offload, I see more OoO packets compared to with offload.
And I see more OoO when the server is asked to push higher bandwidth, e.g. -b 600M or greater.
In my case, it looks to me like the Linux kernel network stack is the issue here, maybe on top of the slower SoC CPU performance. It's probably not an issue with x86-64 CPUs doing the routing.
Edit:
Both my E8450 and MT6000 are running the same master tree pull build, running kernel v6.6.68.
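Since the OoO count seems to depend on the offered bitrate, a small sweep makes the threshold easier to pin down. Below is a rough Python sketch that only builds the iperf3 command lines for a range of -b values. The server address is the Mac Mini from my bench setup, and the bitrate steps, the 10 second duration and the function name are assumptions picked for illustration.

```python
import shlex


def iperf3_udp_sweep(server, bitrates=("300M", "400M", "500M", "600M"), seconds=10):
    """Build one reverse-mode UDP iperf3 command per target bitrate."""
    commands = []
    for rate in bitrates:
        # -R: server sends, -u: UDP, -b: target bitrate, -t: test duration
        commands.append(
            ["iperf3", "-c", server, "-R", "-u", "-b", rate, "-t", str(seconds)]
        )
    return commands


if __name__ == "__main__":
    # Print the commands so they can be run by hand, one offload setting at a time.
    for cmd in iperf3_udp_sweep("192.168.11.1"):
        print(shlex.join(cmd))
```

Running the same sweep once per offload setting (HW, SW, none) keeps the comparison like for like.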
Sorry, the best I can suggest is to eliminate external influence.
Set up an iperf3 host on another subnet on your internal network.
You are introducing too many variables all at the same time.
IMHO that is not how I would troubleshoot.
Just for kicks, here is a run with UDP HW offload disabled:
iperf3 -c iperf3.moji.fr -R -u -b 500M
Connecting to host iperf3.moji.fr, port 5201
Reverse mode, remote host iperf3.moji.fr is sending
[ 5] local 192.168.8.241 port 48214 connected to 45.147.210.189 port 5201
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 59.6 MBytes 500 Mbits/sec 0.010 ms 0/43181 (0%)
[ 5] 1.00-2.00 sec 59.6 MBytes 500 Mbits/sec 0.015 ms 0/43127 (0%)
[ 5] 2.00-3.00 sec 59.6 MBytes 500 Mbits/sec 0.012 ms 0/43195 (0%)
[ 5] 3.00-4.00 sec 59.6 MBytes 500 Mbits/sec 0.012 ms 0/43158 (0%)
[ 5] 4.00-5.00 sec 59.6 MBytes 500 Mbits/sec 0.011 ms 0/43163 (0%)
[ 5] 5.00-6.00 sec 59.6 MBytes 500 Mbits/sec 0.013 ms 0/43165 (0%)
[ 5] 6.00-7.00 sec 59.6 MBytes 500 Mbits/sec 0.009 ms 0/43162 (0%)
[ 5] 7.00-8.00 sec 59.6 MBytes 500 Mbits/sec 0.011 ms 0/43125 (0%)
[ 5] 8.00-9.00 sec 59.7 MBytes 500 Mbits/sec 0.052 ms 0/43207 (0%)
[ 5] 9.00-10.00 sec 59.6 MBytes 500 Mbits/sec 0.048 ms 0/43166 (0%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.19 sec 607 MBytes 500 Mbits/sec 0.000 ms 0/0 (0%) sender
[ 5] 0.00-10.00 sec 596 MBytes 500 Mbits/sec 0.048 ms 0/431649 (0%) receiver
iperf Done.
Below is the same test with default HW offload enabled for both TCP and UDP:
iperf3 -c iperf3.moji.fr -R -u -b 500M
Connecting to host iperf3.moji.fr, port 5201
Reverse mode, remote host iperf3.moji.fr is sending
[ 5] local 192.168.8.241 port 39917 connected to 45.147.210.189 port 5201
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 59.7 MBytes 500 Mbits/sec 0.016 ms 0/43237 (0%)
[ 5] 1.00-2.00 sec 59.5 MBytes 499 Mbits/sec 0.009 ms 31/43144 (0.072%)
[ 5] 2.00-3.00 sec 59.6 MBytes 500 Mbits/sec 0.029 ms 0/43164 (0%)
[ 5] 3.00-4.00 sec 59.6 MBytes 500 Mbits/sec 0.017 ms 0/43179 (0%)
[ 5] 4.00-5.00 sec 59.6 MBytes 500 Mbits/sec 0.028 ms 0/43145 (0%)
[ 5] 5.00-6.00 sec 59.6 MBytes 500 Mbits/sec 0.030 ms 0/43179 (0%)
[ 5] 6.00-7.00 sec 59.6 MBytes 500 Mbits/sec 0.028 ms 0/43148 (0%)
[ 5] 7.00-8.00 sec 59.6 MBytes 500 Mbits/sec 0.017 ms 0/43182 (0%)
[ 5] 8.00-9.00 sec 59.6 MBytes 500 Mbits/sec 0.034 ms 0/43145 (0%)
[ 5] 9.00-10.00 sec 59.6 MBytes 500 Mbits/sec 0.023 ms 0/43180 (0%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.19 sec 607 MBytes 500 Mbits/sec 0.000 ms 0/0 (0%) sender
[SUM] 0.0-10.2 sec 36 datagrams received out-of-order
[ 5] 0.00-10.00 sec 596 MBytes 500 Mbits/sec 0.023 ms 31/431703 (0.0072%) receiver
iperf Done.
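To compare runs like the two above without eyeballing them, the summary lines can be pulled out with a couple of regular expressions. This is a minimal Python sketch that assumes the text layout shown in this thread (a "datagrams received out-of-order" line and a "receiver" line with Lost/Total); other iperf3 versions may format things differently, and the function name is hypothetical.

```python
import re

# Matches e.g. "[SUM] 0.0-10.2 sec 36 datagrams received out-of-order"
OOO_RE = re.compile(r"(\d+) datagrams received out-of-order")
# Matches e.g. "... 31/431703 (0.0072%) receiver"
RECV_RE = re.compile(r"(\d+)/(\d+)\s+\([^)]*%\)\s+receiver")


def summarize(text):
    """Extract out-of-order, lost and total datagram counts from iperf3 text output."""
    ooo = OOO_RE.search(text)
    recv = RECV_RE.search(text)
    lost = int(recv.group(1)) if recv else None
    total = int(recv.group(2)) if recv else None
    return {
        # iperf3 omits the out-of-order line when the count is zero (as in the first run).
        "out_of_order": int(ooo.group(1)) if ooo else 0,
        "lost": lost,
        "total": total,
        "loss_pct": 100.0 * lost / total if total else None,
    }
```

Feeding it the saved output of each run gives numbers that can be diffed or logged while toggling offload settings.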
Actually, my test environment is as clean as it gets.
Dell (iperf client) (192.168.1.227) <-Ethernet-> E8450 (LAN IP - 192.168.1.1/24, WAN IP - 192.168.11.2/24) <-Ethernet-> Mac Mini (iperf server) (192.168.11.1/24)
My home network is on a totally different subnet. E8450 is not connected to my home network.
I don't know why you think there's anything external or additional variables affecting the results.
Tests were done with offloads (s/w & h/w) turned on and off, rebooting the E8450 between each test.
Still see OoO UDP packets.
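As a sanity check that the bench traffic really is routed by the E8450 and not switched, the addresses from the topology above can be compared with Python's ipaddress module. A small sketch follows; the /24 mask on the Dell is my assumption (only the router LAN and the Mac Mini masks are stated above), and the function name is made up.

```python
import ipaddress


def is_routed(client_if, server_if):
    """Return True when the two interfaces sit in different IP networks."""
    client = ipaddress.ip_interface(client_if)
    server = ipaddress.ip_interface(server_if)
    return client.network != server.network


# Dell client (mask assumed to be /24) versus Mac Mini iperf3 server.
print(is_routed("192.168.1.227/24", "192.168.11.1/24"))  # True

# The Dell and the E8450 LAN address share a network, so that hop is not routed.
print(is_routed("192.168.1.227/24", "192.168.1.1/24"))   # False
```

Different networks on each side means every iperf3 datagram has to cross the router's forwarding path, which is the path under test.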
sorry no idea,
Apologies if I come across as rude, which I'm not trying to ...
What I'm trying to say is that we could be seeing a Linux kernel bug, rather than an issue with the MediaTek H/W offload engine.
In any case, this is a good exercise to rebuild my Linux networking muscle memory. Heh heh. Time to dig into the kernel networking code again.
Quick update on the packets OoO issue.
I took out my ipq806x Askey RT4230W Rev6 router, compiled a build with the same source tree as my E8450 (i.e. with kernel v6.6.68) and did the LAN-WAN iperf3 test again. The idea is to see if I can see the same OoO packets.
Well, I did not see any OoO packets.
Unfortunately the Askey router doesn't have enough grunt to go past 580Mbps. I could only test up to that limit (with software flow offload turned on.)
With this, I conclude that we are not seeing a Linux network stack bug. Most likely this is a mt7530/1 switch driver bug for the MT6000/E8450.
Think I'll start with the switch driver. A lot easier. Heh heh.
Another interesting observation is that when I run iperf3 without the reverse flag (i.e. without '-R', which means data flowing from client to server) I see packets OoO at the server end for both E8450 and Askey RT4230W. I wasn't expecting that.
More head scratching ... hmmm ...