I’m running into a strange issue with SQM and can’t figure out what’s wrong.
Setup:
ISP: 1 Gbps / 100 Mbps fiber
Hardware: Nokia ONT (from ISP) → Cat6e → GL.iNet Flint 3 router (PPPoE)
LAN: 10.0.0.0/24, no VLANs (wired and wireless devices are in the same subnet)
What works:
SQM off → everything works normally (LAN + Wi-Fi).
SQM on → LAN devices work perfectly, performance improves as expected.
Wi-Fi clients connect fine and get IPs from DHCP.
Wi-Fi clients can reach LAN devices (e.g. Android phone can ping my PC).
DNS resolution works (AdGuard Home responds to queries from Wi-Fi devices).
What doesn’t work (with SQM enabled):
When opening a browser on a Wi-Fi client and trying to load a website (e.g. google.com), the request hangs and no response comes back.
Disabling SQM immediately fixes the issue.
SQM config:
config queue
    option enabled '1'
    option interface 'eth0' # WAN port connected to the Nokia ONT
    option download '870000'
    option upload '95000'
    option debug_logging '0'
    option verbosity '5'
    option qdisc 'cake'
    option script 'piece_of_cake.qos'
    option linklayer 'none'
Has anyone seen something like this before? Could it be related to the PPPoE setup, the Flint 3 hardware, or a misconfigured SQM interface?
It appears you are using firmware that is not from the official OpenWrt project.
When using forks/offshoots/vendor-specific builds that are "based on OpenWrt", there may be many differences compared to the official versions (hosted by OpenWrt.org). Some of these customizations may fundamentally change the way that OpenWrt works. You might need help from people with specific/specialized knowledge about the firmware you are using, so it is possible that advice you get here may not be useful.
Ask for help from the maintainer(s) or user community of the specific firmware that you are using.
Provide the source code for the firmware so that users on this forum can understand how your firmware works (OpenWrt forum users are volunteers, so somebody might look at the code if they have time and are interested in your issue).
If you believe that this specific issue is common to generic/official OpenWrt and/or the maintainers of your build have indicated as such, please feel free to clarify.
Yes, good point, it is well possible that the ingress IFB does not take hold. I hope to switch to an OpenWrt 24-based release on my router soon, then I can try to explore the options. At the very least, sqm-scripts should give a verbose, preferably actionable, error message for this case.
Mind you, I hope that tc -s qdisc might show something here...
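A minimal sketch of that check, assuming SQM is attached to eth0 as in the posted config. sqm-scripts normally names the ingress IFB ifb4<interface>, so ifb4eth0 here, though a vendor build might do this differently. The helper name show_sqm_qdiscs is made up for illustration.

```shell
#!/bin/sh
# Print qdisc statistics for the SQM interface and its ingress IFB.
# Assumption: the IFB follows the sqm-scripts naming ifb4<interface>.
show_sqm_qdiscs() {
    iface="${1:-eth0}"
    for dev in "$iface" "ifb4$iface"; do
        echo "== $dev =="
        # A missing device or missing tc binary is reported, not fatal.
        tc -s qdisc show dev "$dev" 2>&1 || echo "no qdisc statistics for $dev"
    done
}

show_sqm_qdiscs eth0
```

If cake shows up on eth0 but ifb4eth0 is missing, or its packet counters stay at zero during a download, that would suggest ingress shaping is not taking effect.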
Thanks. So SQM does actually work, but only from LAN, and in a capacity test you get the expected numbers, around 870/95?
Then this indeed looks more like a GL.iNet case than a normal SQM issue... if I had to bet, I would guess that there is some special accelerator in use (e.g. between Wi-Fi and Ethernet) that is not fully compatible with sqm-scripts.
That is more like it
I note that most speedtests tend to report overly large numbers. They typically fail to account properly for the true throughput of several parallel flows; in particular, they do not take retransmissions into account. As a result, the measurement windows the speedtest uses do not match the actual time window in which the data was received. The test then reports the maximum (or a similarly optimistic statistic) over all of its arbitrary measurement windows. That is technically not completely incorrect, but it is a somewhat inflated correlate of the true sustained rate the link is capable of, not that rate itself.
Anyway, in light of that, getting 818/93 or 824/91 seems to be in the right ballpark.
That confirms that SQM works for LAN, but says little about Wi-Fi.
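To illustrate the point about optimistic speedtest statistics, here is a rough sketch that compares the mean over all measurement windows with the best single window. The helper name summarize_windows and the sample rates are made up for illustration; they are not measurements from this thread, and real speedtests compute their headline number in their own ways.

```shell
#!/bin/sh
# Compare the mean over all measurement windows with the best single window.
# Input: space-separated per-window rates in Mbit/s.
summarize_windows() {
    echo "$1" | awk '{
        sum = 0; max = 0
        for (i = 1; i <= NF; i++) { sum += $i; if ($i > max) max = $i }
        printf "mean=%.0f max=%.0f\n", sum / NF, max
    }'
}

# Made-up sample values, not measurements from this thread.
summarize_windows "780 905 760 890 770"
```

With these sample values the best window is noticeably higher than the mean, which is the kind of gap a "report the maximum" statistic would hide.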
GL.iNet has confirmed that this is a known conflict:
“This SQM conflict is caused by hardware acceleration. Please confirm that hardware acceleration is disabled.”
After disabling hardware acceleration, Wi-Fi clients regained internet access. However, with SQM enabled, my speed tests now drop to around 350–500 Mbps download and 80–90 Mbps upload.
I would try to enable irqbalance and receive packet steering (likely on all CPUs). This might simply be a case of an overloaded CPU now that the accelerators are no longer helping.
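A hedged sketch of how that could look from the shell, assuming the stock OpenWrt UCI names (irqbalance.irqbalance and network.globals) and that the irqbalance package is installed; a GL.iNet build may use different names or expose these settings elsewhere. The packet_steering value '2' is assumed to mean "all CPUs" on recent OpenWrt releases, while older builds may only accept '1'. The script is a dry run by default and only prints the commands; run and cpu_spreading are illustrative helper names.

```shell
#!/bin/sh
# Dry run by default: prints the commands. Set APPLY=1 on the router to execute them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi
}

cpu_spreading() {
    # irqbalance (assumes the irqbalance package is installed).
    run uci set irqbalance.irqbalance.enabled='1'
    run uci commit irqbalance
    run /etc/init.d/irqbalance enable
    run /etc/init.d/irqbalance restart
    # Packet steering: '2' is assumed to mean "all CPUs"; older builds may only accept '1'.
    run uci set network.globals.packet_steering='2'
    run uci commit network
    run /etc/init.d/network reload
}

cpu_spreading
```

After applying, watching per-core load (for example with top or htop) during a speedtest should show whether a single core is still pegged.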