I did use the 40 bytes of overhead, but I still get occasional bufferbloat spikes. Are there any more options that I should turn on for my DSL line? Let me mention that I get better bufferbloat results (fewer massive spikes) if I use None for link layer adaptation, even though my line is indeed ATM enabled.
Ermm, could you repeat the ping_collector step with SWEEP_N_ATM_CELLS=4? It would be really helpful to see more than one step transition in the data. (The default should be SWEEP_N_ATM_CELLS=3, so I am puzzled why we only see this little amount of data; could be a change to the script or a bug.)
Ah, I see you used the Windows version of the collector script, which sees very little testing (I only have Windows in a VM and I rarely start that up); I guess I should look inside that script again to figure out why it only seems to have tested around 50 bytes instead of 48*3 = 144. About the size of the files, no real idea either. I would guess that a longer runtime should result in larger files, but without seeing the files I can only speculate wildly, which is not going to help anybody.
Anyway, that plot now strongly indicates ATM cell encapsulation with an overhead of 40 Bytes.
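To illustrate why more than one step transition helps, here is a rough Python sketch of the idea behind the sweep. It is a simplified model, not the actual detector code: the function names are made up for illustration, and "overhead" is assumed to mean everything added on top of the IP packet before it is chopped into 48-byte cell payloads.

```python
import math

ATM_CELL_PAYLOAD = 48  # payload bytes carried by one ATM cell


def cells_needed(ip_packet_size, overhead):
    """ATM cells needed for one IP packet plus its per-packet overhead."""
    return math.ceil((ip_packet_size + overhead) / ATM_CELL_PAYLOAD)


def step_sizes(overhead, n_steps):
    """Smallest IP packet sizes at which one more cell becomes necessary.

    Hypothetical helper: returns the first n_steps transition points,
    i.e. where the RTT staircase would be expected to jump.
    """
    steps = []
    previous = cells_needed(1, overhead)
    size = 2
    while len(steps) < n_steps:
        current = cells_needed(size, overhead)
        if current > previous:
            steps.append(size)
        previous = current
        size += 1
    return steps


def sweep_span(n_atm_cells):
    """Range of packet sizes (in bytes) a sweep over n cells has to cover."""
    return ATM_CELL_PAYLOAD * n_atm_cells


if __name__ == "__main__":
    print("sweep span for 3 cells:", sweep_span(3))
    for overhead in (0, 40):
        print("overhead", overhead, "-> steps at", step_sizes(overhead, 3))
```

The steps are always 48 bytes apart; only their position shifts with the overhead, which is why a sweep that covers just a single step leaves less to go on than one spanning three or four cells.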
You might want to add "ingress" to the list of cake keywords in the iqdisc_opts field, to be more robust against multiple concurrent ingress flows; that said, at 3000/640 this is not going to make massive improvements.
BTW, what modem do you use and what statistics does your modem report? Is that modem running in bridged mode or did you have to configure a username and password (for PPPoE perhaps) on that modem?
I'm running OpenWrt on a Raspberry Pi 3B. The router is in PPPoE mode with Wi-Fi off, and the Raspberry Pi is connected to LAN port 1, which has DMZ enabled. The Pi is the only AP available, with SQM enabled on it. By statistics, do you mean the speed (4096 down/1024 up, if so), or the full connection statistics? Would it be better if I used a modem instead and did the PPPoE connection in OpenWrt?
Preferably the full connection statistics (and if possible the name and version of the modem).
That depends. If the Pi does the PPPoE decapsulation, it will be able to detect connection hangs, has more freedom to hand out DHCP addresses, and/or you do not need double NAT. But for your ping spike issue, this change will not necessarily help...
The error counters, which I hoped to look at in detail, are unfortunately a bit too terse (though 0/104 CRC errors is really low, unless the uptime of the link was well below one hour, but you are at 21915/3600 = 6.1 hours already, so the link seems to be clean).
Thanks, it is clear to me that SQM really does help your link to be usable under load. Now, let's try to tackle the ping spike issue.
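For what it is worth, the arithmetic behind that judgement can be sketched in a few lines of Python. The function names are just illustrative, and the script only normalises a counter by uptime; how the two numbers in "0/104" map to directions is something the modem's statistics page would have to tell you.

```python
def uptime_hours(uptime_seconds):
    """Convert the modem's uptime counter from seconds to hours."""
    return uptime_seconds / 3600


def errors_per_hour(error_count, uptime_seconds):
    """Average number of errors per hour of link uptime."""
    return error_count / uptime_hours(uptime_seconds)


if __name__ == "__main__":
    uptime = 21915  # seconds, as reported above
    print("uptime in hours:", round(uptime_hours(uptime), 1))
    for count in (0, 104):
        print(count, "errors ->", round(errors_per_hour(count, uptime), 1), "per hour")
```

Sampling the counters twice and comparing the difference over the interval would say more than the totals, since it shows whether errors cluster around the times the spikes occur.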
Maybe a packet trace or the output of an mtr/winmtr session running during the experienced latency spike?
How bad exactly? One thing to try when things run badly would be a dslreports speedtest, followed by a second dslreports speedtest after halving both the ingress and egress shaper rates (to check whether the issue might be related to your DSLAM's uplink; in that case sqm on your router will need to be set to the lowest reliably reachable shaper rates). It would also be good to monitor the modem's error counters, since line instabilities and CRC errors will wreak havoc on your data transfers.
But in reality all these hypotheses are rather weak; without alternatives, though, they are maybe still worth researching.
I see, could you run top -d 1 during one of your tests that shows ping spikes and look at the sirq percentage of the Pi? For testing you could run a dslreports speedtest, or even the fast.com speedtest (which also allows configuring upload testing as well as longer run-times) while visually monitoring sirq.
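If watching top by eye is awkward, something along these lines could log the number instead. This is only a sketch, assuming Python is installed on the Pi and that /proc/stat has the usual Linux layout (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function names are made up for illustration.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def softirq_percent(before, after):
    """Share of CPU time spent in softirq between two samples, in percent."""
    # Only the first eight counters are summed; guest time is already
    # included in user time, so adding it again would double count.
    total = sum(after[:8]) - sum(before[:8])
    if total <= 0:
        return 0.0
    softirq = after[6] - before[6]  # softirq is the 7th counter
    return 100.0 * softirq / total


def read_cpu_counters(path="/proc/stat"):
    """Read the aggregate CPU counters (first line of /proc/stat)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


def watch_softirq(samples=10, interval=1.0):
    """Print the softirq share once per interval while a speedtest runs."""
    before = read_cpu_counters()
    for _ in range(samples):
        time.sleep(interval)
        after = read_cpu_counters()
        print("sirq: %.1f%%" % softirq_percent(before, after))
        before = after
```

Calling watch_softirq(60) just before starting the speedtest would give one reading per second for a minute. Note that this is the aggregate over all cores, so a single saturated core on the Pi can hide behind a moderate overall figure.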
Or you could try the speedtest package under https://forum.openwrt.org/t/speedtest-new-package-to-measure-network-performance/24647 as that will also look at the CPU load during the test.
That points to a bug or overload somewhere. Thanks to your measurements we have proof of the overhead being 40 bytes and the encapsulation being ATM, so pretending this was not true will run the shaper at higher rates than you would think. Especially with small packets, not accounting for ATM cell encapsulation will underestimate the effective packet size by 50% to 33%, and that will make your shaper ineffective against bufferbloat if the packet size mix on your link contains too many small packets. That said, unless the "bug" is fixed, I do not blame you for using the "none" option, as the reason for running sqm is not being theoretically correct but being practically better.
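To put rough numbers on the small-packet problem, here is a minimal Python sketch. It assumes the 40 bytes of overhead measured above, 53-byte cells carrying 48 bytes of payload each, and a shaper that (with no link layer adaptation) counts only the IP packet size; that last point is a simplification, and the function names are illustrative only.

```python
import math

ATM_CELL_PAYLOAD = 48  # payload bytes per ATM cell
ATM_CELL_SIZE = 53     # payload plus the 5-byte cell header


def atm_wire_bytes(ip_packet_size, overhead=40):
    """Bytes actually occupied on an ATM link by one IP packet."""
    cells = math.ceil((ip_packet_size + overhead) / ATM_CELL_PAYLOAD)
    return cells * ATM_CELL_SIZE


def accounted_fraction(ip_packet_size, overhead=40):
    """Fraction of the real wire size seen by a shaper that counts
    only the IP packet (assumption: no overhead, no cell padding)."""
    return ip_packet_size / atm_wire_bytes(ip_packet_size, overhead)


if __name__ == "__main__":
    for size in (40, 64, 576, 1500):
        wire = atm_wire_bytes(size)
        share = 100 * accounted_fraction(size)
        print("IP %4d bytes -> %4d bytes on the wire, shaper sees %.0f%%"
              % (size, wire, share))
```

The gap is much larger for small packets than for full-size ones, so a link carrying many small packets (ACKs, VoIP, gaming traffic) can push more onto the wire than the configured shaper rate suggests.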
I would just like to figure out why link-layer accounting does not seem to work for you...