Low SQM performance with layer cake

Hi all, I'm experimenting with cake using layer_cake and I have a problem loading YouTube videos; some videos won't load at all. piece_of_cake runs smoothly. Here's my config (my CPU is an MT7620):

config queue
	option debug_logging '0'
	option verbosity '5'
	option enabled '1'
	option interface 'pppoe-wan'
	option download '20000'
	option upload '20000'
	option qdisc 'cake'
	option script 'layer_cake.qos'
	option qdisc_advanced '0'
	option linklayer 'ethernet'
	option overhead '8'

Try disabling offloading on your interface:

ethtool -K eth0 tso off gso off gro off

Could you please post the output of:

  1. tc -s qdisc
    after a fresh reboot of the router, once sqm with layer_cake has started, and

  2. tc -s qdisc
    after testing with the problematic YouTube videos.

Also try setting the download speed to 0 to disable ingress shaping for testing. With some devices sqm currently has issues with ingress shaping, and yours might be one of those...

Test with lower download/upload values, as you did in the past (i.e. use 18000 down / 18000 up).
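Both suggestions (disabling ingress shaping and lowering the rates) can be tried from the shell with uci. This is only a sketch: the helper name sqm_rate_cmds is made up for illustration, and it assumes the SQM instance is the first 'queue' section in /etc/config/sqm, as in the config posted at the top. It prints the commands instead of running them, so they can be reviewed first.

```shell
# Print the uci commands that set the SQM shaper rates (in kbit/s).
# A rate of 0 disables shaping in that direction.
# Assumption: the instance is the first 'queue' section in /etc/config/sqm.
sqm_rate_cmds() {
    echo "uci set sqm.@queue[0].download='$1'"
    echo "uci set sqm.@queue[0].upload='$2'"
    echo "uci commit sqm"
    echo "/etc/init.d/sqm restart"
}

sqm_rate_cmds 0 20000       # test 1: ingress shaping disabled
sqm_rate_cmds 18000 18000   # test 2: both directions slightly lower
```

To apply one of the variants on the router, pipe it to the shell, e.g. sqm_rate_cmds 18000 18000 | sh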

Here's the output:
1. After reboot:

root@17CTuXuong:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1345634 bytes 5970 pkt (dropped 0, overlimits 0 requeues 4)
 backlog 0b 0p requeues 4
  maxpacket 666 drop_overlimit 0 new_flow_count 8 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev wlan1 root
 Sent 3805366 bytes 3844 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 2473 bytes 6 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 3802893 bytes 3838 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8007: dev pppoe-wan root refcnt 2 bandwidth 20Mbit diffserv3 triple-isolate rtt 100.0ms raw
 Sent 750800 bytes 4429 pkt (dropped 0, overlimits 248 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 41920b of 4Mb
 capacity estimate: 20Mbit
                 Bulk   Best Effort      Voice
  thresh      1250Kbit      20Mbit       5Mbit
  target        14.5ms       5.0ms       5.0ms
  interval     109.5ms     100.0ms      10.0ms
  pk_delay         0us       1.6ms        25us
  av_delay         0us       156us         2us
  sp_delay         0us         1us         1us
  pkts               0        4388          41
  bytes              0      746268        4532
  way_inds           0          20           0
  way_miss           0         449          10
  way_cols           0           0           0
  drops              0           0           0
  marks              0           0           0
  sp_flows           0           1           0
  bk_flows           0           1           0
  un_flows           0           0           0
  max_len            0        5808         397
qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ----------------
 Sent 4156931 bytes 4308 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8008: dev ifb4pppoe-wan root refcnt 2 bandwidth 20Mbit besteffort triple-isolate wash rtt 100.0ms raw
 Sent 4155395 bytes 4284 pkt (dropped 24, overlimits 4543 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 151200b of 4Mb
 capacity estimate: 20Mbit
                 Tin 0
  thresh        20Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay       2.6ms
  av_delay       376us
  sp_delay        13us
  pkts            4308
  bytes        4191395
  way_inds          14
  way_miss         452
  way_cols           0
  drops             24
  marks              0
  sp_flows           1
  bk_flows           1
  un_flows           0
  max_len         1500

2. While testing:

root@17CTuXuong:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 628739152 bytes 5831926 pkt (dropped 6, overlimits 0 requeues 114)
 backlog 0b 0p requeues 114
  maxpacket 1514 drop_overlimit 0 new_flow_count 222 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev wlan1 root
 Sent 1267019868 bytes 976398 pkt (dropped 67287, overlimits 0 requeues 1315)
 backlog 109182b 1213p requeues 1315
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 90868 bytes 339 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1266929000 bytes 976059 pkt (dropped 67287, overlimits 0 requeues 1315)
 backlog 109182b 1213p requeues 1315
  maxpacket 1506 drop_overlimit 67216 new_flow_count 1738 ecn_mark 0
  new_flows_len 981 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8007: dev pppoe-wan root refcnt 2 bandwidth 20Mbit diffserv3 triple-isolate rtt 100.0ms raw
 Sent 482825302 bytes 5644905 pkt (dropped 15, overlimits 36029 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 209600b of 4Mb
 capacity estimate: 20Mbit
                 Bulk   Best Effort      Voice
  thresh      1250Kbit      20Mbit       5Mbit
  target        14.5ms       5.0ms       5.0ms
  interval     109.5ms     100.0ms      10.0ms
  pk_delay         0us       252us       242us
  av_delay         0us        22us        33us
  sp_delay         0us         0us         0us
  pkts               0     5643239        1681
  bytes              0   482583406      258016
  way_inds           0      402945           0
  way_miss           0       42015         151
  way_cols           0           0           0
  drops              0          15           0
  marks              0           0           0
  sp_flows           0           0           0
  bk_flows           0           1           0
  un_flows           0           0           0
  max_len            0       35298         567
  
qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ----------------
 Sent 14513928150 bytes 10784576 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8008: dev ifb4pppoe-wan root refcnt 2 bandwidth 20Mbit besteffort triple-isolate wash rtt 100.0ms raw
 Sent 14541103486 bytes 10742706 pkt (dropped 41870, overlimits 18255890 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 381024b of 4Mb
 capacity estimate: 20Mbit
                 Tin 0
  thresh        20Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay       8.0ms
  av_delay       2.9ms
  sp_delay         2us
  pkts        10784576
  bytes    14600204758
  way_inds      913528
  way_miss       42664
  way_cols           0
  drops          41870
  marks              0
  sp_flows           0
  bk_flows           1
  un_flows           0
  max_len         1500

I've tried that and the video still hangs :stuck_out_tongue:

So you are seeing giant packets on your egress, which can be a problem. (Cake tries to de-segment giants into normally sized packets automatically, so that might not be a problem; also, your video issue is most likely ingress related, not egress.) The kernel will keep any giant it receives intact, so if you want to test without giant packets in the system you first need to make sure ethtool is installed:
opkg update ; opkg install ethtool
and then run the following for all ethN (eth0, eth1, ...) in your system:
ethtool -K ethN tso off gso off gro off
(The most relevant setting for the other interfaces is going to be "gro off", as GRO is where incoming MTU-sized packets get assembled into a giant in the first place.)
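To check quickly whether giants are still reaching cake, the max_len rows can be pulled out of the tc output. The following is a rough sketch; cake_max_len is a hypothetical helper name, and it assumes the statistics layout shown in the outputs in this thread (a "qdisc cake ... dev NAME ..." header followed by a "max_len" row with one value per tin).

```shell
# Print "<device> <largest max_len>" for every cake qdisc found in
# `tc -s qdisc` output read from stdin.
cake_max_len() {
    awk '
        /^qdisc/ {
            # Remember the device name and whether this qdisc is cake.
            dev = ""
            for (i = 1; i < NF; i++) if ($i == "dev") dev = $(i + 1)
            is_cake = ($2 == "cake")
        }
        is_cake && $1 == "max_len" {
            # One value per tin; report the largest.
            max = 0
            for (i = 2; i <= NF; i++) if ($i + 0 > max) max = $i + 0
            print dev, max
        }
    '
}
```

On the router this would be used as tc -s qdisc | cake_max_len; a value far above the MTU (about 1500) means aggregated packets are still being handed to cake.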

This is mostly showing defaults, but with PPPoE it is quite likely that the defaults are not going to be ideal. Have a look at https://forum.turris.cz/t/how-to-use-the-cake-queue-management-system-on-the-turris-omnia/3103 for instructions on how to configure cake for per-internal-IP fairness (which can replace the default triple-isolate, whose behavior is harder to predict when multiple internal machines share the internet connection). Also look at https://lede-project.org/docs/howto/sqm and https://wiki.openwrt.org/doc/howto/sqm for more detailed descriptions/instructions.
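As a rough illustration of what per-internal-IP fairness looks like in the sqm configuration, the sketch below sets cake's dual-srchost/dual-dsthost modes together with the nat keyword. It assumes a reasonably recent sqm-scripts that honors the iqdisc_opts/eqdisc_opts options, and that the instance is the first 'queue' section in /etc/config/sqm; please check the linked pages for the details that apply to your version.

```shell
# Assumption: first 'queue' section; sqm-scripts recent enough to pass
# iqdisc_opts/eqdisc_opts through to cake.
uci set sqm.@queue[0].qdisc_advanced='1'
uci set sqm.@queue[0].qdisc_really_really_advanced='1'
# Ingress (download): fairness per internal destination host.
uci set sqm.@queue[0].iqdisc_opts='nat dual-dsthost'
# Egress (upload): fairness per internal source host.
uci set sqm.@queue[0].eqdisc_opts='nat dual-srchost'
uci commit sqm
/etc/init.d/sqm restart
```

Afterwards the cake lines in tc -s qdisc should show dual-srchost on pppoe-wan and dual-dsthost on the ifb device instead of triple-isolate.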

Also it would be interesting to learn what kind of connection you actually have. From the uplink it looks like fiber, DOCSIS cable or potentially VDSL(2); the only one we can rule out is ADSL (1, 2 or 2+), as your upload seems to be too high for that...

That said, if everything works well with piece_of_cake, but not layer_cake this would actually implicate your upstream more than your downstream, so I am a bit confused...

Also please post the results of a dslreports speedtest: https://www.dslreports.com/speedtest. After a free registration you will be able to configure the testing parameters by clicking on the cog button. I would recommend the following settings:
No. download streams: 16
No. upload streams: 16
Actually the number itself is not so important, but the different testing profiles this speedtest offers use different numbers of test streams, making simple comparison difficult...
Hi-Res BufferBloat: (check this box)
This will give higher-resolution latency probes during the speedtest, which is quite helpful in understanding shaping issues in sqm-scripts.
Upload duration: 30
Download duration: 30
Longer tests are typically more reliable; 30 seconds is well above the typical 10 seconds ISPs seem to optimise for, and I seem to recall 30s is the maximum dslreports happily offers (they pay for bandwidth after all).
dodge compression: (check this box)

It would be most excellent if you could either post links to your test results or at least paste in the test number ($TESTNUMBER); that will allow others to reach your test results via https at the following address: www.dslreports.com/speedtest/$TESTNUMBER

Best Regards

My connection type is GPON :stuck_out_tongue: here's my report:

I'm testing other options.

That looks relatively nice except for a few nasty latency spikes in both up- and download direction. If these are not artifacts of the speedtest's latency probes these might be related to your issue. BTW the tc -s qdisc lines above indicate that your luci-app-sqm and sqm-scripts might be older than required (newer sqm-scripts default to using cake's in-built overhead accounting method, but that really only has advantages for ATM networks...).

Now could you do the same test with piece_of_cake instead of layer_cake? (Or even better, run this test with both scripts while also recreating your YouTube problem in the background; it would be best to use a different computer for the speedtest than for YouTube...)

Ah, I checked the script version and it's really old (from June 2016). I've installed the new version and everything has run fine so far. Thanks for reminding me about the version :grin: Here's my recent tc -s qdisc output:

root@17CTuXuong:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 181000185 bytes 1408019 pkt (dropped 1, overlimits 0 requeues 10)
 backlog 0b 0p requeues 10
  maxpacket 542 drop_overlimit 0 new_flow_count 27 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev wlan1 root
 Sent 1718402483 bytes 1272801 pkt (dropped 42, overlimits 0 requeues 185)
 backlog 0b 0p requeues 185
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 54308 bytes 161 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1717828669 bytes 1272262 pkt (dropped 42, overlimits 0 requeues 184)
 backlog 0b 0p requeues 184
  maxpacket 1506 drop_overlimit 0 new_flow_count 191 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 519506 bytes 378 pkt (dropped 0, overlimits 0 requeues 1)
 backlog 0b 0p requeues 1
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 801f: dev pppoe-wan root refcnt 2 bandwidth 20Mbit diffserv3 triple-isolate rtt 100.0ms raw
 Sent 38872366 bytes 52729 pkt (dropped 565, overlimits 42278 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 114240b of 4Mb
 capacity estimate: 20Mbit
                 Bulk   Best Effort      Voice
  thresh      1250Kbit      20Mbit       5Mbit
  target        14.5ms       5.0ms       5.0ms
  interval     109.5ms     100.0ms      10.0ms
  pk_delay         0us       5.5ms        17us
  av_delay         0us       3.2ms         0us
  sp_delay         0us         0us         0us
  pkts               0       53286           8
  bytes              0    39715316        1287
  way_inds           0           9           0
  way_miss           0         430           6
  way_cols           0           0           0
  drops              0         565           0
  marks              0           0           0
  sp_flows           0           1           0
  bk_flows           0           1           0
  un_flows           0           0           0
  max_len            0       17472         296
qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ----------------
 Sent 70579489 bytes 68591 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8020: dev ifb4pppoe-wan root refcnt 2 bandwidth 20Mbit besteffort triple-isolate wash rtt 100.0ms raw
 Sent 70366179 bytes 68073 pkt (dropped 518, overlimits 94125 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 96768b of 4Mb
 capacity estimate: 20Mbit
                 Tin 0
  thresh        20Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay       9.5ms
  av_delay       4.7ms
  sp_delay         9us
  pkts           68591
  bytes       71128217
  way_inds          40
  way_miss         424
  way_cols           0
  drops            518
  marks              0
  sp_flows           1
  bk_flows           1
  un_flows           0
  max_len         1500

You still seem to have GRO enabled on one of the interfaces? Otherwise the max_len of 17472 should be much closer to 1500. At 20Mbps a 17472-byte packet takes (17472*8) / (20 * 1000^2) = 0.0069888 s, or roughly 7ms, to send, which actually might not be a big problem, but you will exercise cake's giant disassembly code paths, which might not be as well tested as the boring MTU 1500 ones.... But as long as everything just works with the newer version, I will just let it rest. (That said, your old version, AFAICT, does not have known bugs causing your observed behavior, so I fear that your problem might crop up again, as the root cause is still unknown.)
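The arithmetic above is easy to redo for other packet sizes or rates. A small sketch (serialization_delay_us is just an illustrative name; it uses integer division, so results are truncated to whole microseconds):

```shell
# Time in microseconds to put one packet of <bytes> on the wire at <mbit> Mbit/s:
# bytes * 8 bits / (mbit * 10^6 bit/s) = bytes * 8 / mbit microseconds.
serialization_delay_us() {
    echo $(( $1 * 8 / $2 ))
}

serialization_delay_us 17472 20   # the giant seen in max_len, at 20 Mbit/s
serialization_delay_us 1500 20    # a regular MTU-sized packet for comparison
```

This prints 6988 and 600, i.e. roughly 7ms for the giant versus 0.6ms for a normal full-size packet.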

OK, the problem still persists; it just takes longer to stall. I'm using piece_of_cake for now until layer_cake is stable enough to use. Thanks for your quick support :grin:

The really weird thing is that piece_of_cake and layer_cake use the same besteffort diffserv-profile for ingress so there should be no real difference in shaping of the incoming video bursts... The big difference between the two is how they handle outgoing/egress traffic...

Hi again. Now my SQM script runs pretty well, but I have a question about eth offload. I created a file to turn off all offloads on all interfaces at boot. I double-checked with ethtool to make sure it works, but max_len is still very high. Is that normal?

root@17CTuXuong:~# tc -s qdisc
qdisc noqueue 0: dev lo root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 1036986038 bytes 8753954 pkt (dropped 0, overlimits 0 requeues 16)
 backlog 0b 0p requeues 16
  maxpacket 1514 drop_overlimit 0 new_flow_count 22 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev br-lan root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.1 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc noqueue 0: dev eth0.2 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc mq 0: dev wlan1 root
 Sent 480838068 bytes 569190 pkt (dropped 1, overlimits 0 requeues 77)
 backlog 0b 0p requeues 77
qdisc fq_codel 0: dev wlan1 parent :1 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 156236 bytes 445 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :2 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :3 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 480681832 bytes 568745 pkt (dropped 1, overlimits 0 requeues 77)
 backlog 0b 0p requeues 77
  maxpacket 1506 drop_overlimit 0 new_flow_count 78 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc fq_codel 0: dev wlan1 parent :4 limit 10240p flows 1024 quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
  maxpacket 0 drop_overlimit 0 new_flow_count 0 ecn_mark 0
  new_flows_len 0 old_flows_len 0
qdisc noqueue 0: dev wlan0 root refcnt 2
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8004: dev pppoe-wan root refcnt 2 bandwidth 19Mbit diffserv3 dual-srchost nat rtt 100.0ms raw
 Sent 768650103 bytes 8392081 pkt (dropped 697, overlimits 122388 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 415272b of 4Mb
 capacity estimate: 19Mbit
                 Bulk   Best Effort      Voice
  thresh      1187Kbit      19Mbit    4750Kbit
  target        15.3ms       5.0ms       5.0ms
  interval     110.3ms     100.0ms      10.0ms
  pk_delay         0us        32us        19us
  av_delay         0us         8us        10us
  sp_delay         0us         0us         0us
  pkts               0     8382436       10342
  bytes              0   768463166     1209491
  way_inds           0      263291           0
  way_miss           0       86601         208
  way_cols           0           0           0
  drops              0         697           0
  marks              0           0           0
  sp_flows           0           0           0
  bk_flows           0           1           0
  un_flows           0           0           0
  max_len            0       64128         584

qdisc ingress ffff: dev pppoe-wan parent ffff:fff1 ----------------
 Sent 20733493985 bytes 15534602 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc cake 8005: dev ifb4pppoe-wan root refcnt 2 bandwidth 19Mbit besteffort dual-dsthost nat wash rtt 100.0ms raw
 Sent 20764223636 bytes 15466910 pkt (dropped 67673, overlimits 28134512 requeues 0)
 backlog 26334b 19p requeues 0
 memory used: 471744b of 4Mb
 capacity estimate: 19Mbit
                 Tin 0
  thresh        19Mbit
  target         5.0ms
  interval     100.0ms
  pk_delay      12.7ms
  av_delay       7.9ms
  sp_delay       5.3ms
  pkts        15534602
  bytes    20857770801
  way_inds      220093
  way_miss       93364
  way_cols           0
  drops          67673
  marks              0
  sp_flows           0
  bk_flows           1
  un_flows           0
  max_len         1500

This is my script:

#!/bin/sh /etc/rc.common

START=99

start() {
	local eth rc
	# Loop over every interface whose name contains "eth" (eth0, eth0.1, ...)
	for eth in $(ls /sys/class/net | grep 'eth'); do
		ethtool -K "$eth" tso off gso off gro off
		# Save the exit status right away: inside the else branch $? would
		# hold the result of the [ test, not of ethtool
		rc=$?
		if [ "$rc" -eq 0 ]; then
			logger -s -t disable_offload_init -p daemon.notice "$eth offload turned off"
		else
			logger -s -t disable_offload_init -p daemon.err "[ERR] $eth offload turn off FAILED: '$rc'"
		fi
	done
}

And tso, gso and gro are turned off:

root@17CTuXuong:~# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
        tx-checksum-ipv4: on
        tx-checksum-ip-generic: off [fixed]
        tx-checksum-ipv6: on
        tx-checksum-fcoe-crc: off [fixed]
        tx-checksum-sctp: off [fixed]
scatter-gather: on
        tx-scatter-gather: on
        tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
        tx-tcp-segmentation: off
        tx-tcp-ecn-segmentation: off [fixed]
        tx-tcp6-segmentation: off
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-ipip-segmentation: off [fixed]
tx-sit-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]

The large max_len shows up on egress only. The way to stop this is to disable GRO on all of the router's interfaces (wired and wireless), as the kernel will keep an already assembled giant packet intact. That said, cake actually attempts to dissect the giants into reasonably sized "chunks", so latency under load should still be fine. My recommendation would be to leave things as they are unless you notice unwanted behavior under load... (Really, GRO and GSO are not bad things in themselves; they help the kernel scale better to high packet rates, so disabling them without need will make your system less efficient.)
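If you do want to try it anyway, something along these lines would cover the wireless interfaces as well. It is only a sketch: gro_off_cmds is a made-up helper name, skipping lo is my own choice, and some drivers may refuse the change (ethtool -k shows such features as [fixed]). It prints the commands rather than running them.

```shell
# Print one "ethtool -K <iface> gro off" command per network interface.
# The sysfs directory is a parameter so the function can be tried out
# against a scratch directory; it defaults to /sys/class/net.
gro_off_cmds() {
    netdir=${1:-/sys/class/net}
    for path in "$netdir"/*; do
        [ -e "$path" ] || continue
        iface=${path##*/}
        [ "$iface" = "lo" ] && continue   # assumption: loopback is not relevant here
        echo "ethtool -K $iface gro off"
    done
}
```

Review the output of gro_off_cmds first, then apply it with gro_off_cmds | sh and check max_len again after some traffic.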

Hope that helps