Is the above a possible setup?
I ask because I think ifb4eth0.2 egress is skipped under fast-path since fast-path bypasses egress but not ingress.
The hook occurs at NAT postrouting; maybe that is why egress gets skipped.
@moeller0
Am I right to say that SQM creates an ifb and controls only the egress part?
Yes, in a sense. The Linux kernel, as far as I know, does not support instantiating traffic shapers on an interface's ingress side. The workaround is to redirect all ingress traffic to an ifb and instantiate the shaper on that interface's egress, which sits between the real ingress and the further kernel network stack.
In case you only want to shape a specific direction you can set the sqm bandwidth to zero for the other direction; if you set the ingress bandwidth to zero there should be no ifb interface generated...
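For reference, the redirect described above can be sketched roughly as follows. This is a minimal sketch and not necessarily what sqm-scripts runs verbatim: the interface name, the IFB name, and the 50mbit rate are simply the values mentioned in this thread, and the `run` helper and the `APPLY` variable are made up for illustration. By default it only prints the commands; setting APPLY=1 as root executes them, assuming the ifb, act_mirred, and sch_cake kernel modules are available.

```shell
#!/bin/sh
# Sketch: redirect an interface's ingress to an IFB and shape on the IFB's egress.
# IFACE, IFB and RATE are illustrative defaults taken from this thread.
IFACE="${IFACE:-eth0.2}"
IFB="${IFB:-ifb4eth0.2}"
RATE="${RATE:-50mbit}"

# Hypothetical helper: print each command, execute it only when APPLY=1.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

setup_ingress_shaper() {
    # Create the IFB pseudo-interface and bring it up.
    run ip link add name "$IFB" type ifb
    run ip link set dev "$IFB" up
    # Attach the special ingress qdisc so filters can see incoming packets.
    run tc qdisc add dev "$IFACE" handle ffff: ingress
    # Match every incoming packet and redirect it to the IFB.
    run tc filter add dev "$IFACE" parent ffff: protocol all prio 10 \
        u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev "$IFB"
    # The actual shaper lives on the IFB's egress side.
    run tc qdisc add dev "$IFB" root cake bandwidth "$RATE"
}

setup_ingress_shaper
```

The point of the sketch is the ordering: the shaper only sees what the `mirred` redirect hands to the IFB, so anything that takes packets away before that filter runs never reaches the ingress qdisc.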
I am puzzled, the interface sequence should not matter, BUT a bridge interface is a rotten place to instantiate a shaper (it will also affect all internal WLAN to LAN traffic as well as WLAN/LAN to WAN), so let's ignore that. But the eth0.2 result makes me scratch my head.
Could you run a dslreports speedtest for all combinations and post the results here? These tests often give enough data to form a better hypothesis about what happens... (I collected a few configuration tips for the dslreports speedtest under https://forum.openwrt.org/t/sqm-qos-recommended-settings-for-the-dslreports-speedtest-bufferbloat-testing/2803, maybe they can be of help...)
Each time, I run tc qdisc to check that the rules are correct.
Interface name eth0.2 will create ifb4eth0.2, and so on; the full tc qdisc rules are copied as shown in the previous post.
No SQM
Okay, I see. Now could you repeat a speedtest with sqm instantiated with piece_of_cake/cake on eth0.2, but capture the output of "tc -s qdisc" before and after that speedtest, please? In addition it would be great if you could also add the output of "ifconfig" from before and after the speedtest. It would also be great if you could post the link to the detailed results for that speedtest (or the link from the sharing section of the detailed results, which also contains the test's id number; given how clear the effects you see are, this will only help to satisfy my curiosity and should not matter for the issue at hand). Finally, I assume that on your test machine none of the fast-path/shortcut modules are active during the test, correct?
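One way to collect those before/after outputs without copy-paste mistakes is a small capture helper. This is only a sketch: the `snapshot` function, the `OUTDIR` variable, and the file labels are invented for illustration, and the router-side calls are left as comments so the snippet itself has no side effects.

```shell
#!/bin/sh
# Hypothetical helper: run a command and save its output under a label.
OUTDIR="${OUTDIR:-/tmp/sqm-capture}"

snapshot() {
    label="$1"
    shift
    mkdir -p "$OUTDIR"
    # Store stdout and stderr of the given command in $OUTDIR/<label>.txt
    "$@" > "$OUTDIR/$label.txt" 2>&1
    # Print the path so the caller knows where the capture went.
    echo "$OUTDIR/$label.txt"
}

# Intended use on the router:
#   snapshot tc-before tc -s qdisc
#   snapshot ifconfig-before ifconfig
#   ... run the speedtest ...
#   snapshot tc-after tc -s qdisc
#   snapshot ifconfig-after ifconfig
#   diff "$OUTDIR/tc-before.txt" "$OUTDIR/tc-after.txt"
```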
In all my tests SFE is running; the aim is to get SFE working together with SQM rate limiting.
The thing is, with SFE on, SQM-Scripts works on eth0.1 for both up and down, but on eth0.2 only up works, which is puzzling.
Before running the dslreports speedtest: interface eth0.2, ingress 50 Mbit/s, egress 50 Mbit/s
So if I compare the ingress packet (and byte) counters, I get:
eth0.1: 124995 - 108 = 124887 packets (124489978 - 14768 = 124475210 bytes; 124475210 / 1000^2 = 124.47521 MB)
eth0.2: 9102 - 219 = 8883 packets (8138237 - 74421 = 8063816 bytes; 8063816 / 1000^2 = 8.063816 MB)
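The counter arithmetic above can be reproduced with a few lines of shell. This is a minimal sketch: the `delta` and `to_mb` helpers are made-up names, and the numbers are the before/after ingress readings quoted in this post.

```shell
#!/bin/sh
# Difference between an "after" and a "before" counter reading.
delta() {
    echo $(( $1 - $2 ))
}

# Convert bytes to decimal megabytes (1 MB = 1000^2 bytes), six decimals.
to_mb() {
    awk -v b="$1" 'BEGIN { printf "%.6f\n", b / 1000000 }'
}

# Ingress counter readings quoted above, given as (after, before).
eth01_pkts=$(delta 124995 108)
eth01_bytes=$(delta 124489978 14768)
eth02_pkts=$(delta 9102 219)
eth02_bytes=$(delta 8138237 74421)

echo "eth0.1: $eth01_pkts packets, $(to_mb "$eth01_bytes") MB"
echo "eth0.2: $eth02_pkts packets, $(to_mb "$eth02_bytes") MB"
```

Put side by side, the ingress shaper on eth0.1 saw roughly 15 times as many bytes as the one on eth0.2 for what should have been a comparable test.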
Now, assuming the test is otherwise of the same duration and the throughput on eth0.2 ingress is actually 200 instead of 50, I would say the SFE module steals packets from qdiscs instantiated on an ifb (sort of making the tc action redirect only partially working). I had expected no packets actually hitting ingress cake at all, but it seems the SFE module will not trigger for all packets... Now, it would have been nice to also see how the packet and byte counters exposed by "ifconfig" would have changed for the same test...
In short it looks like the shortcut module also (partially) shortcuts the ifb.
One work-around might be to not use sqm on ingress at all; instead of shaping the WAN interface's egress and ingress one can also shape on the egress of the interface that connects the router SoC with the LAN switch (in that case one needs to configure "upload" only for both independent sqm instances). This will obviously not work well with SoCs where the WLAN interfaces are directly connected (so most of them), as WLAN packets will side-step the internet download shaper. But for testing that should work out...
One more thing to test would be whether cake's deNATing, which is required for per-IP fairness, still works when NAT is done in hardware (I assume SFE does hardware NAT, is that correct?).
Yes, I also thought of separating it into eth1.0 and eth0.2, but will it affect LAN speeds?
Assume I have 1 Gbit/s between local hosts on the LAN but only 250 Mbit/s to and from the WAN: if I apply the WAN limits on the LAN side, will LAN speeds be affected?
I am trying to see if there is a clean way of keeping the old eth0.2 setup while making sure it still obeys the limit set by SQM.
SFE is actually not hardware NAT; it simply connects interfaces like eth0.2 and br-lan and shortcuts NAT between them, bypassing a lot of layers.
I am assuming that when SQM is applied to eth0.2 only downloads get shortcut while uploads still hit SQM, whereas in the eth0.1 case both uploads and downloads hit SQM. So can we replicate this for eth0.2?
Depends on your router's architecture. If the SoC's LAN port connects to a switch that supplies the other LAN ports, then the switch obviously will not care about the shaper on the switch's port to the CPU. But as I said, often the WLAN radios are directly connected to the SoC, so their traffic will not be handled by the shaper, which in essence makes the shaper ineffective in controlling bufferbloat.
Well, your tests show that both the WAN interface, which does NAT, and the bridge interface have ingress shaper/ifb issues; as far as I can tell these are also the places where the shortcut module gets active. So my hypothesis is that the SFE module will, if active, side-step the IFB entry point, and hence only a few packets reach ingress cake (presumably the packets not handled by the shortcut module; BTW, there might be counters to check whether the number of packets hitting ingress cake is similar to the number of packets that are not going through the shortcut module).
I guess that would require figuring out why most packets with SFE are "routed" around sqm's ingress IFB...
mmmh, I somehow have the feeling you have read https://unix.stackexchange.com/questions/288959/how-is-the-ifb-device-positioned-in-the-packet-flow-of-the-linux-kernel already (or maybe you wrote it yourself). I have no more insight about where the packet stealing happens, but the data we have indicates that SFE basically steals the packets earlier, so that there is nothing left for the ifb to steal. The fact that you still saw some residual packets on the ifb is probably because SFE (AFAICT) will only start stealing after 128 packets have been sent (per flow?)...