This is actually a no-op: only the ingress keyword changes cake's behavior. I did not even realize we had "egress" as a keyword, but sure, we do, and it is the default.
This is a bit of a mixed blessing, ack-filter does help with highly asymmetric links and also with highly bursty links (but cake as used in SQM typically does not see the burstiness, so this component of ACK-filtering will not come into play). But unless your uplink gets overloaded this will have very little effect on your actual issue, so by all means keep this setting ;).
Well, the crux of the matter is that sqm replaces the ISP's under-managed and over-sized buffers with its own advanced queue/buffer management, to reduce the latency-under-load increase (aka bufferbloat). For this to work, sqm must admit at most as much data per unit of time into the ISP's equipment (in xDSL systems, modem/CPE and indirectly DSLAM/MSAN) as that equipment can actually transmit over the bottleneck link, so that these ISP buffers never fill up continuously to unhealthy levels.
To do this SQM needs to calculate, for each packet it admits, how much time/instantaneous bandwidth that packet is going to require on the bottleneck link. In packet-based data transmission each packet carries a payload as well as a bit of overhead required to actually transport the packet (this is loosely like sending a parcel, where the packaging/labeling adds weight and volume to the content, and it's the combination of both that needs to fit into the carrier vehicle volume- and weight-wise). For sqm to make an accurate prediction of the transmission time it needs to know the payload size (which is easy, as the kernel typically has that information at hand) as well as the applicable overhead. That second value is tricky to get, as the SQM host might not be directly connected to the actual bottleneck link and hence is in no position to know the actual overhead itself. This is why we need to configure the per-packet-overhead manually. It is also immensely tricky to measure that overhead empirically in a robust and reliable way (we have a method that works for ATM/AAL5-based carriers, but these are a dying breed, and IMHO rightly so).
Now, what happens if the overhead is underestimated? Typically people are advised to set the per-packet-overhead to the best of their knowledge (erring on the side of a bit too much) and then measure the bufferbloat resulting from different shaper bandwidth settings. This is a reasonable approach, but let's see what happens when we underestimate the per-packet-overhead. For demonstration purposes I am setting it to 0, but the principle holds for any underestimation, just the consequences will be rarer/subtler. I will shamelessly use simple values here, but assume VDSL2:
gross-rate * optional-encoding * ((payload size) / (payload size + per-packet-overhead)) = goodput (~speedtest result)
"Optional encoding" differs between link technologies and equals 64/65 for VDSL2@PTM
Side-note: ATM/AAL5 is weirder and can not really be modeled with a simple encoding factor, but that is not your issue.
So assuming an IPv4/TCP measurement without any extras (20 bytes of IPv4 header plus 20 bytes of TCP header), a true bottleneck gross rate of 100, and a true per-packet-overhead of 30 on VDSL2, we get a goodput of:
100 * 64/65 * ((1500-20-20) / (1500 + 30)) = 93.96
if we use this as our real achievable top-speed we can calculate which shaper gross rate we would need if the per-packet-overhead is set to 0 instead of 30:
93.96 * 65/64 * ((1500)/(1500-20-20)) = 98.04
Setting the shaper to 98.04 units will control bufferbloat, BUT only if the packet size is 1500 bytes. If we just redo our calculations for a packet size of 100 bytes we get:
100 * 64/65 * ((100-20-20) / (100 + 30)) = 45.44
and
45.44 * 65/64 * ((100)/(100-20-20)) = 76.92
but since we set the shaper to 98 we will be admitting too much into the ISP's devices, and hence bufferbloat will increase again. Depending on the actual mix of packet sizes on your link this issue will be more or less prominent, but it always lurks as a pitfall unless your configured per-packet-overhead is equal to or larger than the real per-packet-overhead. I hope this answers your question.
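If you want to replay the arithmetic above for other packet sizes, here is a minimal Python sketch. The function names are my own invention for illustration (nothing from sqm-scripts or cake), and it bakes in the same simplifying assumptions as the numbers above: VDSL2/PTM 64/65 encoding, plain IPv4/TCP headers of 20 + 20 bytes, a true overhead of 30 and a gross rate of 100.

```python
# Sketch of the goodput / shaper-rate arithmetic from the post.
# All names here are hypothetical helpers, not part of sqm-scripts or cake.

ENCODING = 64 / 65        # VDSL2/PTM encoding factor
TCP_IP_HEADERS = 20 + 20  # IPv4 + TCP headers, no options


def goodput(gross_rate, packet_size, overhead):
    """Goodput seen by a speedtest for a given gross rate and true overhead."""
    payload = packet_size - TCP_IP_HEADERS
    return gross_rate * ENCODING * payload / (packet_size + overhead)


def required_shaper_rate(target_goodput, packet_size, configured_overhead):
    """Shaper gross rate needed to reach target_goodput with the configured overhead."""
    payload = packet_size - TCP_IP_HEADERS
    return target_goodput / ENCODING * (packet_size + configured_overhead) / payload


if __name__ == "__main__":
    true_rate, true_overhead = 100, 30
    for size in (1500, 576, 100):
        g = goodput(true_rate, size, true_overhead)
        s = required_shaper_rate(g, size, configured_overhead=0)
        print(f"packet size {size:4d}: goodput {g:6.2f}, safe shaper rate at overhead 0: {s:6.2f}")
```

The safe shaper rate drops as packets get smaller, which is exactly why a single shaper setting tuned with full-size packets and an underestimated overhead stops protecting you once small packets dominate.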
Hard to say. As above, I have no simple and reliable way to measure the applicable per-packet-overhead empirically, but according to the ITU specs VDSL2 will only give you 22 bytes of overhead (PPPoE would add another 8, but IPoE does not use PPP tunneling), and a potential VLAN tag would add another 4 bytes (some ISPs use double VLAN tagging). I would guess that 30 is a decent estimate, with a high probability of slightly over- instead of underestimating, so exactly what you should do.
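To see how the individual pieces add up, here is a tiny Python sketch using only the byte counts quoted above (22 for VDSL2, 8 for PPPoE, 4 per VLAN tag). The helper name and its parameters are made up for illustration; this is a tally of my estimate, not a measurement.

```python
# Tally of the per-packet-overhead components mentioned in the post.
# estimate_overhead is a hypothetical helper, not an sqm-scripts function.

VDSL2_BASE = 22  # bytes, VDSL2 overhead as quoted above
PPPOE = 8        # bytes, only if PPPoE is in use
VLAN_TAG = 4     # bytes per VLAN tag


def estimate_overhead(pppoe=False, vlan_tags=0):
    """Sum the overhead components for a given link configuration."""
    return VDSL2_BASE + (PPPOE if pppoe else 0) + VLAN_TAG * vlan_tags


if __name__ == "__main__":
    for pppoe, tags in ((False, 0), (False, 1), (False, 2)):
        print(f"IPoE, {tags} VLAN tag(s): {estimate_overhead(pppoe, tags)} bytes")
```

For IPoE even the double-tagged case comes out at 30 bytes, so a setting of 30 covers every combination listed here for your kind of link.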