R7800 SQM settings keep causing bufferbloat

Let's first figure out how much bandwidth sacrifice is actually needed, and then we can try to figure out the root cause.

Well, I tried iteratively to "optimise" these settings. Also note the option iqdisc_opts 'nat dual-dsthost ingress' line: the ingress keyword makes cake shape so that the incoming rate is what gets targeted, which results in more aggressive shaping, and that in turn allows setting the shaper closer to the real limit. On egress, shaping should in theory work at 100% of the real egress rate, so shaving off 5% is already more than theoretically required. One of my issues is that I know my ISP employs a shaper at its BNG level, but I have no information about that shaper's settings, so I need to approximate those limits. (And I value keeping latency low over maxing out the bandwidth, so I do not mind trading even 10-20% of bandwidth for a consistently low latency-under-load increase. But this is a policy issue, and I fully understand that others have different policies.)
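To make the "shave off a few percent" arithmetic concrete, here is a minimal shell sketch. The helper name shaper_rate, the link speeds, and the percentages are all made up for illustration and are not my actual numbers; sqm expects its rates in kbit/s.

```shell
# shaper_rate MEASURED_KBIT PERCENT
# Prints PERCENT of the measured rate, in kbit/s (integer arithmetic).
# Hypothetical helper, not part of sqm-scripts.
shaper_rate() {
    echo $(( $1 * $2 / 100 ))
}

# Hypothetical link: 100000 kbit/s down, 40000 kbit/s up.
shaper_rate 100000 90   # download: give up 10% for low latency
shaper_rate 40000 95    # upload: 5% below the measured rate
```

The results would then go into the download and upload fields of the sqm config. If latency under load still climbs, lower the percentage a step at a time and re-test.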

Mostly A+ and occasionally an A, but I tend to look more at the detailed bufferbloat plots anyway; the grades are too coarse for my taste.

Well, this is one of the cool features: it allows cake to get at the real internal and external addresses, basically virtually undoing the NAT masquerading, and that in turn allows the nifty per-internal or per-external IP address fairness modes. In my config, nat dual-dsthost on ingress and nat dual-srchost on egress fairly share download and upload bandwidth between all concurrently active internal host addresses, so no computer can easily take over the network completely.
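For reference, a sketch of how those two option lines might sit in /etc/config/sqm. Only the iqdisc_opts and eqdisc_opts values come from my setup as described above. The section name, interface name, and rates are placeholders, and the two qdisc_*advanced switches are included because they usually accompany these options; whether your sqm-scripts version strictly requires them is an assumption worth checking.

```
config queue 'wan'
	option enabled '1'
	# Placeholder: use your actual WAN interface here.
	option interface 'eth0.2'
	# Placeholder rates in kbit/s.
	option download '90000'
	option upload '38000'
	option qdisc 'cake'
	option script 'piece_of_cake.qos'
	# Assumption: needed for the *_opts lines below to be honoured.
	option qdisc_advanced '1'
	option qdisc_really_really_advanced '1'
	# Ingress: per-internal-host fairness by destination address, ingress mode.
	option iqdisc_opts 'nat dual-dsthost ingress'
	# Egress: per-internal-host fairness by source address.
	option eqdisc_opts 'nat dual-srchost'
```

After editing, restarting sqm (/etc/init.d/sqm restart) and checking the output of tc -s qdisc should show whether cake picked up the keywords.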