SQM makes bufferbloat worse

You could try a build that includes fast path.

It also seems to improve SQM performance.
In case you want to try it: