So, unlike @dlakelan, I have not actually tried to implement anything like this, so I can really just speculate; take my words with caution.
I understand that what you probably desire is to be able to configure a hard minimum and hard maximum shaper rate (per direction), and to keep the shaper at the minimum unless the network is actually used and the increased shaper setting does not compromise latency-under-load/bufferbloat mitigation too much. I hope that this is a decent description of your issue?
In that case, I see the need for two measurements, "network load" and "bufferbloat" on which to base the actual shaper decisions.
"network load ratio": easy to measure. Periodically sample a byte counter (either from an interface or via tc), divide the number of bytes transferred since the last sample by the duration of the sampling interval, and then divide this rate by the shaper rate for the given direction (in the same units).
"bufferbloat": trickier, but @dlakelan has found a decent algorithm: sample X known-good ICMP echo reflectors, assess their RTT increase relative to a slowly updated baseline, and only assume access link congestion when Y out of the X RTT sources indicate a latency increase. Sampling just a single source like 8.8.8.8 will cause false positives (congestion elsewhere on the path to that one reflector looks like access link congestion); if you can live with those, set X to one, otherwise "voting" seems to be a decent method, especially since access link congestion will result in all RTTs being increased (so you could set Y = X).
Let's assume you generate estimates of these two quantities every 30 seconds or so. TCPs generally take a few RTTs to adjust to changing rates, so there is a lower limit on the interval at which changing the shaper makes sense (especially for ingress).
Now, I would probably do the following logic (modulo all the bugs I'm sure to introduce):
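The two estimates could be computed roughly as below. This is a minimal Python sketch, not a tested implementation: the function names, the kbit/s unit for the shaper rate, and the reflector addresses in the example are my assumptions. The sysfs path is the usual Linux per-interface statistics location. Counter wrap-around is not handled.

```python
def read_byte_counter(iface, direction):
    """Read an interface byte counter from sysfs; direction is 'rx' or 'tx'."""
    path = f"/sys/class/net/{iface}/statistics/{direction}_bytes"
    with open(path) as f:
        return int(f.read())


def load_ratio(prev_bytes, curr_bytes, prev_time, curr_time, shaper_kbps):
    """Achieved rate over the sampling interval as a fraction of the shaper rate."""
    interval = curr_time - prev_time
    if interval <= 0 or shaper_kbps <= 0:
        raise ValueError("interval and shaper rate must be positive")
    # Counters are in bytes, the shaper rate (assumed) in kbit/s.
    achieved_kbps = (curr_bytes - prev_bytes) * 8 / interval / 1000
    return achieved_kbps / shaper_kbps


def access_link_congested(current_rtts, baseline_rtts, threshold_ms, votes_needed):
    """Vote: True if at least votes_needed reflectors show an RTT increase
    above threshold_ms relative to their own baseline."""
    votes = sum(
        1
        for ip, rtt in current_rtts.items()
        if ip in baseline_rtts and rtt - baseline_rtts[ip] > threshold_ms
    )
    return votes >= votes_needed


# Example with made-up numbers: 37.5 MB in 30 s against a 20000 kbit/s shaper.
print(load_ratio(0, 37_500_000, 0.0, 30.0, 20000))  # 0.5

baseline = {"8.8.8.8": 10.0, "1.1.1.1": 20.0, "9.9.9.9": 15.0}
current = {"8.8.8.8": 40.0, "1.1.1.1": 50.0, "9.9.9.9": 16.0}
print(access_link_congested(current, baseline, 15.0, 2))  # True
print(access_link_congested(current, baseline, 15.0, 3))  # False
```

Setting votes_needed equal to the number of reflectors corresponds to Y = X.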
INIT:
set current_load_sample_time = now
set last_load_sample_time = now
set last_transfer_volume = read transfer counters
set ingress_shaper_rate = read SQM's ingress shaper rate
set egress_shaper_rate = read SQM's egress shaper rate
for all RTT_IPs get the current_average_RTT for 10 samples
and set last_average_RTT to current_average_RTT
-> save all of these out to /tmp in a file
RUNTIME:
1) get timestamp t(start)
2) read in data from file (if file does not exist or values are missing, run the INIT function and goto)
3) for all RTT_IPs get the current_average_RTT for 10 samples (in parallel)
4) wait for completion and calculate: dRTT = current_average_RTT - last_average_RTT
update last_average_RTT (as exponentially weighted moving average to give it some persistence, here alpha 0.1 but that needs tuning, probably):*
last_average_RTT = (1 - 0.1) * last_average_RTT + 0.1 * current_average_RTT
if dRTT > bufferbloat_threshold
-> bufferbloat detected, need to reduce the shaper rate; unless we can differentiate
between ingress and egress congestion, we need to reduce both shapers:
set shaper_rate = max(min_rate, shaper_rate - (max_rate - min_rate)/4)
5) get current_transfer_volume
6) get timestamp t(current_load_sample_time)
7) calculate load_ratio per direction: ((current_transfer_volume - last_transfer_volume) / (t(current_load_sample_time) - t(last_load_sample_time))) / (shaper rate)
set last_load_sample_time = current_load_sample_time
8) IF dRTT < bufferbloat_threshold AND current_load_ratio >= load_threshold
increase shaper rates: set shaper_rate = shaper_rate + (max_rate - min_rate)/8
Note: for stability it seems best to increase with a smaller increment than the decrement used on bufferbloat, but the exact numbers probably need tuning
9) IF dRTT < bufferbloat_threshold AND current_load_ratio < load_threshold
slowly decay to the minimum shaper rate if no load and no bufferbloat detected
set shaper_rate = max(min_rate, shaper_rate - (max_rate - min_rate)/8)
10) write out all parameters to backing store and sleep for 30 - (t(end) - t(start))
This obviously needs to be integrated with init.d so it gets automatically restarted if accidentally killed. And all of the numbers need tuning.
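The rate decision in steps 4, 8 and 9 boils down to a small function. A minimal Python sketch, untested on a real link: the function and parameter names are mine, and clamping the increase at max_rate is my addition (the steps above only clamp at min_rate, but a hard maximum was part of the stated goal).

```python
def next_shaper_rate(rate, min_rate, max_rate, bloated, load_ratio, load_threshold):
    """Return the new shaper rate for one direction.

    bloated: True if dRTT exceeded bufferbloat_threshold this cycle.
    load_ratio: achieved rate divided by the current shaper rate.
    """
    span = max_rate - min_rate
    if bloated:
        # Step 4: back off quickly, but never below the hard minimum.
        return max(min_rate, rate - span / 4)
    if load_ratio >= load_threshold:
        # Step 8: link is in use and latency is fine, so probe upwards slowly.
        # Clamp at max_rate (assumption, not in the steps above).
        return min(max_rate, rate + span / 8)
    # Step 9: no load and no bufferbloat, decay towards the minimum.
    return max(min_rate, rate - span / 8)


# Example with made-up rates in kbit/s: min 10000, max 50000.
print(next_shaper_rate(30000, 10000, 50000, True, 0.9, 0.8))   # 20000.0
print(next_shaper_rate(30000, 10000, 50000, False, 0.9, 0.8))  # 35000.0
print(next_shaper_rate(30000, 10000, 50000, False, 0.1, 0.8))  # 25000.0
```

Without a way to tell ingress from egress congestion, the bloated branch would be applied to both directions in the same cycle.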
*) Here the trick is to select alpha such that transiently increased RTTs when the shaper is set too high do not significantly pull up the reference RTT in the time it takes this script to adjust the shaper such that the RTT gets under control again. The idea is that this still allows the reference RTTs to adjust to path changes.
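To get a feel for what alpha does, here is a small Python sketch of the baseline update with made-up RTT values (the helper names are mine). It shows how far a 20 ms baseline is dragged by two consecutive 120 ms samples for two choices of alpha.

```python
def update_baseline(baseline, sample, alpha=0.1):
    """One exponentially weighted moving average step."""
    return (1 - alpha) * baseline + alpha * sample


def track(baseline, samples, alpha=0.1):
    """Apply update_baseline over a sequence of samples, returning each step."""
    history = []
    for sample in samples:
        baseline = update_baseline(baseline, sample, alpha)
        history.append(baseline)
    return history


# Two bloated cycles (120 ms) on top of a 20 ms baseline.
for alpha in (0.1, 0.01):
    steps = track(20.0, [120.0, 120.0], alpha)
    print(alpha, [round(x, 2) for x in steps])
```

With alpha = 0.1 the baseline moves from 20 ms to about 30 ms after one bloated sample and about 39 ms after two, which may already be enough to hide bufferbloat behind a fixed threshold; with alpha = 0.01 it moves by about 1 ms per sample but also follows genuine path changes far more slowly.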
None of this is tested, and almost all numerical values need to be tuned/selected properly. Plain shell arithmetic is integer-only, so a big question is what runtime environment to use.
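Integer-only arithmetic is not necessarily a blocker if RTTs are kept in microseconds and rates in kbit/s. A small sketch of the same smoothing and step-size math using only integers, written in Python purely for readability; the names and the alpha = 1/10 encoding are my assumptions, and integer division truncates, so very small changes get rounded away.

```python
def update_baseline_us(baseline_us, sample_us, alpha_num=1, alpha_den=10):
    """Integer EWMA: RTTs in microseconds, alpha = alpha_num / alpha_den."""
    return ((alpha_den - alpha_num) * baseline_us + alpha_num * sample_us) // alpha_den


def rate_step_kbps(min_kbps, max_kbps, divisor):
    """Integer step size, e.g. divisor 4 for decrease and 8 for increase."""
    return (max_kbps - min_kbps) // divisor


print(update_baseline_us(20000, 120000))  # 30000
print(rate_step_kbps(10000, 50000, 8))    # 5000
```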
I guess once all of this is implemented it will look pretty close to @dlakelan's solution, except it will be biased towards the configured minimum rate, not the maximum. Then again, I have not looked too closely into the Erlang script (-ECANNOTREADERLANGFLUENTLY), so I might be wrong about the similarity between the approaches.