monitor_achieved_rates_interval_ms=250 # (milliseconds)
bufferbloat_refractory_period_ms=2000 # (milliseconds)
decay_refractory_period_ms=1000 # (milliseconds)
But indeed, there is a need to re-evaluate all of that, because of the new antennas.
You would like something different. Namely when Discord traffic is detected you want a higher minimum rate to safeguard against rates being dropped too low. Earlier on in this thread two excellent ways to do this were identified using either tc stats or netfilter rules. I haven't implemented this yet.
But that is not how this works... If you implement this and someone else runs two greedy TCP streams in the same direction, Discord will crater as well, not because of the total shaper rate, but because its capacity share of the total is too small...
I do not believe that such a system would work well automatically... What might work is to prepare two configs, one for work and one for après-work, and manually switch between them.
Let me explain why I am not a fan of this idea: it needs configuring a rule to detect Discord traffic, and an action to set the minimum rate to some number, which presumably depends on the expected number of people in the call. Deleting some misbehaving (for me) code, by contrast, achieves the same goal of working Discord without any configuration.
My take is that clearly the default config is not ideal for your desires/requirements, the question is what are the side-effects of this change and how will it operate with different traffic patterns on less marginal links.
The defaults are 200 ms and 300 ms IIRC, which is much more of a problem; your 250/2000 indeed look fine-ish... except that you now also apply this on rate increases... (during bloat epochs).
My take is that clearly the default config is not ideal for your desires/requirements, the question is what are the side-effects of this change and how will it operate with different traffic patterns on less marginal links.
Good question, but I think that I am not the best person to answer that. Can we get a tester with a less-marginal link, who agrees to apply the same patch (cake-autorate.sh.diff from this shared folder) and deliberately misconfigure min_rate? Or reach out to somebody who previously complained about the rates going to the minimum and staying there?
I can test? What do I do/set? I still don't know what your proposal is? Set shaper rate based on only 0.9*achieved rate during bufferbloat and then what about during load high and no bufferbloat? Sorry if answered before I've been mostly working and only looking at emails sporadically.
Apply the patch. Deliberately misconfigure min_rate to the lower value. Do nothing else. If it ends up unstable, increase both refractory periods by the same factor.
Regarding high load and no bufferbloat, the patch doesn't change the behavior. This was unintentional, but let's keep it like this so far.
Just in case Discourse does not send you emails about edits: the patch is cake-autorate.sh.diff in this pCloud folder. Please ignore all other files there for the purpose of testing, and definitely do not apply the config found there.
I am trying to think how this will work with bursty traffic and local highs and lows. Remember my monitoring interval is short 200ms I think?
Apply the patch. Deliberately misconfigure min_rate to the lower value. Do nothing else. If it ends up unstable, increase both refractory periods by the same factor.
And do not, repeat, do not try to actually cause upload saturation. That patch will set the shaper to 100% of the achieved rate, and for egress that is not really a good idea... as achieved_rate >= bottleneck_rate, and on bufferbloat we need to set shaper <= bottleneck_rate.
My monitoring interval is 250 ms, so not a big difference. Again, do not try to replicate my config. The question is whether the change is compatible with your existing config, modified by decreasing min_rate.
The patch will set the upload shaper to 90% of the achieved rate by default. I have not removed achieved_rate_adjust_down_bufferbloat from the equation.
The patch will set the upload shaper to 90% of the achieved rate by default.
Thanks for the clarification.
Actually, I would appreciate if, instead of applying the patch literally, you implement both the old and the new way to calculate the new shaper rate, and DEBUG both, while actually applying the rate that my patch says to apply.
If you redo the patch, make sure to pass the direction into get_next_shaper_rate() and only apply these changes for ingress, as for egress they will actually do the wrong thing...
Thanks for the reminder. Out of curiosity, if I redo the patch, I will log both the right thing and the wrong thing anyway. But let me see what might be the result of the wrong way.
By default, we have:
achieved_rate_adjust_down_bufferbloat=0.9
shaper_rate_adjust_down_bufferbloat=0.9
Let's assume that achieved_rate_adjust_down_bufferbloat <= shaper_rate_adjust_down_bufferbloat, which is true both by default and in the configuration that sets shaper_rate_adjust_down_bufferbloat to 1 while leaving achieved_rate_adjust_down_bufferbloat at the lower value.
Let's denote achieved_rate_adjust_down_bufferbloat as C and shaper_rate_adjust_down_bufferbloat as D >= C.
On egress, the achieved rate is logically below the shaper rate, but, due to laggy measurements (and only for this reason), on paper, this is not guaranteed. Anyway, after any adjustment of the shaper rate, the script enters a refractory period, and during a refractory period, the shaper is not adjusted. So it is still relatively safe to assume, as a first approximation, that the lag does not apply, and achieved_rate_kbps < shaper_rate_kbps.
The old way is (back to floating point in pseudo-C):
adjusted_achieved_rate_kbps = achieved_rate_kbps * C;
adjusted_shaper_rate_kbps = shaper_rate_kbps * D;
shaper_rate_kbps = (adjusted_achieved_rate_kbps > min_shaper_rate_kbps &&
                    adjusted_achieved_rate_kbps < adjusted_shaper_rate_kbps) ?
                   adjusted_achieved_rate_kbps : adjusted_shaper_rate_kbps;
Let's simplify. We know that achieved_rate_kbps < shaper_rate_kbps and C <= D. Therefore, adjusted_achieved_rate_kbps < adjusted_shaper_rate_kbps. Therefore, the second part of the condition is always true on egress. So, effectively, in the right way we have:
adjusted_achieved_rate_kbps = achieved_rate_kbps * C;
shaper_rate_kbps = (adjusted_achieved_rate_kbps > min_shaper_rate_kbps) ?
                   adjusted_achieved_rate_kbps : (shaper_rate_kbps * D);
Now the wrong way:
adjusted_achieved_rate_kbps = achieved_rate_kbps * C; /* same as in the right way */
shaper_rate_kbps = (adjusted_achieved_rate_kbps > min_shaper_rate_kbps) ?
                   adjusted_achieved_rate_kbps : shaper_rate_kbps;
So effectively, the difference is that, if adjusted_achieved_rate_kbps < min_shaper_rate_kbps (e.g., if the upload is idle), in the right way, the egress shaper decays according to D, and in the wrong way, it doesn't. For non-idle upload (and thus for Discord), there is no difference. If D = 1, there is no difference at all.
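For anyone who wants to check this numerically, here is a minimal Python sketch of the two simplified rules as written in the pseudo-C above. It is not the actual cake-autorate code: the function names, the per-mille integer factors (used to avoid floating-point noise), and the example rates are assumptions made purely for illustration.

```python
# Hypothetical helper names; factors are in per-mille (900 == 0.9).

def right_way(achieved_kbps, shaper_kbps, min_kbps, c=900, d=900):
    """Simplified egress rule: fall back to shaper * D when achieved * C is too low."""
    adjusted_achieved = achieved_kbps * c // 1000
    if adjusted_achieved > min_kbps:
        return adjusted_achieved
    return shaper_kbps * d // 1000


def wrong_way(achieved_kbps, shaper_kbps, min_kbps, c=900):
    """Patched rule: keep the shaper unchanged when achieved * C is too low."""
    adjusted_achieved = achieved_kbps * c // 1000
    if adjusted_achieved > min_kbps:
        return adjusted_achieved
    return shaper_kbps


if __name__ == "__main__":
    shaper, minimum = 10000, 2000
    # Busy upload: both rules pick achieved * C.
    print(right_way(8000, shaper, minimum), wrong_way(8000, shaper, minimum))
    # Idle upload: only the first rule decays the shaper.
    print(right_way(500, shaper, minimum), wrong_way(500, shaper, minimum))
    # With D = 1 the two rules agree even when idle.
    print(right_way(500, shaper, minimum, d=1000), wrong_way(500, shaper, minimum))
```

With these made-up numbers the busy case gives 7200 for both rules, the idle case gives 9000 versus 10000, and setting D to 1 removes the difference.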
the achieved rate is logically below the shaper rate,
Not exactly, achieved_rate <= shaper_rate (if measured at shaper_egress, otherwise even achieved_rate > shaper_rate is possible). Also it is not only "laggy measurements" but the fact that we aggregate over epochs meaning any achieved_rate measurement can contain 2 shaper settings and resulting traffic rates. That in turn means that even for measurements at the shaper egress essentially achieved_rate <=> shaper_rate... as I keep repeating the achieved_rates are not precise and it is IMHO best to treat them as just that estimates with inherent uncertainty...
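To illustrate the epoch-aggregation point with made-up numbers, here is a small Python sketch. It assumes an idealised case where traffic runs exactly at the shaper rate and the shaper is changed halfway through one monitoring window; the function name and all values are hypothetical.

```python
def achieved_rate_kbps(segments):
    """Average rate over one monitoring window.

    segments: list of (duration_ms, rate_kbps) pieces that make up the window.
    """
    total_ms = sum(duration for duration, _ in segments)
    total_kbit_ms = sum(duration * rate for duration, rate in segments)
    return total_kbit_ms / total_ms


if __name__ == "__main__":
    # A 250 ms window: the first half at a 10000 kbps shaper setting,
    # the second half after the shaper was lowered to 5000 kbps.
    window = [(125, 10000), (125, 5000)]
    new_shaper_kbps = 5000
    measured = achieved_rate_kbps(window)
    print(measured, measured > new_shaper_kbps)
```

Here the measured rate is the average of the two settings, so it sits above the shaper rate in force at the end of the window even though the traffic never exceeded the shaper at any instant.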
So it is still relatively safe to assume, as a first approximation, that the lag does not apply, and achieved_rate_kbps < shaper_rate_kbps.
This is a bit of wishful thinking, let's model this cow as a sphere of 1m radius on a frictionless plane...
Anyway, I think I accept your request for a new policy option "trust ingress achieved_rates unconditionally" (I for one can safely leave that option disabled and ignore it, more adventurous minds or those on marginal links can do differently...), but let's do that correctly and not hack and slash with unexplained terms like right and wrong (seen as a policy issue there is no right or wrong).
I had no time to work on this. Therefore, today's test is just a retest with exactly the same parameters as yesterday, except that there were more people in the Discord call, and the latency target has been reduced:
dl_delay_thr_ms=100 # (milliseconds)
ul_delay_thr_ms=100 # (milliseconds)
Perhaps this is too aggressive for my LTE connection.
Unfortunately, the "unusually good state of the LTE link" objection to yesterday's results cannot be dealt with. Here is the speedtest result without SQM, which is, on paper, better than yesterday, but de facto, during the call, it was worse:
Waveform bufferbloat test got an F, as expected.
The meeting nevertheless was not 100% successful, because the link by itself was not good enough. Subjectively, the call went much worse than yesterday, but still not bad enough to turn the script off. There were 9 people, 4 of them with video. There were periods of extreme stuttering, so I missed some parts of the conversation (e.g. "we need to ask the cust... omer... which... <missed some words> they want"), but Discord recovered by itself after all such stuttering periods, which is the important point here. There were also some periods when Discord showed a red "RTC Disconnected" message while still delivering some voice and video with stuttering. There are also quite lengthy periods where the incoming bandwidth is significantly higher than 100% of the shaper rate, thus confirming the theory.
Perhaps the same logic of trusting the incoming rate as a known lower bound of the bottleneck bandwidth has to be extended to a no-bufferbloat case as well - but I do get the concerns about the control loop stability for the case when the control loop does work (for UDP media streams, the whole point is that it usually doesn't).
Results have been uploaded to pCloud.
Perhaps the same logic of trusting the incoming rate as a known lower bound of the bottleneck bandwidth has to be extended to a no-bufferbloat case as well
Not really, what you essentially implemented is not so much "trust the achieved rate" but more a "dynamic minimal rate where the controller disengages". As long as the achieved rate stays the same, the shaper rate will never decline to the minimal rate (as long as achieved_rate * factor > minimal_rate), hence "dynamic". Or put differently, I prefer "trust, but verify", which in essence is what min(achieved_rate * factor, shaper_rate * factor) gives us... But I think with bursty MACs like DOCSIS, DSL (with G.INP*), and WiFi, the achieved rate simply can vary a lot, and once we start low-pass filtering the achieved rate to deal with that artificial variability, we lose the temporal fidelity of responding to bufferbloat...
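A rough Python sketch of the "dynamic minimal rate" effect, under the stated premise that the achieved rate stays constant across repeated bufferbloat events. This is not the cake-autorate code: the function names, the per-mille factor, the clamp to the minimal rate in the min() variant, and all rates are assumptions for illustration.

```python
def trust_but_verify(achieved, shaper, min_rate, factor=900):
    """min(achieved * factor, shaper * factor), clamped to min_rate (assumed clamp)."""
    candidate = min(achieved * factor // 1000, shaper * factor // 1000)
    return max(min_rate, candidate)


def trust_achieved(achieved, shaper, min_rate, factor=900):
    """Patched behaviour: use achieved * factor whenever it exceeds min_rate."""
    candidate = achieved * factor // 1000
    return candidate if candidate > min_rate else shaper


def simulate(rule, events, achieved=8000, shaper=10000, min_rate=2000):
    """Apply the rule once per bufferbloat event with a constant achieved rate."""
    for _ in range(events):
        shaper = rule(achieved, shaper, min_rate)
    return shaper


if __name__ == "__main__":
    for events in (1, 5, 20):
        print(events, simulate(trust_but_verify, events), simulate(trust_achieved, events))
```

With these numbers the min() variant keeps stepping down and ends at the configured minimum of 2000 after 20 events, while the patched variant stays pinned at 7200 regardless of how many bufferbloat events occur.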
My take on this is more, if you used an actually meaningful minimal rate and delay thresholds that change should not be necessary.... and you would actually be able to easily predict controller behaviour.
Here, my interpretation of the "stuck on minimal rate" phenomenon is that this happens because, even at that rate, the evoked delay stays above the threshold and actively inhibits the controller from increasing the rate again. Is that interpretation correct, or is something else happening as well?
*) With G.INP, the upstream side (DSLAM) and the modem exchange data transfer units (DTUs) and use an ACK scheme in which the receiver tells the sender which DTUs to retransmit. E.g. on downloads, the modem tells the DSLAM which DTU(s) were received incorrectly. While requesting retransmission of the defective DTU, the modem will hold all DTUs with higher "sequence numbers" in its buffers to avoid introducing packet re-ordering. The DSLAM will now (iteratively) retransmit that DTU until the modem signals successful reception or until that DTU has aged out of the DSLAM's buffers. The modem will then disassemble all DTUs into packets in the same order they arrived at the DSLAM (with potentially missing packets if retransmission was not successful), and it will essentially release these packets at line rate. In a case with a bridged modem on a 1 Gbps link to a router, that burst arrives at 1 Gbps. Assuming for a second that the burst fits completely into our rate sampling window, we would see an achieved rate of 1 Gbps even if, say, the DSL link is only 250 Mbps or even just 100 Mbps (and even if it does not fully align with a sampling period, having such bursts in the byte accumulation is not helping with getting useful data). Following your proposal, we might end up setting the shaper to 1000 * 0.9 = 900 Mbps, hardly ideal for a 100 Mbps link. Sure, after the refractory period the achieved rate will be lower and we will drop from 900 to, say, 50 Mbps, but such extreme drops cause problems of their own...
So "trust the achieved rate" clearly cannot mean taking it at face value... At the very least, it needs to be compared to the maximum defined rate for our controller and potentially clamped to that value if larger (or some smoothing would need to happen)...
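As a hedged illustration of the clamping idea, here is a small Python sketch using the burst example from the G.INP footnote (a 1 Gbps burst measured on a 100 Mbps link). The function names, the per-mille factor, and the choice to clamp before applying the factor are assumptions; the real controller may well need smoothing instead.

```python
def shaper_from_achieved(achieved_kbps, factor=900):
    """Take the achieved rate at face value and scale it by the factor."""
    return achieved_kbps * factor // 1000


def shaper_from_clamped_achieved(achieved_kbps, max_rate_kbps, factor=900):
    """Clamp the achieved rate to the configured maximum before scaling it."""
    return min(achieved_kbps, max_rate_kbps) * factor // 1000


if __name__ == "__main__":
    burst_kbps = 1_000_000      # burst released at 1 Gbps Ethernet line rate
    max_rate_kbps = 100_000     # configured maximum for a 100 Mbps DSL link
    print(shaper_from_achieved(burst_kbps))
    print(shaper_from_clamped_achieved(burst_kbps, max_rate_kbps))
```

Without the clamp the burst yields a 900 Mbps shaper setting; with it, the result is bounded at 90 Mbps, which is still only as good as the configured maximum.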
To repeat myself achieved rates are not directly actionable estimators of the bottleneck rate and I consider it risky to treat them as such.
Here, my interpretation of the "stuck on minimal rate" phenomenon is that this happens because, even at that rate, the evoked delay stays above the threshold and actively inhibits the controller from increasing the rate again. Is that interpretation correct, or is something else happening as well?
Something else is happening as well. Namely, throttling the shaper does not help the sender to slow down, or does not help quickly enough, because it is a UDP media stream (so ACKs do not exist), and switching to a different quality/resolution is not always implemented or is slow. In other words, due to the lack of effective sender feedback, the shaper rate is completely ineffective as a bufferbloat control for ingress.
Namely, throttling the shaper does not help the sender to slow down
Talk to your application supplier; that is not standards-conforming behavior for traffic over the internet...
or does not help quickly enough, because it is a UDP media stream (so ACKs do not exist),
That is a red herring, sorry. UDP applications need to implement their own timely feed-back channel to adjust sending rates to network conditions. Sure TCP does that as part of the protocol, but the moment an application goes UDP the onus to do proper congestion control falls into the hands of the application developer (that said QUIC is UDP based and still implements congestion control of some sort).
In other words, due to the lack of effective sender feedback, the shaper rate is completely ineffective as a bufferbloat control for ingress.
Again, and very respectfully, such applications need to be dropped like hot potatoes; they are neither standards-conforming nor safe to use over the existing internet.
If your model of a 1.1 Mbps stream over a 1 Mbps bottleneck is true then you simply continue to DOS your self, not much that autorate can do there.
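A back-of-the-envelope Python sketch of that model, assuming a completely unresponsive sender, an unbounded queue, and no drops (all simplifications, and the function name is made up):

```python
def queue_delay_ms(send_kbps, drain_kbps, seconds):
    """Queueing delay after `seconds` of sending faster than the queue drains."""
    backlog_kbit = max(0, send_kbps - drain_kbps) * seconds
    return backlog_kbit * 1000 // drain_kbps


if __name__ == "__main__":
    # 1.1 Mbps stream into a 1 Mbps bottleneck for 10 seconds.
    print(queue_delay_ms(1100, 1000, 10))
    # Lowering the drain rate to 0.9 Mbps only makes the backlog grow faster.
    print(queue_delay_ms(1100, 900, 10))
```

Under these assumptions the delay grows by about 100 ms per second at a 1 Mbps drain rate, and faster still if the shaper is lowered, which is why the shaper rate alone cannot rescue such a stream.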
With that off my chest, back to a more productive discussion: are there any in-application toggles to lower the video quality/bandwidth demands? Either on your side or in the configuration of the service (cloud, your company's servers)?