Oscillation Prevention in CCID-3

1. Background

When a link is occupied only by a few flows (low level of `statistical multiplexing'), the sending rate of CCID-3 can start to oscillate as a result of buffers periodically filling and draining.

This problem had been identified by Joerg Widmer in section 3.6.4 of his MSc thesis as `sending rate oscillations':

"Consider an example where a TFRC flow is the only flow on an otherwise empty link with sufficient buffering. In the absence of loss events, the TFRC flow increases its rate. When the rate reaches the link bandwidth, the buffer starts to fill up and the RTT increases by the buffer delay. Eventually, the buffer will flow over and packets have to be dropped. The loss rate increases and the sender reduces its sending rate. This causes the buffer to empty and the buffer delay to decrease, which in turn leads to a rate increase. The result is a cycle of increase and decrease instead of a steady sending rate in such an environment."

The suggested solution was called `Inter-packet Space Modulation' (ISM).

A corresponding algorithm is specified in RFC 3448, 4.5. The algorithm consists of two parts:
  1. a moving average of the square root of each RTT sample;
  2. a modulation of the allowed sending rate X.
The moving average is computed as:
	If (no feedback has been received before)
R_sqmean = sqrt(R_sample);
Else
R_sqmean = 0.9 * R_sqmean + 0.1 * sqrt(R_sample);
The second part modifies t_ipi = s/X to become s/X_inst, where
	X_inst  =  X * R_sqmean/sqrt(R_sample);
Hence t_ipi is multiplied by X/X_inst = sqrt(R_sample)/R_sqmean, so that t_ipi increases when the RTT begins to grow.

2. General results

The first test was to verify what the algorithm does in a typical scenario. To emulate a bottleneck, a Token Bucket Filter was attached to the outgoing interface, reducing the link speed of 100 Mbps down to 1 Mbps:
tc qdisc add dev eth1 root tbf rate 1000kbit buffer 10000 limit 30000
The transmit queue length of eth1 was 1000 packets, which generates a noticeable amount of buffering under the given conditions.

2.1 It works

Consider first the plot of transmit rates when oscillation prevention was disabled:
Oscillation prevention disabled
The plot below was taken under the same link conditions, but with oscillation prevention enabled.
Oscillation prevention enabled
Note that the oscillation is not completely eliminated, but rather that period and amplitude are significantly reduced (in the later revisions of rfc3448bis the section name changed from "prevention" to "reduction").

2.2 Details underlying this scenario

Once the first loss occurs, the allowed transmit rate X is dominated by the computed transmit rate X_calc, which in turn is a function of the loss event rate p. Consider below the loss event rate when oscillation prevention is enabled:
Loss rate when oscillation prevention is enabled
Compare with the loss rate when oscillation prevention is disabled:
Loss rate when oscillation prevention is disabled
There is more (high-frequency) oscillation here, but this is not the main cause for oscillations of X_calc, shown next.
Computed allowed rate (throughput equation) when oscillation prevention is disabled
From the corresponding RTT plot, it is visible that the oscillation of X_calc is caused by periodic changes in the RTT:
RTT when oscillation prevention is disabled
The RTT here, in turn, depends on the pattern of filling and draining the buffers. The Oscillation Prevention (or Reduction) mechanism avoids that such a pattern evolves into an oscillation. Contrast the above with the RTT pattern when oscillation prevention is enabled:
RTT when oscillation prevention is enabled
This is notably quieter, and there is a stabilisation towards a long-term RTT of 220 milliseconds. Hence X_calc/X likewise stabilise:
Computed allowed rate (throughput equation) when oscillation prevention is enabled
The above shows that the algorithm is in principle able to smooth out oscillations.

3. Things to avoid

During further testing, there were two scenarios in which the Oscillation Prevention algorithm showed negative side-effects:
  1. inflating the RTT after slow-start;
  2. prolonging initial slow-start.

3.1  Why is the RTT suddenly so high?

The following side-effect was observed on an 802.11g WiFi link (between a Linksys WRT54G router and a laptop using the Intel 3945ABG chipset with the iwl3945 driver). First, without Oscillation Prevention the computed sending rate X_calc showed the expected behaviour.
Computed sending rate on a WiFi 54Mbps link
The link is able to sustain a DCCP rate of about 23-25 Mbps without problems. But with Oscillation Prevention enabled it was much less:
Wireless 802.11g link with Oscillation Prevention enabled
The reason is an abnormally high RTT (the average RTT under load was 50 msec):
RTT on an 802.11g wireless link with Oscillation Prevention enabled
The only exception in this behaviour is the low between 150 ... 200 milliseconds, coinciding with a lowered loss rate:
The loss rate decays between 150 ... 200 ms
What is the cause for the high RTT? With regard to the above, consider that the algorithm
  • increases t_ipi when the RTT increases;
  • decreases t_ipi when the RTT decreases.
While the former gives the buffers a chance to drain after reaching a high fill level, the latter interacts with
  • a decreasing RTT which indicates that buffers are emptying;
  • this causes the Oscillation Prevention algorithm to release more packets into the network;
  • so that the buffers are soon refilled;
  • this in turn increases the RTT;
  • too extreme increases of the RTT are smoothed out by the algorithm, hence just the average level raises.
Apart from keeping the RTT at a high level, the loss rate (above diagram) also does not really decay. The single exception is the dip between 150 ... 200 msec. After this dip, a state similar to the one before is reached.

3.2 Testing a modified algorithm

To test whether the decrease of t_ipi by the Oscillation Prevention algorithm was the cause of this behaviour, the algorithm was allowed to only increase t_ipi, but not to decrease it. This resulted in the following plots, which confirm the hypothesis:
RTT over 802.11g WiFi link when Oscillation Prevention was constrained to increasing t_ipi only
After the initial slow-start peak, the RTT stabilises at the average of 50 msec. As a consequence, X/X_calc are higher and more stablilised:
802.11g WiFi link when Oscillation Prevention was constrained to increasing t_ipi only
Finally, the loss rate also decays in the expected manner after the initial slow-start peak (compare with above):
Decay of the loss rate on a 802.11g WiFi link when Oscillation Prevention was constrained to increasing t_ipi only

3.3 The modified algorithm from a different perspective

The behaviour of the modified is now shown by zooming in on the slow-start phase, using the same WiFi link as before.

The start-up behaviour without Oscillation Prevention is shown first.
Slow-start when no Oscillation Prevention was used
Contrast this with the unmodified algorithm (Oscillation Prevention allowed to decrease t_ipi):
Oscillation Prevention allowed to decrease t_ipi
The computed rate X_calc (in red, shadowed by X) is unnaturally low and does not correspond to the rate X_recv measured at the receiver.

Next we show the effect of the modified Oscillation Prevention algorithm:
Modified Oscillation Prevention algorithm: not allowed to decrease t_ipi
The response of the modified algorithm is close the first diagram (Oscillation Prevention disabled), and X is now also better correlated with X_recv.

3.2 Interactions of the Oscillation Prevention algorithm with slow-start

The second side-effect of the algorithm was observed during slow-start. All plots in this section were also taken on the same WiFi link.

During slow-start the RTT naturally increases as the buffers are filling up. Once the first overflow causes a packet drop, CCID-3 reduces its sending rate, and the RTT then decays towards its normal average.

A zoom of the RTT during slow-start phase (Oscillation Prevention disabled) is shown below.
Slow-start with Oscillation Prevention disabled
Using the Oscillation Prevention algorithm during this phase has no obvious benefit, and leads to a much slower increase rate:
Naive Oscillation Prevention during slow-start
This can be avoided by disabling Oscillation Prevention during slow-start, as shown in the next plot.
RTT when Oscillation Prevention is disabled during slow-start
The response is now as before, the peak (when the loss occurs) is no longer delayed.

4. Conclusions

This page has presented a practical evaluation of the Oscillation Prevention algorithm, and has discussed two cases where using the algorithm had negative effects.

First, using the algorithm during slow-start delays the instant of the first loss, so that it took longer to reduce the sending rate.

Secondly, allowing the algorithm to decrease t_ipi (when the RTT decreases) had the effect of making things worse by quickly refilling the buffers that had just been emptied, causing a very high RTT on a wireless link.