This document presents issues of the RTO estimator used by CCID-2,
discusses the algorithm suggested by RFC 2988, contrasts it with recent
research proposals, and analyses the Linux TCP RTO estimator. Based on
this analysis, it suggests the Linux TCP estimator as a much better
solution.

    /* Given a new RTT measurement `RTT' */
    if (RTT is the first measurement made on this connection) {
        SRTT   := RTT
        RTTVAR := RTT / 2
        RTO    := SRTT + max(G, 2 * RTT)    /* G is clock granularity in seconds */
    } else {
        delta   := RTT - SRTT
        SRTT'   := SRTT + 1/8 * delta
        RTTVAR' := 3/4 * RTTVAR + 1/4 * |delta|
        RTO'    := SRTT' + max(G, 4 * RTTVAR')
    }

The primed variables in the above refer to the next values obtained for these variables. In the following we ignore the clock granularity G (Linux provides a clock granularity of up to 1ms).
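The update rule above can be sketched in Python as follows. This is only an illustration of the pseudocode, not a complete RFC 2988 implementation: the class name `Rfc2988Estimator`, the default granularity `g`, and the use of seconds as the unit are assumptions made for the example, and the RFC's lower bound on the RTO and its timer management are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rfc2988Estimator:
    """Illustrative sketch of the SRTT/RTTVAR/RTO update shown above."""
    g: float = 0.001               # clock granularity G in seconds (assumed value)
    srtt: Optional[float] = None   # None until the first RTT sample arrives
    rttvar: float = 0.0

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the resulting RTO."""
        if self.srtt is None:
            # First measurement on this connection.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # delta is computed against the old SRTT, as in the pseudocode.
            delta = rtt - self.srtt
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(delta)
            self.srtt = self.srtt + delta / 8
        return self.srtt + max(self.g, 4 * self.rttvar)


est = Rfc2988Estimator()
first_rto = est.update(0.100)    # first sample: RTO is about 3 * RTT
second_rto = est.update(0.200)   # larger sample: SRTT and RTTVAR both grow
```

Keeping the state in a small dataclass makes it easy to feed a recorded RTT trace into the estimator one sample at a time.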

Case (2a), delta >= 0, where |delta| = delta:

    SRTT'   := SRTT + 1/8 * delta
    RTTVAR' := 3/4 * RTTVAR + 1/4 * delta
    RTO'    := SRTT + 9/8 * delta + 3 * RTTVAR

Two cases can happen:

- RTO' > RTO when delta > 8/9 * RTTVAR
- RTO' <= RTO when delta <= 8/9 * RTTVAR

Case (2b), delta < 0, where |delta| = -delta:

    SRTT'   := SRTT + 1/8 * delta
    RTTVAR' := 3/4 * RTTVAR - 1/4 * delta
    RTO'    := SRTT - 7/8 * delta + 3 * RTTVAR

Two cases can happen:

- RTO' > RTO when delta < -8/7 * RTTVAR
- RTO' <= RTO when delta >= -8/7 * RTTVAR
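The two thresholds can be checked numerically. The following sketch uses exact rational arithmetic and compares the direct computation of RTO' (with G ignored) against the predicted thresholds 8/9 * RTTVAR and -8/7 * RTTVAR. The helper names `step`, `rto_grows` and `predicted` are made up for this illustration.

```python
from fractions import Fraction as F


def rto(srtt, rttvar):
    """RTO with the clock granularity G ignored."""
    return srtt + 4 * rttvar


def step(srtt, rttvar, rtt):
    """One RFC 2988 update; returns (SRTT', RTTVAR')."""
    delta = rtt - srtt
    return srtt + delta / 8, F(3, 4) * rttvar + F(1, 4) * abs(delta)


def rto_grows(srtt, rttvar, rtt):
    """True if the update increases the RTO, computed directly."""
    return rto(*step(srtt, rttvar, rtt)) > rto(srtt, rttvar)


def predicted(srtt, rttvar, rtt):
    """True if the thresholds derived above predict an RTO increase."""
    delta = rtt - srtt
    if delta >= 0:
        return delta > F(8, 9) * rttvar
    return delta < -F(8, 7) * rttvar


# Compare both predicates on a grid of RTTVAR values and RTT samples.
mismatches = [
    (v, k)
    for v in range(0, 21)
    for k in range(1, 801)
    if rto_grows(F(100), F(v), F(k, 4)) != predicted(F(100), F(v), F(k, 4))
]
```

On this grid `mismatches` stays empty, which is what the algebra predicts; fractions are used so that the boundary cases (delta exactly at a threshold) are not blurred by floating-point rounding.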

In [LS00] it is pointed out that when the sampled RTT suddenly decreases by a large amount, the RTO increases similarly (to account for an increased mean deviation). This case occurs when RTT < SRTT - 8/7 * RTTVAR; the increment is

    RTO' - RTO = -7/8 * delta - RTTVAR
               = 7/8 * (SRTT - RTT) - RTTVAR

Thus, when RTTVAR is small, the RTO increase is close to 7/8 of the absolute value of delta.
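A small worked example makes the size of this increase concrete. The numbers (SRTT = 100, RTTVAR = 4, new sample RTT = 20, all in msec) are chosen purely for illustration, and G is again ignored.

```python
from fractions import Fraction as F


def rfc2988_step(srtt, rttvar, rtt):
    """One RFC 2988 update; returns (SRTT', RTTVAR')."""
    delta = rtt - srtt
    return srtt + delta / 8, F(3, 4) * rttvar + F(1, 4) * abs(delta)


srtt, rttvar = F(100), F(4)   # stable path: small mean deviation
rtt = F(20)                   # sudden dip, well below SRTT - 8/7 * RTTVAR

rto_before = srtt + 4 * rttvar                # 100 + 16 = 116
new_srtt, new_rttvar = rfc2988_step(srtt, rttvar, rtt)
rto_after = new_srtt + 4 * new_rttvar         # 90 + 4 * 23 = 182
increase = rto_after - rto_before             # 66 = 7/8 * 80 - 4
```

Although the path became faster by 80 msec, the RTO grows by 66 msec, which matches 7/8 * (SRTT - RTT) - RTTVAR.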

When delta falls within the thresholds given in (2a) and (2b) above, the sampled mean deviation decreases, so that convergence towards a lower RTO value is intended. The thresholds depend on the alpha and beta weights (RFC 2988 terminology); the impact of changing these is analysed in [RKS07].

When the RTT sample exceeds the SRTT, and the resulting difference exceeds the threshold given by the mean deviation in (2a), then RTO should also increase.

When the RTT sample dips below the SRTT, an increase of the RTO is counter-intuitive. Below it is shown how Linux neutralises such increases.

    /* given a new RTT measurement `RTT' */
    RTT := max(RTT, 1)                      /* 1 jiffy sampling granularity */
    if (this is the first RTT measurement) {
        SRTT     := RTT
        mdev     := RTT/2
        mdev_max := max(RTT/2, 200msec/4)
        RTTVAR   := mdev_max
        rtt_seq  := SND.NXT
    } else {
        SRTT' := SRTT + 1/8 * (RTT - SRTT)
        if (RTT < SRTT - mdev)
            mdev' := 31/32 * mdev + 1/32 * |RTT - SRTT|
        else
            mdev' := 3/4 * mdev + 1/4 * |RTT - SRTT|
        if (mdev' > mdev_max) {
            mdev_max := mdev'
            if (mdev_max > RTTVAR)
                RTTVAR' := mdev_max
        }
        if (SND.UNA is `after' rtt_seq) {
            if (mdev_max < RTTVAR)
                RTTVAR' := 3/4 * RTTVAR + 1/4 * mdev_max
            rtt_seq  := SND.NXT
            mdev_max := 200msec/4
        }
    }
    RTO' := SRTT + 4 * RTTVAR

This code agrees in large parts with RFC 2988 and includes improvements; some of these are described in [SK02]. It was subsequently confirmed that this code agrees with the one used for the analysis and measurements in [RKS07].
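For experimentation, the pseudocode can be transcribed into Python roughly as follows. This is a sketch of the pseudocode in this document, not of the kernel source: the class name `LinuxRtoSketch` is made up, all times are in msec, one jiffy is assumed to be 1 msec, and the sequence-number test `after` is simplified to a plain comparison that ignores wraparound.

```python
from dataclasses import dataclass
from typing import Optional

MDEV_FLOOR_MS = 200 / 4   # the "200msec/4" floor from the pseudocode (50 msec)


@dataclass
class LinuxRtoSketch:
    srtt: Optional[float] = None   # None until the first RTT sample
    mdev: float = 0.0
    mdev_max: float = 0.0
    rttvar: float = 0.0
    rtt_seq: int = 0

    def update(self, rtt: float, snd_una: int, snd_nxt: int) -> float:
        """Feed one RTT sample (msec) and return the resulting RTO (msec)."""
        rtt = max(rtt, 1.0)        # assumption: 1 jiffy = 1 msec
        if self.srtt is None:
            self.srtt = rtt
            self.mdev = rtt / 2
            self.mdev_max = max(rtt / 2, MDEV_FLOOR_MS)
            self.rttvar = self.mdev_max
            self.rtt_seq = snd_nxt
        else:
            delta = rtt - self.srtt
            if rtt < self.srtt - self.mdev:
                # RTT dip: the error enters mdev with a much smaller weight.
                self.mdev = 31 / 32 * self.mdev + 1 / 32 * abs(delta)
            else:
                self.mdev = 3 / 4 * self.mdev + 1 / 4 * abs(delta)
            self.srtt += delta / 8
            if self.mdev > self.mdev_max:
                self.mdev_max = self.mdev
                if self.mdev_max > self.rttvar:
                    self.rttvar = self.mdev_max
            if snd_una > self.rtt_seq:      # simplified `after' test
                # Once per flight: let RTTVAR decay towards mdev_max.
                if self.mdev_max < self.rttvar:
                    self.rttvar = 3 / 4 * self.rttvar + 1 / 4 * self.mdev_max
                self.rtt_seq = snd_nxt
                self.mdev_max = MDEV_FLOOR_MS
        return self.srtt + 4 * self.rttvar


est = LinuxRtoSketch()
rto = est.update(40, snd_una=0, snd_nxt=10)    # first sample
rto = est.update(40, snd_una=5, snd_nxt=15)    # same flight, RTTVAR untouched
rto = est.update(40, snd_una=11, snd_nxt=20)   # new flight begins
```

With constant 40 msec samples the RTO in this sketch stays at 240 msec, because RTTVAR cannot fall below the 50 msec floor.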

After the first measurement, the RTO is therefore:

- RTO = 3 * RTT when RTT > 100 msec (i.e., RTO > 300msec),
- RTO = RTT + 200msec otherwise (i.e., 200msec < RTO <= 300msec).
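Both cases follow from the first-measurement branch, since RTTVAR = max(RTT/2, 50msec). A minimal sketch (the function name `initial_rto_ms` is hypothetical, times are in msec, one jiffy is assumed to be 1 msec):

```python
def initial_rto_ms(rtt_ms: float) -> float:
    """RTO after the first RTT sample, following the Linux pseudocode."""
    rtt_ms = max(rtt_ms, 1.0)              # assumption: 1 jiffy = 1 msec
    rttvar = max(rtt_ms / 2, 200 / 4)      # mdev_max floor of 50 msec
    return rtt_ms + 4 * rttvar             # SRTT + 4 * RTTVAR


rto_slow_path = initial_rto_ms(150)   # 3 * 150 = 450
rto_boundary = initial_rto_ms(100)    # 100 + 200 = 300
rto_fast_path = initial_rto_ms(40)    # 40 + 200 = 240
```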

The potential pitfall here is that the higher sampling rate reduces the variation: RTTVAR goes towards zero as the sampling rate increases. The outcome is that the RTO `collapses' into the RTT so that spurious timeouts become more likely, as pointed out in section 3.2 of [LS00].
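The collapse is easy to reproduce with the plain RFC 2988 update rule. The sketch below feeds `n` identical RTT samples after the first one and ignores G; the function name is made up for the illustration. With more samples per round-trip time, the same decay simply happens in less wall-clock time.

```python
def rto_after_constant_samples(rtt: float, n: int) -> float:
    """RFC 2988 RTO after the first sample plus n further identical samples."""
    srtt, rttvar = rtt, rtt / 2            # state after the first measurement
    for _ in range(n):
        delta = rtt - srtt                 # always 0 for a constant RTT
        rttvar = 0.75 * rttvar + 0.25 * abs(delta)
        srtt += delta / 8
    return srtt + 4 * rttvar


rto_start = rto_after_constant_samples(100.0, 0)    # 3 * RTT = 300
rto_later = rto_after_constant_samples(100.0, 40)   # only slightly above 100
```

Each further sample multiplies RTTVAR by 3/4, so after a few dozen samples the RTO is practically equal to the RTT.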

The Linux implementation avoids this pitfall by keeping track of the maximum mean deviation, using the rtt_seq, mdev, and mdev_max variables:

- mdev_max is always at least 50msec and only increased when mdev > mdev_max;
- RTTVAR is always greater than or equal to mdev_max.

The update

    if (mdev_max < RTTVAR)
        RTTVAR' := 3/4 * RTTVAR + 1/4 * mdev_max

is in effect only when 50msec <= mdev_max < RTTVAR in between updates of rtt_seq. To illustrate this:

- when the first measurement is taken, RTTVAR = mdev_max;

- until the first time that SND.UNA is `after' rtt_seq, RTTVAR >= mdev_max >= mdev;
- when SND.UNA is first `after' rtt_seq, RTTVAR stays at its value and mdev_max is reset to 50msec;
- in any subsequent round, until SND.UNA is again `after' rtt_seq,
  - if mdev <= mdev_max in between updates of rtt_seq, then mdev_max stays at 50msec,
  - when mdev_max has decreased with regard to its previous value stored in RTTVAR, then it is treated as if it were a regular error of RTTVAR that had been sampled once per flight (given by SND.NXT/SND.UNA) and not once per segment.

Frequent decay of the mean deviation due to a higher sampling frequency is thus blocked; only the long-term effect is tracked.

Linux neutralises the RTO increase caused by a sudden RTT decrease as described in [SK02]: whenever RTT < SRTT - mdev, the weights for the error of mdev are adjusted as follows:

    mdev' := 31/32 * mdev - 1/32 * delta

With the simplification that RTTVAR = mdev_max = mdev, we get

    RTO' := SRTT + 1/8 * delta + 31/8 * RTTVAR - 1/8 * delta
          = SRTT + 31/8 * RTTVAR

Since 31/8 is close to 4, RTO' stays close to RTO, and then decays towards lower values.
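The cancellation of the delta terms can be verified with concrete numbers. The sketch below applies the adjusted weights under the same simplification (RTTVAR = mdev_max = mdev) to an illustrative dip with SRTT = 100, RTTVAR = 4 and RTT = 20 msec; the function name `rto_after_dip` is hypothetical.

```python
from fractions import Fraction as F


def rto_after_dip(srtt, rttvar, rtt):
    """RTO' for an RTT dip, assuming RTTVAR = mdev_max = mdev."""
    if not rtt < srtt - rttvar:
        raise ValueError("only valid when RTT < SRTT - mdev")
    delta = rtt - srtt
    new_srtt = srtt + delta / 8
    new_mdev = F(31, 32) * rttvar - F(1, 32) * delta   # adjusted weights
    return new_srtt + 4 * new_mdev


srtt, rttvar, rtt = F(100), F(4), F(20)
rto_before = srtt + 4 * rttvar                 # 116
rto_after = rto_after_dip(srtt, rttvar, rtt)   # 90 + 4 * 51/8 = 231/2
```

Here the RTO moves from 116 to 115.5 msec, independent of how deep the dip is, whereas the unmodified RFC 2988 weights would raise it to 182 msec for the same sample.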

However, since mdev' affects RTTVAR only via mdev_max, and only when mdev' > mdev_max, the dip may be ignored entirely, so that RTO' converges towards the lower value due to the (longer-term) decrease of SRTT.

The Linux RTO estimator addresses the problems of a higher sampling rate, of the RTO collapsing into the RTT, and of RTO spikes when the RTT suddenly decreases. It has also fared well in the evaluation against other implementations presented in [RKS07]. It is therefore the RTO implementation of choice for CCID-2.

- [KM05] Kesselman, Alexander and Yishay Mansour. Optimizing TCP Retransmission Timeout. *Proceedings of The 4th International Conference on Networking (ICN'05)*, volume 2 of *LNCS*, pages 133-140. 2005. Springer.

- [LS00] Ludwig, Reiner and Keith Sklower. The Eifel Retransmission Timer. *ACM SIGCOMM Computer Communication Review*, 30(3):17-27, 7/2000.

- [RKS07] Rewaskar, Sushant, Jasleen Kaur and F. Donelson Smith. A Performance Study of Loss Detection/Recovery in Real-world TCP Implementations. *Proceedings of 15th IEEE International Conference on Network Protocols (ICNP-07)*. 2007.

- [SK02] Sarolahti, Pasi and Alexey Kuznetsov. Congestion Control in Linux TCP. *Proceedings of the FREENIX Track: 2002 USENIX Annual Technical Conference*, pages 49-62. 2002.