Running CCID-3 over loopback/localhost

Note: These notes are outdated; they were the result of a buggy implementation which caused the window counter not to be updated for RTTs of less than 4 microseconds. This bug has since been fixed. Please bear this in mind when reading: the text is not a real performance evaluation (someone should do this). Loopback is nevertheless interesting for performance tests, since it has a very small RTT and very high throughput.

1. Background

The following problem was reported on dccp@vger: when operating CCID-3 over loopback, after a while it would cause huge delays, eventually slowing the sender down to the absolute minimum of one packet per t_mbi = 64 seconds.

Adding a 1 millisecond delay remedied the problem:
tc qdisc add dev lo root netem delay 1ms
This page analyses what really happened here.

2. Measurements

An application-limited scenario was used with the following parameters: a constant packet size `s' of 22 bytes and a constant bitrate of 1.8 kbps (225 bytes/second). The loopback interface can easily transport packets at Gigabit speed (a quick test with iperf showed 1.75 Gbps on this Pentium-4), so the application sending rate will not saturate the link.

The CCID-3 measurements over loopback resulted in the following: as expected there was no congestion and hence no loss (p = 0). The bitrate X_recv measured at the receiver was a very stable 219 bytes per second (min. 224, max. 225, stddev. 32.52).

The interesting deviations from the expectations follow below. 

The average RTT was 8.3 microseconds (min. 3, max. 63, stddev. 12.5). We will consider the minimum of 3 microseconds as the "true" link RTT.
[Figure: Unmodified loopback: RTT]
There is a notable peak starting with the first packet; afterwards the RTT converges to the actual link RTT. The reason for the peak is that the first RTT measurement is taken from the Request/Response exchange. Since packet exchanges during connection setup take slightly longer than those in steady state, a higher value results. The slow decline is due to the fact that an exponentially weighted moving average is used: it takes a while to converge to the actual link RTT of 3 microseconds.
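RFC 3448, 4.3 updates the RTT estimate as R = q*R + (1-q)*R_sample with a recommended weight of q = 0.9. The following is only a plain-C illustration of that smoothing (not the kernel code), showing how slowly an estimate that starts at the 63-microsecond handshake sample approaches the 3-microsecond link RTT:

#include <stdio.h>

/* Illustration of the EWMA RTT smoothing of RFC 3448, 4.3:
 * R = q*R + (1-q)*R_sample, with the recommended weight q = 0.9.
 * All values are in microseconds. */
int main(void)
{
        double rtt = 63.0;            /* first sample: Request/Response exchange */
        const double q = 0.9;         /* EWMA weight */
        const double sample = 3.0;    /* subsequent samples equal the link RTT */

        for (int i = 1; i <= 40; i++) {
                rtt = q * rtt + (1.0 - q) * sample;
                if (i % 10 == 0)
                        printf("after %2d samples: RTT estimate = %5.2f usec\n", i, rtt);
        }
        return 0;
}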

The RTT changes coincide with the changes of X:
[Figure: Unmodified loopback: Allowed sending rate X]
The loopback interface has an MTU of 16436 bytes. With regard to the RFC 3390 initial sending rate this means, for the packet size s = 22:
        W_init/RTT  =  min(4*22, max(2*22, 4380)) bytes / 63 usec  =  88 bytes / 63 usec  =  11.17 Mbps
which is the value of the first point plotted above.
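As a cross-check, the initial rate can be computed directly from the RFC 3390 formula; the helper below is merely a worked calculation (not kernel code), with s in bytes and the RTT in microseconds:

#include <stdio.h>

static unsigned int min_u(unsigned int a, unsigned int b) { return a < b ? a : b; }
static unsigned int max_u(unsigned int a, unsigned int b) { return a > b ? a : b; }

/* RFC 3390 initial window in bytes: W_init = min(4*s, max(2*s, 4380)),
 * and the resulting initial rate W_init/RTT.  Illustration only. */
static double initial_rate_mbps(unsigned int s, double rtt_usec)
{
        unsigned int w_init = min_u(4 * s, max_u(2 * s, 4380));

        return w_init * 8.0 / rtt_usec;   /* bits per microsecond == Mbit/s */
}

int main(void)
{
        /* s = 22 bytes, initial RTT = 63 usec -> 88 bytes / 63 usec = 11.17 Mbit/s */
        printf("%.2f Mbit/s\n", initial_rate_mbps(22, 63.0));
        return 0;
}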

CCID-3 then performs a kind of slow-start until about 2 seconds. This is not a crystal-clear case of slow-start, so a more detailed analysis is necessary.

The Linux implementation uses revision 00 of draft rfc3448bis, where an idle sender is defined as one which has not sent anything for more than 2 RTTs. Since the RTT is low in this case, in each computation of X the sender is classified as "idle" and gets the benefit of being constrained by the RFC 3390 initial rate rather than by 2 * X_recv:
        min_rate = max(W_init/RTT, 2*X_recv)       =  max(4*22*8/RTT, 2*X_recv)
        X        = max(min(2*X, min_rate), s/RTT)  =  max(min(2*X, min_rate), 22*8/RTT)
Since the sender is application-limited, X_recv is near-constant and equals the application sending rate of 1800 bits per second. This gives the following simplification:
        X = max(min(2*X, max(W_init/RTT, 2*X_recv)), s/RTT)
          = max(min(2*X, max(4*s/RTT, 2*X_recv)), s/RTT)
          = max(min(2*X, max(4*22*8/RTT, 1800)), 22*8/RTT)
The inner `max' simplifies to W_init/RTT since this is greater than 1800 for all RTTs less than 391 msec. Hence
        X = max(min(2*X, 4*s/RTT), s/RTT)
          = max(min(2*X, 4*22*8/RTT), 22*8/RTT)
This means that a "classic" slow-start (where X' = 2*X) is performed only as long as
        s/RTT  <  2*X  <  4*s/RTT = W_init/RTT
which, with the initial RTT of 63 usec, gives the interval 1.4 Mbps < X < 5.6 Mbps. This is only a narrow stretch in the above diagram (in the dccp_probe log, this doubling of the sending rate happened between 0.826 and 1.022 seconds). As long as 2*X <= s/RTT, the s/RTT term dominates, so that X = s/RTT (in the log this was the case from the beginning up to 0.826 seconds). Lastly, consider the stage where 2*X >= W_init/RTT: since W_init/RTT is always greater than s/RTT, at this stage the value of X depends solely on W_init/RTT (in the log this happens from 1.022 seconds onwards).
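To make these three regimes explicit, the sketch below iterates the simplified update X' = max(min(2*X, W_init/RTT), s/RTT) derived above. It is a plain-C illustration of the formula, not the kernel code; rates are in bit/s and the RTT is held constant at its initial 63 microseconds:

#include <stdio.h>

#define S_BYTES  22                 /* packet size s                       */
#define W_INIT   (4 * S_BYTES)      /* RFC 3390 window, since 4*s <= 4380  */

static double dmin(double a, double b) { return a < b ? a : b; }
static double dmax(double a, double b) { return a > b ? a : b; }

/* Simplified rfc3448bis-00 update for an application-limited, idle sender
 * with p == 0:  X' = max(min(2*X, W_init/RTT), s/RTT).  Rates in bit/s.
 *   - 2*X <= s/RTT:           the s/RTT floor dominates, X' = s/RTT
 *   - s/RTT < 2*X < 4*s/RTT:  classic doubling, X' = 2*X
 *   - 2*X >= 4*s/RTT:         capped at X' = W_init/RTT = 4*s/RTT        */
static double update_x(double x, double rtt_sec)
{
        double w_init_rate = W_INIT * 8.0 / rtt_sec;
        double s_rate      = S_BYTES * 8.0 / rtt_sec;

        return dmax(dmin(2.0 * x, w_init_rate), s_rate);
}

int main(void)
{
        double rtt = 63e-6;     /* initial RTT of 63 usec */
        double x   = 0.0;

        for (int i = 0; i < 6; i++) {
                x = update_x(x, rtt);
                printf("X = %7.3f Mbit/s\n", x / 1e6);
        }
        return 0;
}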

The end of this phase was reached when the RTT had converged to the actual link RTT; X peaks at about 2 seconds. The peak value is 234.7 Mbps (reached at 2.391 seconds), and it is simply 4*s/RTT with the link RTT of 3 microseconds, i.e. 4*22*8/3 Mbps.

The t_ipi plot is the inverse of the X plot (t_ipi = s/X):
[Figure: Inter-packet Interval t_ipi]

It is useful to also plot the inverse of t_ipi, the allowed number of packets per second:

[Figure: Allowed number of packets per second]
This again shows the problem: after reaching the peak at about 2 seconds, the allowed number of packets per second gradually declines; in the reported case it eventually dropped to the minimum of one packet per 64 seconds (which corresponds to about 0.016 packets per second). This "dead" minimum is not reached in the above plot, but it is the asymptotic value.

3. What is the explanation?

Summarising the above observations, two culprits were found for this behaviour: frequent expiry of the nofeedback timer, and the absence of receiver feedback once the RTT value had reached the link RTT of 3 microseconds.

3.1 Reason for decline of X

The reason for the decline of X (and hence the increase of t_ipi) is the nofeedback timer: according to RFC 3448, 4.3, each time a feedback packet is received, the nofeedback timer is set to expire after
        max(4*RTT, 2*s/X)  =  max(12, 2*22*8/X)  microseconds   (with X expressed in Mbit/s)
When it expires, X is halved by modifying the cached value of X_recv. Revision 00 of rfc3448bis had the additional rule whereby X is halved directly if either (i) no feedback has been received yet or (ii) p == 0, according to
X = max(X/2, s/t_mbi)
where s/t_mbi is the asymptotic value of 1 packet in 64 seconds. Since in the given scenario p == 0, this is what happens (and this rule is still in place in revision 05 of rfc3448bis).
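A minimal sketch of that halving rule (an illustration of the formula above, not the kernel code; the function name is made up):

/* Halving rule applied on nofeedback-timer expiry when p == 0 or no feedback
 * has been received yet (rfc3448bis): X = max(X/2, s/t_mbi).
 * Illustration only; for s = 22 bytes the floor s/t_mbi is 2.75 bit/s,
 * i.e. one packet per 64 seconds. */
#define T_MBI_SEC  64.0   /* maximum back-off interval */

double halve_x(double x_bps, unsigned int s_bytes)
{
        double floor_bps = s_bytes * 8.0 / T_MBI_SEC;

        return (x_bps / 2.0 > floor_bps) ? x_bps / 2.0 : floor_bps;
}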

The nofeedback timer will expire every 12 microseconds as long as s/X < 6 microseconds, i.e. as long as X > 29.34 Mbps. Once X has been reduced below 29.34 Mbps, s/X > 2*RTT, so the next timer expiry happens after 2*s/X. In the X diagram above, the first time that X crosses 29.34 Mbps is between 1.02 and 1.11 seconds; but this is still during the semi slow-start phase, where X is increasing.

The second time that X passes 29.34 Mbps is at about 2.68 seconds into the trace, by which time it has already been halved three times (relative to the peak value of 234.7 Mbps). At this point X was 29.3 Mbps, so the next expiry occurs about 12 microseconds later. After halving, X has a value of 14.7 Mbps; the next expiry of the nofeedback timer thus comes after 2*22*8/14.7 = 24 microseconds, leading to further expiries after 48, 96, 192, ... microseconds.

An astute observer will notice that such a rapid decline of X does not appear in the diagram. The reason is that the Linux implementation clamps the nofeedback expiry interval to a user-configurable minimum which defaults to 100 milliseconds:
        actual_expiry_interval  =  max(user_config_value, max(4*RTT, 2*s/X))
This default value is incidentally close to the inter-packet gap (98 milliseconds) of the sending application. Hence the sending rate is halved approximately once per application send interval.
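A sketch of this scheduling rule (illustration only; the function and parameter names are made up, all times are in microseconds, X is in bit/s and s in bytes):

/* Expiry interval for the nofeedback timer as described above:
 * the RFC 3448 value max(4*RTT, 2*s/X), clamped to a user-configurable
 * minimum (default 100 ms).  Illustration only. */
unsigned long nofeedback_expiry_usec(unsigned long rtt_usec, double x_bps,
                                     unsigned int s_bytes, unsigned long min_usec)
{
        unsigned long two_s_over_x = (unsigned long)(2.0 * s_bytes * 8.0 * 1e6 / x_bps);
        unsigned long rfc_interval = (4 * rtt_usec > two_s_over_x) ? 4 * rtt_usec
                                                                   : two_s_over_x;

        return rfc_interval > min_usec ? rfc_interval : min_usec;
}

/* Example: RTT = 3 usec, X = 29.34 Mbit/s, s = 22 bytes, minimum = 100 ms
 * -> approximately 12 usec, clamped to 100000 usec (100 ms). */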

A further slow-down occurs when t_ipi = s/X > 50 msec, i.e. when X is less than 3.52 kbps: from that point on, 2*s/X exceeds the 100 millisecond minimum, so the nofeedback timer expires less often than once per 100 milliseconds. This can be observed in all the above diagrams as a change in gradient after about 4 seconds into the trace. Consider the X plot above: sampling points are taken each time a new datagram is sent (i.e. every 98 milliseconds); after 4 seconds the gradient changes, and one can see that two consecutive samples often still have the same X value, indicating that X now decreases at a slower rate.

3.2 Why is there no longer any feedback?

X will be halved as long as there is no feedback. And indeed, in the trace the receiver stopped sending feedback approximately at the time when the EWMA of the RTT had converged to the actual link RTT of 3 microseconds.

The reason for this is in the window-counter algorithm of RFC 4342: the receiver only sends feedback when the CCVal changes, and the algorithm uses the following expression:
quarter_RTTs = floor((current_time - last_WC_time) / (RTT/4))
Since the implementation uses microsecond resolution, RTT/4 truncates to zero for any RTT smaller than 4 microseconds, so the expression cannot be evaluated without a divide-by-zero; as noted at the top, the buggy implementation consequently did not update the window counter at all for such RTTs. The CCVal therefore never changes, there is no feedback, and the nofeedback timer continues to halve the sending rate X until it reaches the asymptotic value of s/t_mbi.
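The problem can be sketched as follows. This is a hypothetical helper, not the actual kernel function: times are in microseconds, and the update follows the window-counter scheme along the lines of RFC 4342, 8.1. For rtt < 4 the divisor rtt/4 truncates to 0, so either the division faults or, with a guard as below, the counter simply never advances:

#include <stdint.h>

/* Hypothetical sketch (not the kernel code) of the sender-side window-counter
 * update along the lines of RFC 4342, 8.1.  All times in microseconds.
 * For rtt < 4 the divisor rtt/4 truncates to zero: without the guard the
 * division would fault; with it, the counter never advances, the CCVal seen
 * by the receiver stays constant, and hence no feedback is triggered. */
uint8_t update_window_counter(uint8_t ccval, uint32_t now, uint32_t last_wc_time,
                              uint32_t rtt)
{
        uint32_t quarter_rtts;

        if (rtt / 4 == 0)           /* RTT < 4 usec: cannot compute quarter-RTTs */
                return ccval;

        quarter_rtts = (now - last_wc_time) / (rtt / 4);
        if (quarter_rtts > 0)       /* advance by at most 5, modulo 16 */
                ccval = (ccval + (quarter_rtts > 5 ? 5 : quarter_rtts)) % 16;

        return ccval;
}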

4. Counter-measure

Since the window-counter algorithm does not work for RTTs smaller than 4 microseconds, the obvious counter-measure is to clamp all RTT samples to a minimum of at least 4 microseconds.
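A minimal sketch of such a clamp (illustration only; the constant and function names below are made up):

#include <stdint.h>

/* Clamp each RTT sample (in microseconds) before it is used by the
 * window-counter algorithm; 4 usec is the smallest value for which RTT/4
 * does not truncate to zero.  Hypothetical names, illustration only. */
#define CCID3_MIN_RTT_USEC  4U

uint32_t clamp_rtt_sample(uint32_t rtt_sample_usec)
{
        return rtt_sample_usec < CCID3_MIN_RTT_USEC ? CCID3_MIN_RTT_USEC
                                                    : rtt_sample_usec;
}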

Below this is shown to work, by testing with an artificially inflated RTT; subsequently an equilibrium condition is computed.

4.1 Reaching stability by artificially inflating the RTT

Experiments were performed in which the RTT was artificially inflated by adding 0.5 to 1 msec of link delay, as described in section 1.
[Figure: Allowed sending rate X when RTT was inflated]

The steps can be traced in the 1ms log and the 2ms log. The following graphs are a consequence of this stabilisation (as they derive from X).

[Figure: Inter-packet-interval t_ipi when RTT was inflated]

[Figure: Number of packets per second allowed by CCID-3 when RTT was inflated]

4.2 Computing an equilibrium condition

The particulars of the given situation are:
  1. The inter-packet gap IPG of the sending application is such that IPG > 2*RTT. The sender thus qualifies as "idle between send()s" according to draft rfc3448bis-00, so that the new value X' is computed as
        X' = max(min(2*X, max(W_init/RTT, 2*X_recv)), s/RTT)
  2. The sending rate is application-limited, i.e. X_send < s/RTT. The receiver reports this as X_recv = X_send < s/RTT.
  3. Since X_recv < s/RTT, and hence 2*X_recv < 4*s/RTT = W_init/RTT, we can simplify
        X' = max(min(2*X, W_init/RTT), s/RTT)
This can be read as the recurrence of a series; the value towards which it converges depends on the starting value:
  1. When the start value X >= W_init/(2*RTT) then the series converges towards W_init/RTT.
  2. When the start value X < s/(2*RTT) then the series converges towards s/RTT.
What has not been considered is that the nofeedback timer also interferes with this processing when 4*RTT is smaller than the 100 msec minimum expiry interval, which in turn is approximately equal to the IPG (as is the case above). Case 2 is therefore the likely outcome, and it is indeed observable in the above diagrams: s/RTT is 176 kbps for an RTT of 1 msec (0.5 msec added delay) and 88 kbps for an RTT of 2 msec (1 msec added delay).
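The quoted equilibrium rates are easy to verify; the short program below is just a worked check of s/RTT (and, for comparison, W_init/RTT) for the two inflated RTTs:

#include <stdio.h>

/* Worked check of the equilibrium values quoted above (illustration only). */
int main(void)
{
        const double s_bits      = 22 * 8;         /* packet size in bits          */
        const double w_init_bits = 4 * 22 * 8;     /* RFC 3390 window (88 bytes)   */
        const double rtts_sec[]  = { 1e-3, 2e-3 }; /* 0.5 ms and 1 ms added delay  */

        for (int i = 0; i < 2; i++)
                printf("RTT = %.0f ms:  s/RTT = %3.0f kbit/s,  W_init/RTT = %3.0f kbit/s\n",
                       rtts_sec[i] * 1e3, s_bits / rtts_sec[i] / 1e3,
                       w_init_bits / rtts_sec[i] / 1e3);
        return 0;
}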