This is the main document summarizing the inner workings of the CCID3
packet receive path. This consists of:
- RX history handling
- loss detection
- RTT sampling
- transfer of loss information from RX history to Loss Intervals database
The following UML diagram shows the packages which are updated by this
patch set.
Generally, when a new packet comes in, the CCID 3 packet reception
algorithm must do the following:
- check if a loss has occurred;
- perform RTT sampling on data packets as long as no loss has occurred
  (the RTT is needed to compute the first loss interval);
- send feedback periodically, once per RTT;
- send feedback when the value of `p' has changed;
- update s, p, and the number of bytes received since sending the last
  feedback.
Combining these requirements is not trivial; the following sections
document each functional aspect of the algorithm.
1. Overview
The following shows the main loop of CCID3 packet reception.
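In C-like form, the loop can be approximated by this sketch, pieced
together from the requirements listed in the introduction; all types,
fields and function names here are illustrative stand-ins rather than the
actual kernel identifiers:

  #include <stdbool.h>
  #include <stddef.h>

  /* Illustrative stand-ins; the real work is the subject of sections 2-8. */
  struct pkt  { size_t payload_len; bool is_data; };
  struct rxst { bool loss_pending; bool p_changed; bool rtt_elapsed; };

  /* Stubs standing in for the functionality of the following sections. */
  static void update_rx_history(struct rxst *rx, const struct pkt *p)  { (void)rx; (void)p; }
  static bool detect_loss(struct rxst *rx, const struct pkt *p)        { (void)p; return rx->loss_pending; }
  static void sample_rtt(struct rxst *rx, const struct pkt *p)         { (void)rx; (void)p; }
  static void update_s_and_bytes(struct rxst *rx, const struct pkt *p) { (void)rx; (void)p; }
  static void send_feedback(struct rxst *rx)                           { (void)rx; }

  static void rx_packet(struct rxst *rx, const struct pkt *pkt)
  {
          update_rx_history(rx, pkt);      /* RX history handling (sec. 2)          */

          if (detect_loss(rx, pkt))        /* check whether a loss has occurred     */
                  rx->p_changed = true;    /* a new loss changes p (sec. 6)         */
          else if (pkt->is_data)
                  sample_rtt(rx, pkt);     /* RTT sampling while loss-free (sec. 5) */

          update_s_and_bytes(rx, pkt);     /* update s and bytes_rcvd (sec. 4)      */

          if (rx->p_changed || rx->rtt_elapsed)
                  send_feedback(rx);       /* p changed, or about one RTT elapsed   */
  }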
2. General Principles
If no packet loss has been detected so far, the algorithm merely updates
the entry with the highest received sequence number so far (called
`nonloss.S' in the flowchart). It keeps a counter which counts the number
of packets received since a packet was last detected to be missing.
Detecting a missing packet works by comparing the current sequence number
against the highest received sequence number so far. We use a function
dist(A, B) which returns the sequence number distance between sequence
numbers A and B as a signed number. This makes it possible to detect
re-ordering and duplicates: if dist(A, B) <= 0, then the packet with
sequence number B is either re-ordered (`before' A) or B == A.
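DCCP sequence numbers are 48 bits wide (RFC 4340), so dist() has to
operate modulo 2^48. A minimal user-space sketch of such a function (not
the actual kernel implementation, and relying on arithmetic right shift of
signed values) is:

  #include <stdint.h>
  #include <stdio.h>

  /* Signed mod-2^48 distance between two 48-bit sequence numbers A and B. */
  static int64_t dist(uint64_t a, uint64_t b)
  {
          /* shift into the top 48 bits so the subtraction wraps correctly,
           * then shift back with sign extension */
          return ((int64_t)((b << 16) - (a << 16))) >> 16;
  }

  int main(void)
  {
          printf("%lld\n", (long long)dist(100, 103));  /* 3: B is ahead of A  */
          printf("%lld\n", (long long)dist(103, 100));  /* -3: B is re-ordered */
          printf("%lld\n", (long long)dist((1ULL << 48) - 1, 2));  /* 3: wrap  */
          return 0;
  }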
Now, a loss indication is given when D = dist(nonloss.S, S) > 1 and the
NDP count N of the current packet is such that N < D. Missing non-data
packets are not considered; this is in accordance with RFC 4342. The
effect of a loss indication is that the highest received sequence number
obtained up till then is stored. Flagging a new loss is accomplished using
index manipulation:
- loss_count = 1
- loss_prev  = hist[0]
- nonloss.S  = hist[loss_count]
This ensures that the highest sequence number received before the loss is
stored in hist[0], while hist[1] holds the overall highest received
sequence number so far (which is always updated). The loss_count is
incremented each time a packet is received whose sequence number S' is
such that dist(loss_prev.S, S') > 0. When loss_count = NDUPACK = 3, the
pending loss is promoted to a loss event.
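A much-simplified sketch of this scheme is shown below; the layout and the
names (hist[], loss_count, NDUPACK) follow the description above rather
than the actual kernel sources:

  #include <stdbool.h>
  #include <stdint.h>

  #define NDUPACK 3   /* packets needed to promote a pending loss */

  struct rx_entry { uint64_t seqno; };

  struct rx_state {
          struct rx_entry hist[NDUPACK + 1];
          int             loss_count;        /* 0 = no loss pending */
  };

  static int64_t dist(uint64_t a, uint64_t b)
  {
          return ((int64_t)((b << 16) - (a << 16))) >> 16;  /* as sketched above */
  }

  /*
   * Process one packet: hist[0] holds the highest sequence number received
   * before a pending hole, hist[1] the overall highest sequence number, and
   * loss_count the number of packets seen past the hole.  Returns true when
   * the pending loss is promoted to a loss event.
   */
  static bool rx_update_loss_state(struct rx_state *s, uint64_t seqno, uint8_t ndp)
  {
          int64_t d = dist(s->hist[0].seqno, seqno);

          if (s->loss_count == 0) {
                  if (d > 1 && ndp < d) {
                          s->loss_count    = 1;       /* flag a new (pending) loss   */
                          s->hist[1].seqno = seqno;   /* overall highest goes to [1] */
                  } else if (d > 0) {
                          s->hist[0].seqno = seqno;   /* no loss: just track highest */
                  }
                  return false;
          }

          if (dist(s->hist[1].seqno, seqno) > 0)
                  s->hist[1].seqno = seqno;           /* always update overall highest */

          /* count packets past the hole; NDUPACK of them promote the loss */
          return d > 0 && ++s->loss_count == NDUPACK;
  }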
3. Loss Detection
Loss detection needs to
- be robust against packet reordering;
- detect when holes are filled due to late arrival;
- handle bursts of losses;
- determine whether two losses belong to the same loss interval.
Comprehensive details are documented separately.
4. Handling Data Packets and Sending Feedback
Initial feedback is sent when the first data packet arrives. The payload
length of data packets is always tracked in the CCID 3 parameter `s';
also the number of bytes received in between feedbacks (variable
bytes_rcvd) is kept up to date. When a loss is pending
(1 <= loss_count < 3), the receiver does not recompute p or sample RTT
values. This is so for two reasons:
- the missing packet(s) may come in late, due to delay or re-ordering;
- when packets are missing, comparing window counter fields (CCVal) is no
  longer reliable - the counter, which has only 16 values, may have
  wrapped around in the meantime.
Computing the value of X_recv contained in feedback packets is described
later, in section 8.
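The per-packet bookkeeping described in this section can be sketched as
follows; the names and the structure are illustrative, and the RTT, p and
feedback steps are only hinted at:

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  struct ccid3_rx_sketch {
          uint32_t s;              /* packet size parameter `s'           */
          uint32_t bytes_rcvd;     /* bytes received since last feedback  */
          int      loss_count;     /* 0 = none, 1..2 = loss pending       */
          bool     feedback_sent;  /* has initial feedback been sent yet? */
  };

  static void rx_data_packet(struct ccid3_rx_sketch *rx, size_t payload_len)
  {
          if (!rx->feedback_sent) {
                  rx->feedback_sent = true;   /* first data packet: send initial feedback */
                  /* ... send feedback here ... */
          }

          rx->s = (uint32_t)payload_len;      /* track payload length in `s'     */
          rx->bytes_rcvd += payload_len;      /* accumulated for X_recv (sec. 8) */

          if (rx->loss_count > 0)
                  return;  /* loss pending: neither recompute p nor sample the RTT */

          /* ... otherwise: RTT sampling (sec. 5) and p recomputation (sec. 6) ... */
  }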
5. Sampling RTT Values
As long as the loss interval history is empty, p is 0 and hence need not
be recomputed. On the other hand, an RTT value is needed to compute the
length of the first loss interval (RFC 3448, 6.3.1). Since a CCID 3
receiver only has the CCVal field as RTT information, it must try to
obtain an RTT estimate before the first loss occurs, cf. RFC 4342, 8.1.
The finer details are documented separately.
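A rough user-space sketch of such CCVal-based sampling is given below. It
relies on the fact that the sender increments the window counter every
quarter RTT (RFC 4342, 8.1); the guard values are illustrative and not
taken from the kernel sources:

  #include <stdint.h>

  /*
   * Estimate the RTT from two data packets received at t_prev/t_now whose
   * window counters are ccval_prev/ccval_now: `delta' counter steps
   * correspond to roughly delta/4 round-trip times.
   */
  static uint32_t sample_rtt_us(uint8_t ccval_prev, uint32_t t_prev_us,
                                uint8_t ccval_now,  uint32_t t_now_us)
  {
          uint8_t delta = (ccval_now - ccval_prev) & 0xF;  /* mod-16 distance */

          /*
           * Only sample when the counter has advanced by at least a full RTT
           * (4 steps) but not so far that a 16-step wrap-around could have
           * gone unnoticed.
           */
          if (delta < 4 || delta > 8)
                  return 0;   /* no usable sample from this packet pair */

          return 4 * (t_now_us - t_prev_us) / delta;
  }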
6. Maintaining the Loss Intervals Database
The TFRC specification suggests an implementation with a fixed number of
n=8 entries, where the oldest entry is overwritten when a new entry is
added at the front. This calls for a ring-buffer implementation which is
maintained in a FIFO manner. It is again realised on top of a fixed
array, using index arithmetic. The basic principle is simple; details of
a user-space implementation are documented separately. The actual
in-kernel implementation is a bit more sophisticated: it uses on-demand
allocation so that applications running on loss-free links require less
memory.
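A minimal sketch of such a fixed-size FIFO ring buffer, built on index
arithmetic over a plain array (entry layout and names are illustrative):

  #include <stdint.h>

  #define NINTERVAL 8   /* fixed number of loss interval entries (n = 8) */

  struct loss_interval {
          uint64_t start_seqno;   /* first sequence number of the interval */
          uint32_t length;        /* sequence length of the interval       */
  };

  struct li_history {
          struct loss_interval ring[NINTERVAL];
          unsigned int         counter;   /* total intervals added so far */
  };

  /* The i-th most recent entry (i = 0 is the newest); assumes i < counter. */
  static struct loss_interval *li_entry(struct li_history *h, unsigned int i)
  {
          return &h->ring[(h->counter - i - 1) % NINTERVAL];
  }

  /* Adding a new entry at the front silently overwrites the oldest slot. */
  static void li_push(struct li_history *h, const struct loss_interval *li)
  {
          h->ring[h->counter % NINTERVAL] = *li;
          h->counter++;
  }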
7. Distinguishing Data and Non-Data Packets
RFC 4342 introduces the concept of data length of a loss interval, which
is the sequence length of a loss interval "minus the number of non-data
packets the sender transmitted during the loss interval" (sec. 6.1.1).
For the receiver, computing the data length of a loss interval constitutes
a challenge in that the data length can only be reliably computed if
certain conditions hold; in most cases the receiver is at best able to
compute a lower bound as an approximation of the data length, but will
lack the information to compute it exactly.
The RFC suggests that "the receiver can account for received non-data
packets by not including them in the data length, and for packets that
were not received, it may be able to discriminate between lost data
packets and lost non-data packets using DCCP's NDP Count option" (ibid.).
Hence the following necessary condition must be satisfied for a receiver
to compute the loss interval data length exactly:
- For each `hole' in the loss interval's lossy part, defined by the
  highest received sequence number S_i before the loss and the first
  packet, with sequence number S_j and NDP count N_j, received after the
  loss: N_j must account exactly for all non-data packets sent between
  S_i and S_j.
If this is not the case, then N_j is at best a lower bound for the number
of non-data packets between S_i and S_j. This is in particular the case
when the distance from S_i to S_j is large (more than one packet loss),
e.g. due to a link failure or a prolonged burst of losses. The receiver
cannot distinguish between single and multiple losses (RFC 4342, p. 25);
the only information it has available is the NDP count N_j of the first
packet received after the loss.
This uncertainty increases proportionally with the number of losses
occurring in one loss interval, since the total number of non-data packets
in that interval is the sum of all such NDP counts N_j.
The design decision has therefore been to make the conservative choice of
assuming that the NDP count in each loss interval is always zero. This
policy is in agreement with RFC 4342: "In the absence of better
information, a receiver MUST conservatively assume that every lost packet
was a data packet and thus must occur in some lossy part" (p. 12).
In addition, there are two more reasons supporting this design choice.
Firstly, in all situations where the connection from sender to receiver
consists mainly of streaming data packets, the NDP count will de facto be
low or even zero.
Secondly, in order to approximate a lower bound on the NDP count in a loss
interval, the receiver needs to track NDP and packet type information
throughout each of the three subsystems: (1) CCID3 packet reception,
(2) receiver history, and (3) loss interval history. The added
implementation complexity that this requires does not seem to be
adequately matched by the gains of obtaining an estimated lower bound of
the NDP count.
However, it may make sense to upgrade the implementation at a later stage,
if experience suggests that connections involving a large fraction of
non-data packets are not an exception. As a greatly exaggerated worst-case
analysis, suppose that in each loss interval the NDP count amounts to 50%
of the interval length. Then an I_mean based on a zero NDP count will be
twice as large as if it were based on the actual interval data length,
resulting in half the value of the estimated loss event rate
p = 1/I_mean. The upshot would be that the sender reacts more slowly to
congestion.
On the other hand, an NDP count of 50% in each loss interval is a
practically pathological case. Therefore, we stick for the moment with the
conservative assumption from above, leaving other choices as future
implementation options.
8. Computing X_recv
This section describes the computation of X_recv with respect to
[RFC 4342, 10.3] and [RFC 3448, 6], using the following terminology from
[RFC 3448, 6.2]:
- S_m - the sequence number of the last data packet that has been
  received;
- R_m - the last RTT estimate, taken at the time packet S_m was received.
We further refer to the window counter variable last_counter (a CCVal
value) defined in [RFC 4342, 10.3]. CCID3 differs from TFRC in that
(i) the sender does not provide explicit RTT estimates to the receiver
and therefore (ii) the receiver has no feedback timer. On the other hand,
CCID3 conforms to TFRC, and so the corresponding relationship needs to be
established; this is done in the following sequence:
(1) Periodic feedback is sent whenever the mod-16 distance from
    last_counter to the current packet's CCVal window counter value
    exceeds 3.
(2) In the absence of other/better information, we have to assume that
    the time interval between two packets which satisfy (1) is a (close)
    approximation of one RTT; hence we equate this time interval with
    R_m. This means that X_recv will be calculated as the number of bytes
    received since last_counter was last set, divided by R_m, the RTT
    estimate given by the arrival of a data packet satisfying (1). To
    keep track of the number of bytes received since the last feedback
    was sent, a variable bytes_rcvd is used.
(3) However, when sending feedback due to a change in parameters (a new
    loss occurs or p > p_prev), we do not have dependable information
    about the length of an RTT: in particular, the loss could have
    occurred immediately after sending the last feedback packet. Hence we
    cannot compute X_recv reliably, and therefore the best solution is to
    reuse the previous value of X_recv, as computed the last time
    periodic feedback was sent.
(4) Finally, the receiver-cached value of X_recv in (3) may be zero -
    e.g. due to a loss occurring early in a connection, after sending
    initial feedback as per RFC 3448, 6.3. There are two ways of solving
    this problem (of which the second one is used):
    - send feedback with p > 0 and X_recv = 0 (the previous value). At
      the sender, this would immediately bring down the sending rate to
      s/t_mbi (cf. RFC 3448, 4.3);
    - in the absence of better information, consider the time interval
      since last_counter was set (i.e. since the last feedback packet was
      sent) as the best possible approximation of R_m and compute X_recv
      over this time interval. This is still crude, but seems safe to do
      since the sending rate is limited from above by X_calc, no matter
      how high the estimate of X_recv is.
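As an illustration of condition (1), the mod-16 window counter distance
can be computed with simple unsigned arithmetic; the following is a
sketch, not the kernel code:

  #include <stdbool.h>
  #include <stdint.h>

  /* Periodic feedback is due once the current packet's CCVal is more than
   * 3 window counter steps (mod 16) away from last_counter. */
  static bool feedback_due(uint8_t last_counter, uint8_t ccval)
  {
          uint8_t dist = (ccval - last_counter) & 0xF;   /* mod-16 distance */

          return dist > 3;   /* i.e. at least one full RTT of counter steps */
  }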
To summarise: the byte counter (bytes_rcvd) is always reset whenever
feedback is sent (periodic or otherwise), and at the same time
last_counter is updated. Hence, when a packet satisfying (1) arrives, we
have counted the number of bytes over the period since last_counter was
last set; this period serves as an approximation of R_m, and in this
manner the computation of X_recv conforms to step (2) of [RFC 3448, 6.2].
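Put together, the summary amounts to roughly the following; names are
illustrative and microsecond timestamps are assumed for simplicity:

  #include <stdint.h>

  struct feedback_state {
          uint32_t bytes_rcvd;     /* bytes received since the last feedback  */
          uint32_t t_last_fb_us;   /* time at which last_counter was last set */
          uint8_t  last_counter;   /* CCVal when the last feedback was sent   */
          uint64_t x_recv;         /* cached X_recv, in bytes per second      */
  };

  /*
   * Periodic feedback, cases (1)/(2): the time since last_counter was set
   * approximates R_m, and X_recv = bytes_rcvd / R_m.
   */
  static void send_periodic_feedback(struct feedback_state *fb,
                                     uint8_t ccval, uint32_t now_us)
  {
          uint32_t r_m_us = now_us - fb->t_last_fb_us;   /* approximates R_m */

          if (r_m_us > 0)
                  fb->x_recv = (uint64_t)fb->bytes_rcvd * 1000000 / r_m_us;

          /* ... build and transmit the feedback packet here ... */

          /* bytes_rcvd is reset and last_counter updated on every feedback */
          fb->bytes_rcvd   = 0;
          fb->last_counter = ccval;
          fb->t_last_fb_us = now_us;
  }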