CCID-3 cutting off on poorer link conditions


This document discusses a case where CCID-3 completely cuts off a wireless connection.

1. Setup

Audio streaming (mp3 and ogg) was conducted using paraslash over the 2.4GHz ISM band.

The access point was a 2GHz AMD Mobile Sempron 3700+ with a D-Link System DWL-G122 802.11g USB adapter (ralink rt73), using hostapd as access point manager. The receiver was a Lenovo T60P with an Intel Core Duo T2400 (1.83GHz) and a 3945ABG (iwl3945) wireless interface.

Both systems had a 2.6.35+ kernel using the patches from the DCCP test tree at git://eden-feed.erg.abdn.ac.uk.

To get the most out of the connection, partial checksum coverage (covering only the header, not the payload) was used together with FEC encoding (groups of 30 packets, each group containing 10 data packets, utilizing the maximum packet size).
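
The FEC parameters above can be put into numbers with a small sketch. It assumes that each group of 30 packets carries 10 data packets plus 20 redundancy packets, that any 10 received packets of a group suffice to rebuild it (an erasure code of the Reed-Solomon kind), and that packet losses are independent. The last assumption in particular is optimistic for a wireless link, where losses tend to come in bursts. The function names are illustrative and not taken from paraslash.

```python
from math import comb


def fec_overhead(n: int, k: int) -> float:
    """Fraction of the transmitted packets that is redundancy."""
    return (n - k) / n


def group_recovery_probability(n: int, k: int, loss: float) -> float:
    """Probability that at least k of the n packets of a group arrive.

    Assumes independent losses and a code where any k packets suffice.
    """
    return sum(
        comb(n, r) * (1 - loss) ** r * loss ** (n - r)
        for r in range(k, n + 1)
    )


# Assumed reading of the setup: 30 packets per group, 10 of them data.
N_GROUP, K_DATA = 30, 10

if __name__ == "__main__":
    print(f"redundancy share: {fec_overhead(N_GROUP, K_DATA):.0%}")
    for loss in (0.01, 0.10):
        prob = group_recovery_probability(N_GROUP, K_DATA, loss)
        print(f"loss {loss:.0%}: group recoverable with probability {prob:.6f}")
```

Under these assumptions a group survives the loss of up to 20 of its 30 packets, so loss rates of 1% to 10% on their own would not be expected to silence the stream as long as packets are still being sent.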

The receiver was located close to the sender: between 20 centimeters and 2.5 meters.

2. CCID-3 behaviour

For a period of 10-30 minutes the streaming went without major problems, irrespective of format (mp3 uses small packets, ogg uses 'pages' of about 4k each). The worst disruptions in this period were short bursts of interference, evidently caused by corrupted payloads (passed to the audio decoder thanks to partial checksums).

Then suddenly a longer-term disturbance occurred: it began with holes in the audio stream, which became longer until the stream was cut off completely.

The following passages describe the condition and its causes in detail. The disturbance started at 650 sec and lasted until about 850 sec. The plots show an improvement afterwards, which was caused by moving the receiver from its distance of about 2.5 meters (including a thin brick wall) very close to the sender (about 20-30 centimeters).

2.1 Loss rate

The first thought was an increase in the loss rate, which indeed occurred to some extent as the following plot of the connection shows.

Figure: Loss rate of the connection

However, this was nothing much out of the ordinary. The link had been observed for a longer time and was known to exhibit a typical TFRC loss rate in the range of 1% to 10%.

2.2 Transmit rate

As shown in the following figure, the transmit rate dropped down to virtually 0 during this period.

Figure: Transmit rate

Since the loss rate p was greater than 0 during the connection, the allowed transmit rate at this stage was determined mostly by X_calc rather than by X_recv. But a loss rate of "merely" 10% could hardly have cut down the rate that much.

2.3 Evolution of RTT

The culprit for this behaviour was found in the RTT, which during this period also increased, from an average of about 1..2 milliseconds to 2.5 seconds.

Figure: RTT peaking at up to 2.5 seconds

2.4 The contribution of CCID-3 to bad performance

The cause of the suddenly high RTT is not totally clear. It is very likely that link-layer retransmissions, plus binary exponential backoff due to contention at the MAC layer, were involved. In the present case two or three other access points were active in the neighbourhood. Other possible sources of interference are microwave ovens, DECT wireless phones, and Bluetooth equipment.

In any case, the observed condition is a reproducible combination of link layer conditions which accompany a wireless link disturbance.

What makes the performance especially bad is that CCID-3 then starts to cut off. The combined increase of loss rate and RTT effectively kills the whole audio stream. This is further illustrated by the plot of the computed sending rate X_calc, which decreases with both the loss rate p and the RTT.

Figure: Computed allowed transmit rate X_calc

Zooming in on the period between 650 and 850 seconds shows that X_calc went down to zero, blocking transmission completely for about 200 seconds.
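
The dependence of X_calc on p and RTT can be reproduced with the TFRC throughput equation from RFC 5348, section 3.1. The following is a minimal sketch and not the Linux CCID-3 code: it uses the simplifications recommended by the RFC (b = 1 and t_RTO = 4 * RTT), and the packet size of 1400 bytes is an assumption, since the actual value used in the test is not stated here.

```python
from math import sqrt


def tfrc_x_calc(s: float, rtt: float, p: float, b: int = 1) -> float:
    """TFRC throughput equation (RFC 5348, section 3.1) in bytes per second.

    s:   segment size in bytes
    rtt: round-trip time in seconds
    p:   loss event rate, 0 < p <= 1
    b:   packets acknowledged by one ACK (the RFC recommends b = 1)
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("loss event rate p must be in (0, 1]")
    if rtt <= 0.0:
        raise ValueError("rtt must be positive")
    t_rto = 4.0 * rtt  # simplification recommended by RFC 5348
    denominator = (
        rtt * sqrt(2.0 * b * p / 3.0)
        + t_rto * (3.0 * sqrt(3.0 * b * p / 8.0)) * p * (1.0 + 32.0 * p ** 2)
    )
    return s / denominator


PACKET_SIZE = 1400  # bytes; assumed value for illustration only

if __name__ == "__main__":
    for rtt in (0.002, 2.5):  # RTT before and during the disturbance
        rate = tfrc_x_calc(PACKET_SIZE, rtt, p=0.10)
        print(f"RTT {rtt * 1000:7.1f} ms: X_calc = {rate * 8 / 1000:10.1f} kbit/s")
```

With t_RTO tied to the RTT, X_calc is inversely proportional to the RTT: at an unchanged loss rate of 10%, the rise from 2 milliseconds to 2.5 seconds alone reduces the allowed rate by a factor of 1250. For the assumed packet size this leaves roughly 8 kbit/s, which is too little to sustain, for example, a 128 kbit/s audio stream.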

The wireless equipment had previously been tested with UDP and TCP (http) streaming. In both cases the streaming performance more quickly recovered from disturbances: in particular there was not such a drastic knock-on effect as seen here with CCID-3.

(As an aside, CCID-3 was roughly correct in gauging the link speed: the 802.11g link is specified as 54 Mbps, and the average of 50-60 Mbps in the above plot is not too far off.)

3. What to do about it?

It is desirable that the transport protocol maximizes performance even under poor link layer conditions. In any case, having an extra penalty for an already poor connection is not a good idea.

The problem is a real-world one, since nowadays Ethernet cables are increasingly being replaced by wireless links, which bring interference with them (especially in densely populated urban areas, where multitudes of access points are in close vicinity).

Granted, CCID-3 was not developed with wireless links in mind.

The performance may still improve somewhat once the Linux code is converted to fully comply with RFC 5348 (at the moment it is still at the stage of rfc3448bis). However, there are serious doubts that this will solve the "cut off" problem, and hence doubts whether one really should use vanilla CCID-3 over WiFi.

A wireless extension to TFRC, such as the one proposed by TFRC Veno, seems a more promising approach.