Evaluation of a Rate Mismatch Controller for CCID-3

This text describes the implementation of the Rate Mismatch Controller developed by L. De Cicco and S. Mascolo in their 2009 ICNP paper

Luca De Cicco, Saverio Mascolo
"A Mismatch Controller for Implementing High-Speed Rate-based Transport Protocols"
Proceedings of ICNP 2009, pp. 244 - 253
Princeton, NJ, USA, October 2009, DOI: : 10.1109/ICNP.2009.5339678

Since the paper concerns problems of RFC 5348, it seemed worth the effort of adapting and testing. The initial hope was this would allow to dispense with high-res timers.

1. Integration of the algorithm into the Linux CCID-3 implementation

Details on how the algorithm was adapted for Linux (in particular moving from an asynchronous to a synchronous implementation and avoiding problems such as overflow), are described in a separate text document here.

The initial algorithm is available as a single patch, described in the next section.

2. Test setup: sanity tests

In order to compare "old" and "new" send control implementation, a module parameter 'X_set' was added which, when set to a value > 0, fixes the allowed sending rate X_Bps to the given value (in bits-per-second), completely bypassing the usual rate control of the sender.

For given expected sending rates X_set in {100kbps, 1Mbps, 2Mbps, 5Mbps, 10Mbps, 20Mbps, 50Mbps, 100Mbps, 200Mbps}, the effective sending rates were then measured in user-space using simple iperf connections on the localhost.

The test setup thus consisted of three parts:

A patch to pass the expected sending rate xset to the CCID-3 module.
The proof-of-concept implementation of the Rate Mismatch Controller.
A script to run the tests on the local host.

The patches (1,2) apply on top of the test tree available at git://eden-feed.erg.abdn.ac.uk/dccp_exp [subtree 'dccp']

The test machine was a Thinkpad T60 1.83Ghz, all iperf test runs reported in this documented had a duration of 20 seconds.

3. Reference data: original implementation

In the first step the X_set patch is used, evaluating the effective sending rate achieved with the existing implementation under different conditions.

3a) Using different values of HZ with high-res timers

The current implementation of CCID-3 relies on high-resolution timers to function properly and accurately.

The actual values were directly copied from iperf output, which does smart rounding, i.e. 101 M should read 101.00 Mbps.

X_set	HZ=100	HZ=250	HZ=300	HZ=1000
100 kbps	107 k	107 k	107 k	107 k
1 Mbps	1.01 M	1.01 M	1.01 M	1.01 M
2 Mbps	2.02 M	2.02 M	2.02 M	2.02 M
5 Mbps	5.04 M	5.04 M	5.04 M	5.04 M
10 Mbps	10.1 M	10.1 M	10.1 M	10.1 M
20 Mbps	20.1 M	20.1 M	20.1 M	20.1 M
50 Mbps	50.5 M	50.5 M	50.5 M	50.5 M
100 Mbps	101 M	101 M	101 M	101 M
200 Mbps	203 M	203 M	203 M	203 M

The results are as expected: since high-resolution timers are used as clock source, the achieved rates remain independent of the selected value for HZ.

3b) Using different values of HZ with clocksource=jiffies

The normal DCCP code does not allow to use CCID-3 with a low-resolution clocksource, since this causes RTTs to be rounded down, which in turn triggers bugs resulting from RTT values lower than the time resolution. The patch bypasses this problem by rounding a zero RTT up to 1.

With this trick the second experiment tested the sensitivity of the current implementation to changes in clock resolution, by using clocksource=jiffies in the kernel commandline and recompiling the kernel with different values of HZ.

X_set	HZ=100	HZ=250	HZ=300	HZ=1000
100 kbps	232 k	183 k	107 k	107 k
1 Mbps	1.07 M	1.04 M	1.01 M	1.03 M
2 Mbps	2.08 M	2.04 M	2.03 M	2.04 M
5 Mbps	5.09 M	5.15 M	5.05 M	5.06 M
10 Mbps	10.1 M	10.1 M	10.1 M	10.1 M
20 Mbps	20.2 M	20.2 M	20.1 M	20.2 M
50 Mbps	50.5 M	50.5 M	50.5 M	50.5 M
100 Mbps	101 M	101 M	101 M	101 M
200 Mbps	203 M	203 M	203 M	203 M

This shows that even with a low clock resolution, comparable results can be achieved. The differences are notable only for coarse resolution (HZ < 300) and small sending speeds.

4. Rate Mismatch Control: jiffies or ktime_t?

The vision of dropping high-res timer dependency in CCID-3 was the initial drive behind these tests. But to see how effectively this can be done, the Rate Mismatch Controller algorithm was tested both with low and high resolution timers.

4a) RMC with simplest timebase: jiffies

This test only used jiffy differences (converted to milliseconds) as timebase; corresponding to #define _RMC_USE_JIFFIES 1 in the patch.

X_set	HZ=100	HZ=250	HZ=300	HZ=1000
100 kbps	112 k	112 k	111k	109 k
1 Mbps	1.01 M	1.01 M	986 k	1.01 M
2 Mbps	2.01 M	2.01 M	1.88 M	2.01 M
5 Mbps	5.01 M	5.01 M	4.59 M	5.01 M
10 Mbps	10.0 M	10.0 M	9.14 M	10.0 M
20 Mbps	20.0 M	20.0 M	18.4 M	20.0 M
50 Mbps	50.0 M	50.0 M	46.1 M	50.0 M
100 Mbps	99.8 M	99.9 M	91.5 M	100 M
200 Mbps	200 M	200 M	184 M	200 M

Most of the effective sending rates matched the expected values. The exception is HZ=300,where the error can be as high as 8.5%. This prompted further tests.

4b) RMC with ktime_t + high-res timers

The question was whether ktime_t + high-resolution timers improved the accuracy of RMC, as in fact it did (#define _RMC_USE_JIFFIES 0 in the patch):

X_set	HZ=100	HZ=250	HZ=300	HZ=1000
100 kbps	112 k	111 k	112 k	112 k
1 Mbps	1.01 M	1.01 M	1.01 M	1.01 M
2 Mbps	2.01 M	2.01 M	2.01 M	2.01 M
5 Mbps	5.00 M	5.01 M	5.01 M	5.01 M
10 Mbps	10.0 M	10.0 M	10.0 M	10.0 M
20 Mbps	20.0 M	20.0 M	20.0 M	20.0 M
50 Mbps	50.0 M	50.0 M	50.0 M	50.0 M
100 Mbps	99.8 M	100 M	99.9 M	99.9 M
200 Mbps	200 M	200 M	200 M	200 M

The results are consistent: accuracy and reliability are bought at the expense of high-res timers - as in the current implementation.

4c) RMC with ktime_t + clocksource=jiffies

The final test is analogous to the second one made for the current implementation, to evaluate the sensitivity towards clock resolution.

X_set	HZ=100	HZ=250	HZ=300	HZ=1000
100 kbps	1.4 M	623 k	534 k	357 k
1 Mbps	2.31 M	1.52 M	1.44 M	614 M
2 Mbps	3.31 M	2.52 M	2.43 M	586 M
5 Mbps	6.30 M	5.52 M	5.44 M	595 M
10 Mbps	11.3 M	10.5 M	10.4 M	594 M
20 Mbps	21.3 M	20.5 M	20.4 M	20.3 M
50 Mbps	51.2 M	50.4 M	50.4 M	593 M
100 Mbps	101 M	100 M	100 M	590 M
200 Mbps	201 M	200 M	200 M	600 M

This shows a higher dependency on high-res timers than the current implementation. For HZ=1000 and clocksource=jiffies, the controller is no longer able to keep the output under control. A similar case happened in a separate test for X_set = 100 Mbps and HZ=100 where an effective sending rate of 396 Mbps was observed.

5. Conclusion

This text has discussed a proof-of-concept implementation of the Rate Mismatch Controller and done some simple sanity tests.

The key results of the evaluation are

the current implementation is less sensitive to clock resolution than previously believed
the RMC implementation is much more sensitive to clock resolution than anticipated.

Perhaps there are things that can be made better in the implementation, but the main problem is that the mismatch controller can only be as accurate as its input, i.e. inaccuracies of time measurement propagate as inaccuracies of output sending rate. Hence, the RMC still needs high-resolution timers (or extremely long sampling times to filter out errors).

The algorithm has interesting theoretical possibilities but as it seems, these only make sense when implemented in conjunction with high-resolution timers. Hence for the moment there is no clear advantage in using the algorithm. In addition, it can not be used to directly replace the one in RFC 5348, 4.6, since points such as limiting the burst size and keeping the algorithm responsive to changes in the available bandwidth are not considered in the current, basic form of the algorithm (a tentative list is at the end of this text).