Problems with TFRC packet scheduling
====================================

We are having implementation difficulties in making the packet scheduling
in CCID 3 both work smoothly _and_ stay up to spec with regard to
[RFC 3448]. Three problems have been identified and are described in
detail below. They can be summarised as two conceptual problems:

 (a) Packet scheduling can only be controlled up to a maximum rate which
     depends on the scheduling granularity of the OS; beyond this rate,
     packets are sent uncontrolled (at top speed);

 (b) The system as described in [RFC 3448, 4.6] works like a Token Bucket
     Filter with an undefined bucket size; the absence of a defined bucket
     size leads to unbounded `credits' which entail arbitrarily large
     bursts.


PROBLEM I:
	When basing rate regulation on OS packet scheduling, the
	granularity t_gran of the OS scheduler imposes a maximum
	controllable speed X_crit. Beyond this X_crit, the sending rate
	can no longer be controlled and depends solely on the
	capabilities of the hardware.

Cause for Problem I:
--------------------
Assume that no loss occurs. Then

	X = max(min(2*X, 2*X_recv), s/R)

and so

	t_ipi = s/X <= R

Let t_gran be the OS scheduling granularity (in Linux, t_gran = 1/HZ).
When using the mechanism described in [RFC 3448, 4.6], rate control is
based on delaying packet sending by putting the application to sleep for
up to t_ipi time units. When t_ipi < t_gran, there are two possibilities:

 (i)  not let the application sleep         ->  send immediately
 (ii) conservatively assume t_ipi = t_gran  ->  always delay

Both choices meet at the same point, characterised by t_ipi = t_gran.
From t_ipi = s/X_crit = t_gran we obtain a `critical speed'

	X_crit = s/t_gran	(in Linux: X_crit = s * HZ)

When going faster than X_crit, t_ipi < t_gran; when going slower than
X_crit, t_ipi > t_gran, which is not critical.
Consequences:
 (i)  using case (i) above means that there is no control beyond X_crit;
 (ii) using case (ii) above means that X is always bounded by X_crit.

To illustrate with a numerical example: when t_gran = 1ms (which is
`good'), a packet size s of 32 bytes means an X_crit of 256 kilobits/sec,
and a packet size of 1460 bytes means an X_crit of 11.68 megabits/sec.

The following two problems are related and were identified by Ian
McDonald, who posted them a while ago; they illustrate a second problem
of OS-based packet scheduling. They are important and clearly need
further clarification.

PROBLEM II:
	The application sends a packet at time t_0 and goes idle for a
	duration t_idle >= 2 * t_ipi. When resuming activity, the
	application is then permitted to send a burst of
	beta = floor(t_idle / t_ipi) - 1 packets immediately. The
	problem is that the burst size beta is unbounded.

Cause for Problem II:
---------------------
Packet sending times, as described in RFC 3448, are calculated from the
previous sending time and the value of t_ipi. When idling for too long,
this accrues a `negative credit', which can be calculated as follows.

Assume that the last packet before the idle phase was sent at time t_0.
Then, according to [RFC 3448, 4.6], the next sending time t_1 has the
nominal value

	t_1 = t_0 + t_ipi

Let t_idle = n * t_ipi + t_rem be the time during which no packets are
emitted, with t_rem < t_ipi. When the application resumes sending
activity at time t_now = t_0 + t_idle, it can send a packet immediately,
since

	t_now - delta = t_1 + (n-1) * t_ipi + t_rem - delta > t_1

Assuming for the moment that t_ipi stays constant, it takes n-1
iterations to clear this negative credit before normal (i.e. delayed)
packet scheduling resumes.
PROBLEM III:
	The application emits packets at a rate alpha which is less than
	the packet rate X/s currently permitted by TFRC: alpha < X/s.

Consequences of Problem III:
----------------------------
In this case, since 1/alpha > s/X = t_ipi, a `credit' similar to the one
in Problem II accrues, accumulating in increments of 1/alpha - t_ipi each
time a packet is sent out. The problematic case arises when the
application increases its rate alpha by some drastic amount: then a burst
of length proportional to the accrued credit can be sent without any
delay.

Combined Analysis for Problems II and III
-----------------------------------------
Problems II/III can be described in terms of a Token Bucket Filter with
bucket size beta equal to the accrued credit of packet backlog. As in the
standard Token Bucket Filter, tokens are placed in the bucket at a rate
of one per t_ipi, i.e. rho = 1/t_ipi. Under `normal' conditions,
0 <= beta <= 1, i.e. at most one packet is sent at each scheduled time.
When however beta > 1, as is the case in Problems II/III, beta packets
can be emitted immediately.

The problem is that there is no defined upper bound for beta: it can grow
arbitrarily large. This covers both mere TCP-unfriendliness and possible
network overload due to congestion.

The challenge now is to find an upper bound for beta. When the sender is
in slow-start, X >= s/R is not bounded and cannot be used as an upper
limit on the burst length. Another idea would be to set beta = MSS/s.