Feature negotiation implementation notes

The solution chosen for the implementation is to support feature negotiation during connection setup, according to a scheme worked out in detail here, but not to support feature negotiation in the middle of an established connection at the moment.

1. Basic considerations

The following reasons have led to discouraging support for anytime-negotiation of traffic parameters. In the subsequent section, we introduce an exception to this decision, to support passing of parameters during an established connection.

1.1 Why not support changing of parameters in the middle of a connection

The distinction is between the setup of a connection and a fully established connection.

In a fully established connection, both endpoints have already performed feature negotiation, and each endpoint is in the STABLE state (6.6.2) of feature negotiation for every feature affecting the two half-connections. During connection setup, endpoints either choose to use the defaults (6.4) or perform initial feature negotiation. Initial feature negotiation is not specified in great detail in RFC 4340; the implications are worked out here. The initial feature negotiation ensures that both endpoints enter the STABLE state for every feature before the actual data transfer begins.

For already-established connections, changing traffic parameters means leaving the STABLE state for one or more features affecting the data transfer, and re-entering the feature negotiation phase. Since feature parameters affect the way data is transported, data transfer needs to be disabled for the period during which feature negotiation is performed (see the discussion of initial feature negotiation for examples). As a consequence, sufficient buffering capacity needs to be provided to bridge this period.

Secondly, if packets carrying Change options get lost, additional error handling is required for the case where one endpoint has suspended traffic due to entering its negotiation phase, while the other endpoint is still ignorant of this change and continues to send data traffic (possibly at a high rate).

Thirdly, retransmission of feature-negotiation packets requires the integration of another retransmission timer into the existing framework. The retransmissions need to consider both the current protocol state (e.g. avoiding activity when the timer fires in the CLOSING state) as well as the activity of the existing set of timers (such as the delayed-ack timer), which is quite complex to do.

Fourth, feature negotiation packets are controlled by the TX CCID, which causes additional considerations when
The consequence is again an increased demand for buffering capacity for the period during which feature negotiation is performed.

Lastly, if the TX queue is still full at the instant of entering the feature negotiation, then another problem arises: before activating the new parameters (which are intended for the future packets, not the past packets still sitting in the TX queue), the TX queue may have to be drained. This could either take a long while (and thus unduly extend the feature-negotiation phase), or it would require equipping the sender/receiver sides with two different sets of rules - one for the `old' packets (belonging to the pre-negotiation phase), and another for the `new' packets (belonging to the post-negotiation phase). This makes processing very complex.

The conclusion, therefore, is that while switching traffic parameters during an established connection may not be entirely impossible, it is very complex to support; and hence a simpler solution is desirable.

If an application really must change parameters in the middle of a connection, an alternative (which always exists) is to tear down the old connection and begin another one which uses the new parameters. This takes only a few lines of application code.

1.2 Why support for changing parameters is still necessary

There is one reason to make an exception to the above, and this is the Ack Ratio feature of CCID2: the specification relies on the feature-negotiation mechanism to pass Ack Ratio changes from sender to receiver (RFC 4341, 6.1.2). We can generalise this, since Ack Ratio is an NN feature (RFC 4340, 11.3), for NN features have a less complicated state diagram (the receiver of a Change option does not change state).

The consequence for the implementation now is that negotiation of SP features, as well as features which have implicit dependencies (the choice of CCID for instance implies values for the Ack Ratio and Send Ack Vector feature), is restricted to the initial handshake.

Non-negotiable features, on the other hand, may be `negotiated' any time but will be treated as one-shot message passing: the sender will enqueue a Change option to be sent on the next available packet, and will accept Confirm options for NN-features only.
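
The resulting policy can be summarised in a few lines. The following is only a minimal sketch of the rule stated above, not the actual implementation; the names feat_type and feat_negotiable_now are hypothetical and were chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification: server-priority vs. non-negotiable. */
enum feat_type { FEAT_SP, FEAT_NN };

/*
 * May a Change be enqueued (and the matching Confirm accepted) for a
 * feature of the given type at this point of the connection?
 */
static bool feat_negotiable_now(enum feat_type type, bool established)
{
	assert(type == FEAT_SP || type == FEAT_NN);
	/* SP features are restricted to the initial handshake;
	 * NN features may be passed at any time. */
	return !established || type == FEAT_NN;
}
```

With this rule, an Ack Ratio update (an NN feature) passes the check on an established connection, whereas a CCID change (an SP feature) does not.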

2. Implementation concept

The solution presented here for the implementation is to map feature negotiation onto connection setup and to disable it for the established connection. The details and justifications are documented elsewhere.

2.1 Tracking state

RFC 4340, 6.6.2 defines three states for each feature. Additional states result from the fact that features may either (a) use default values (6.4) or (b) be initialised with values other than the default (detailed description is here).
	enum dccp_feat_state {
		FEAT_DEFAULT = 0,	/* using default values from 6.4 */
		FEAT_INITIALISING,	/* feature is being initialised */
		FEAT_CHANGING,		/* Change sent but not confirmed yet */
		FEAT_UNSTABLE,		/* local modification in state CHANGING */
		FEAT_STABLE		/* both ends (think they) agree */
	};

2.2  Data structures

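The first issue to tackle is that feature values come in two variants: server-priority (SP) values, which may carry a preference list, and non-negotiable (NN) values, which are plain integers. All currently known SP values are 1-byte quantities. A `unified' data structure avoids having to use separate SP/NN arrays: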
	typedef union {
		struct {
			u8 *vec;	/* SP value plus optional preference list */
			u8 len;
		} sp;
		u64 nn;			/* all known NN values are integers */
	} dccp_feat_val;
The central data structure is a list, sorted in ascending order of feature number.
	struct dccp_feat_entry {
		u8 feat_num;			/* DCCPF_xxx */
		dccp_feat_val val;		/* intended feature value */
		enum dccp_feat_state state:8;	/* INITIALISING | CHANGING | STABLE */
		u8 needs_mandatory:1,		/* whether Mandatory options are needed */
		   needs_confirm:1,		/* whether to send a Confirm instead of a Change */
		   empty_confirm:1,		/* only set when needs_confirm == 1: send empty Confirm */
		   is_local:1;			/* feature-local (1) or feature-remote (0) */

		struct list_head node;
	};
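
To illustrate how such a list can be kept sorted in ascending order of feature number, the following user-space sketch may help. It is an illustration under stated assumptions rather than the kernel code: it uses a simplified singly-linked stand-in (struct feat_entry) instead of struct list_head, and the function name feat_list_insert is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for struct dccp_feat_entry. */
struct feat_entry {
	uint8_t feat_num;	/* feature number */
	uint8_t is_local;	/* feature-local (1) or feature-remote (0) */
	struct feat_entry *next;
};

/*
 * Insert an entry so that the list stays sorted by feature number.
 * Returns the existing entry if (feat_num, is_local) is already present,
 * the new entry otherwise, or NULL if allocation fails.
 */
static struct feat_entry *feat_list_insert(struct feat_entry **head,
					   uint8_t feat_num, uint8_t is_local)
{
	struct feat_entry **pos = head, *e;

	assert(head != NULL);
	/* Skip all entries with a smaller feature number. */
	while (*pos != NULL && (*pos)->feat_num < feat_num)
		pos = &(*pos)->next;
	/* Look for a duplicate among entries with the same number. */
	for (e = *pos; e != NULL && e->feat_num == feat_num; e = e->next)
		if (e->is_local == is_local)
			return e;
	e = malloc(sizeof(*e));
	if (e == NULL)
		return NULL;
	e->feat_num = feat_num;
	e->is_local = is_local;
	e->next = *pos;
	*pos = e;
	return e;
}
```

Keeping the list sorted means that a lookup can stop at the first entry whose feature number exceeds the one searched for.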
The needs_confirm flag is cleared as soon as the Confirm has been sent. The above queue is located in different places:

2.3 General processing

For both client and server, changes to feature values (transition DEFAULT => INITIALISING) are made before the connect()/listen() calls and will populate the dccp_feat_negotiation_queue. This queue is multi-purpose, it is used by the

2.4 Client processing

2.4.1 General concepts
These feature states can occur in each connection state:

  Connection state    Feature states
  CLOSED              DEFAULT or INITIALISING
  REQUEST             DEFAULT or CHANGING
  PARTOPEN            DEFAULT or STABLE
  post PARTOPEN       STABLE

The post-processing of feature states involves:
  1. activating all feature values that have entered the STABLE state and
  2. migrating all features with a DEFAULT state to STABLE state.
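
The two post-processing steps can be sketched as follows. This is a simplified illustration, assuming a flat array of per-feature records; struct feat, its active flag and the function name feat_post_process are hypothetical. The early failure check reflects the rule that negotiation has failed if any feature is still in an intermediate state.

```c
#include <assert.h>
#include <stddef.h>

enum feat_state {
	FEAT_DEFAULT = 0, FEAT_INITIALISING, FEAT_CHANGING,
	FEAT_UNSTABLE, FEAT_STABLE
};

/* Hypothetical per-feature record used only for this sketch. */
struct feat {
	enum feat_state state;
	int active;		/* set once a negotiated value is in use */
};

/* Returns 0 on success, -1 if negotiation has not completed. */
static int feat_post_process(struct feat *f, size_t n)
{
	size_t i;

	assert(f != NULL || n == 0);
	/* Any feature still mid-negotiation means failure. */
	for (i = 0; i < n; i++)
		if (f[i].state != FEAT_DEFAULT && f[i].state != FEAT_STABLE)
			return -1;
	for (i = 0; i < n; i++) {
		if (f[i].state == FEAT_STABLE)
			f[i].active = 1;		/* 1. activate value */
		else
			f[i].state = FEAT_STABLE;	/* 2. DEFAULT => STABLE */
	}
	return 0;
}
```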
2.4.2 Pseudocode
1. For each entry which has registered local changes:
    If the value is different from the default feature value,
        add entry to the queue (state = INITIALISING, needs_confirm = 0)
    If the value is not different from the default but a preference list exists,
        add entry with preference list to the queue (state = STABLE, needs_confirm = 0)

2. When sending the first Request
    for each feature where state == INITIALISING && needs_confirm == 0:
        generate Change option for feature
        set feature state = CHANGING

3. Retransmitting Requests: analogous to (2), but done for all features in state CHANGING

4. Receiving the Response from server /* dccp_request_sent_state_process() */
    (a) Change Processing:
        Change L: find entry in state STABLE or CHANGING with is_local == 0
        Change R: find entry in state STABLE or CHANGING with is_local == 1

        If Change matches an entry in the queue,
            negotiate value, set needs_confirm = 1 (empty_confirm if needed)
            If state == CHANGING
                set state = STABLE
        Else
            create entry in queue, negotiate against default value
            set needs_confirm = 1 (empty_confirm if needed)
            set state = STABLE
    (b) Confirm Processing:
        Confirm L: find entry in state CHANGING with is_local == 0
        Confirm R: find entry in state CHANGING with is_local == 1

        If lists/values agree, reorder accordingly, set state = STABLE
        else: reset connection

5. Post-processing:
    for each entry in the queue
        If at least one entry has state != STABLE
            reset connection /* negotiation failed */
    for each entry in the queue
        Activate value
        If (needs_confirm == 0)
            remove entry
    Send Ack /* Confirm options will be generated for entries with needs_confirm == 1 */
    Purge queue
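
Steps (2) and (3) of the client pseudocode can be illustrated with a small sketch. It only counts the Change options that would be placed on a Request instead of encoding them, and it assumes a flat array as the queue; struct feat_entry and request_emit_changes are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum feat_state {
	FEAT_DEFAULT = 0, FEAT_INITIALISING, FEAT_CHANGING,
	FEAT_UNSTABLE, FEAT_STABLE
};

/* Hypothetical, reduced queue entry for this sketch. */
struct feat_entry {
	enum feat_state state;
	bool needs_confirm;
};

/* Returns the number of Change options generated for this Request. */
static size_t request_emit_changes(struct feat_entry *q, size_t n,
				   bool retransmit)
{
	size_t i, changes = 0;

	assert(q != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!retransmit && q[i].state == FEAT_INITIALISING &&
		    !q[i].needs_confirm) {
			q[i].state = FEAT_CHANGING;	/* first Request */
			changes++;
		} else if (retransmit && q[i].state == FEAT_CHANGING) {
			changes++;			/* repeat the Change */
		}
	}
	return changes;
}
```

A retransmitted Request thus repeats exactly the Change options of the first Request, since the entries remain in state CHANGING until the Response arrives.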

2.5 Server processing

2.5.1 General concepts
The following feature states can occur in each connection state:

  Connection state    Feature states
  LISTEN              DEFAULT or INITIALISING
  RESPOND             CHANGING, DEFAULT, STABLE
  OPEN                DEFAULT or STABLE
  post OPEN           STABLE

The post-processing of feature states is analogous to the client case.
2.5.2 Pseudocode
1. For each entry which has registered local changes: do as in (1) at the client

2. when receiving the Request, parse options: /* dccp_v{4,6}_conn_request() */
    /* Confirm options are ignored, since no feature has state == CHANGING here */
    Change L: find entry in state INITIALISING or STABLE with is_local == 0
    Change R: find entry in state INITIALISING or STABLE with is_local == 1

    If there is no queue entry for the Change option,
        create one in state STABLE
    Else
        set entry's state to STABLE
    set needs_confirm = 1 (empty_confirm if needed)

3. when sending the Response
    for each feature where state == INITIALISING && needs_confirm == 0
        generate Change option
        set state = CHANGING
    for each feature where state == STABLE && needs_confirm == 1
        generate Confirm option
        set needs_confirm = 0

4. when receiving the Ack/DataAck /* dccp_v{4,6}_request_recv_sock */
    /* Change options are ignored to avoid unterminated feature negotiation */
    Confirm L: find entry in state CHANGING with is_local == 0
    Confirm R: find entry in state CHANGING with is_local == 1

    If no matching entry can be found in the queue
        ignore the Confirm
    else
        negotiate Confirm against queue entry
        If negotiation fails according to SP/NN rule
            reset connection
        else
            update entry and set state = STABLE

5. Post-processing:
    if at least one entry in the queue has state != STABLE
        reset connection /* feature negotiation failed */
    create child socket
    activate all negotiated feature values in the child
    purge feature negotiation queue
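
The check of a Confirm against a queue entry in step (4) can be sketched as below. This is an interpretation, not the actual code: it assumes that an NN Confirm must echo the value sent in the Change, and that an SP Confirm is acceptable if the confirmed value appears in the list that was offered, in which case `reordering' is taken to mean moving that value to the front. The names sp_confirm and nn_confirm are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * SP rule (assumed): the confirmed value must be a member of the list
 * we offered. On success it is moved to the front of the list.
 */
static bool sp_confirm(uint8_t *list, size_t len, uint8_t confirmed)
{
	size_t i;

	assert(list != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (list[i] == confirmed) {
			/* Shift the preceding entries one slot back. */
			for (; i > 0; i--)
				list[i] = list[i - 1];
			list[0] = confirmed;
			return true;
		}
	}
	return false;	/* caller resets the connection */
}

/* NN rule (assumed): the Confirm must echo the value sent in the Change. */
static bool nn_confirm(uint64_t sent, uint64_t confirmed)
{
	return sent == confirmed;
}
```

If either function returns false, the caller would reset the connection; otherwise the entry is updated and enters state STABLE.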

3. Handling CCID dependencies

Due to the limited number of messages during connection setup, parsimony is required to avoid having to create extra state and dedicated feature-negotiation messages for the purpose of resolving CCID dependencies.

There are issues for the negotiation and issues for the activation of features.

3.1 Issues for negotiating CCID dependencies

A naive implementation would use a first pair of messages to negotiate the CCID, and another pair of messages to negotiate all such features that depend on the choice of CCID. Two of these four messages can be combined, by piggybacking the Change options for CCID-dependent features onto the message carrying the Confirm option for the CCID. If there had been no incoming Change for the choice of CCID, then the second message would carry both Change options for CCID and CCID-dependent features.

The freedom of choice is further limited by the fact that all negotiable (i.e. non `non-negotiable') features are server-priority features, so that the logical choice of negotiating CCID-dependent features would be:
  1. The client sends a Request with Change options for local/remote CCIDs.
  2. The server negotiates choice of CCIDs with its own preferences. If there is no conflict, the server determines the choice of CCID and adds CCID-dependent features onto the Response.
  3. The client is either ok with these choices and sends corresponding Confirm options on the Ack, or resets the connection.
The next point to consider is when the client in (1) does not send a Change option for at least one of the CCIDs. Then the server would need to assume that the client is using the default CCID (CCID 2). Since this is a singleton value (see also below), the negotiation would only succeed if the server happens to use the default CCID, or has the default CCID value as member in its preference list.

As a consequence, the CCID-dependent features in (2) above would reflect those deriving from the default CCID. If the client in (3) is then not ok with these, the inevitable reset is probably the best thing to do, as this means that the client is not even capable of supporting the default CCID.

Since the choice of the CCID is a server-priority issue and thus cannot be foreseen by the client, it is probably unwise to add CCID-dependent feature-negotiation options onto the initial Request. But there are exceptions to this rule, as the following table, presenting the results of negotiating CCIDs, illustrates.

  Client Pref.    Server Pref.    Result of negotiation
  {2}             {2}             2
  {2}             {3}             FAILURE
  {2}             {2, 3}          2
  {2}             {3, 2}          2
  {3}             {2}             FAILURE
  {3}             {3}             3
  {3}             {2, 3}          3
  {3}             {3, 2}          3
  {2, 3}          {2}             2
  {2, 3}          {3}             3
  {2, 3}          {2, 3}          2
  {2, 3}          {3, 2}          3
  {3, 2}          {2}             2
  {3, 2}          {3}             3
  {3, 2}          {2, 3}          2
  {3, 2}          {3, 2}          3
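
The results in the table follow from the server-priority rule: the outcome is the first entry of the server's preference list that also appears in the client's list. A minimal sketch of this rule is shown below; the function name sp_reconcile and the return convention (-1 for FAILURE) are choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Server-priority reconciliation: return the first value of the
 * server's list that is also in the client's list, or -1 if the
 * lists have no value in common (negotiation FAILURE).
 */
static int sp_reconcile(const uint8_t *server, size_t slen,
			const uint8_t *client, size_t clen)
{
	size_t s, c;

	assert(server != NULL && client != NULL);
	for (s = 0; s < slen; s++)
		for (c = 0; c < clen; c++)
			if (server[s] == client[c])
				return server[s];
	return -1;
}
```

For instance, a client preference of {2, 3} against a server preference of {3, 2} yields 3, while a client singleton {2} against a server singleton {3} fails.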

When the client advertises singleton values for the CCID (as in the above example of using the default value), then the server either complies with this choice or resets the connection. Hence it would be acceptable for the client to add CCID-dependent feature values when it is only advertising a single CCID value.

The situation is different when the client, the server, or both advertise alternatives regarding the choice of CCID. In the table, a preferred CCID value changes whenever the result differs from the first entry of an endpoint's preference list. There are a total of four cases where the preferred CCID value changes at the client; the two cases at the server result from the fact that singleton values from the client dictate the outcome. Since the number of CCIDs is expected to grow in the future, the number of possible combinations (ambivalences) will increase.

Hence we can deduce for the client that it should not advertise any CCID-dependent settings when at the same time it advertises more than one CCID for the associated half-connection.

For the server, we can analogously say that it should initially not make any assumptions about CCID-dependencies arising from the preferred CCID value when its preference list contains more than one entry. But it is safe to say, with regard to step (2) above, that the server should add its CCID-dependent settings when it has negotiated the CCID.

Thus for the server it remains to deal with the case when the client has not communicated Change options for both the local/remote CCID. In this case, for each missing Change option, the server should derive the CCID-dependent settings from the default CCID 2 as per RFC 4340, 6.4. The test as to whether the client had or had not sent CCID options can only be made after all other feature-negotiation options, received on the Request, have been processed. A good place for performing this test, and for computing the dependencies in general, is just before sending the Response message; since this stage is only reached if the connection had not previously been reset due to feature incompatibilities.

3.2 Issues for activating CCID-dependent features

A robust implementation needs to check that, upon completion of feature negotiation, the resulting settings are consistent with regard to the choice of CCID. This is in line with other consistency tests to ensure that no features have been negotiated or implicitly been confirmed that are not supported by the host. Desirable and required tests are:
  1. If CCID2 is enabled at the sender, then the receiver should have Send Ack Vector enabled (mandated in RFC 4341, section 4). Likewise, a RX CCID of 2 means that the host locally should have Send Ack Vector enabled.
  2. For CCID3, sending Ack Vectors is not required, and in the current Linux implementation it is not advisable to enable Ack Vectors in combination with CCID3. Hence a RX CCID of 3 should mean that Send Ack Vector is disabled; likewise a TX CCID3 should mean that the remote peer has its Send Ack Vector disabled.
  3. A similar issue arises with regard to Ack Ratio: it would be good to set it to 0 when using CCID3 (with the meaning of RFC 4340, 11.3 that the sender does not use Ack Ratio).
  4. The CCID3 implementation does not implement the Loss Intervals option from [RFC 4342, 8.6]; it uses the Loss Event Rate option (8.5) instead. The latter should thus be announced using the Send Loss Event Rate feature (8.4). Since this feature is located at the RX side of the half-connection, an RX CCID3 should imply that this feature is on; and a node with a TX CCID3 should make sure to add Change R options for this feature when sending the CCID-dependent feature-negotiation options.