This document identifies a problem with passive-close in DCCP.
1. Background

DCCP has no half-closed states, i.e. it has no analogue of TCP's CLOSE_WAIT (passive half-close) and FIN_WAIT_1 (active half-close) states; cf. RFC 4340, 4.6. The receiver of a CloseReq or Close packet is asked to subsequently close its end of the connection and to acknowledge connection termination by sending a Close or Reset packet, respectively (RFC 4340, 8.3).

Before sending such confirmation, the receiver of a connection-termination request needs a chance to process yet-unread data in its receive queue. Otherwise, immediately following through with a connection-termination request has the same effect as an abortive release of the connection: unread data is discarded, leading to unexpected API behaviour.
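The acknowledgement rules above can be summarised in a minimal sketch (an illustrative model of RFC 4340, 8.3, not kernel code):

```python
# Which packet confirms which termination request (RFC 4340, 8.3).
# A CloseReq is answered with a Close; a Close is answered with a Reset.
TERMINATION_REPLY = {
    "CloseReq": "Close",  # passive side of a server-initiated close
    "Close":    "Reset",  # final confirmation of the teardown
}

def confirm(packet_type):
    """Return the packet that acknowledges a termination request."""
    return TERMINATION_REPLY[packet_type]
```

The document's point is that this confirmation must not be sent before the receive queue has been drained.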
2. How to reproduce the problem

It was observed in the Linux implementation that immediately replying with a Close to a CloseReq has the undesirable consequence of removing all unread data whenever the Reset answering the Close arrived too early: data was sent to the receiver (and could be captured on the wire), but the receiver never got a chance to read it. The client then terminated with an error message.

A test program to reproduce this behaviour can be found here.

The problem appears in particular when server and client are started on the same host (loopback); but it can also be reproduced by repeatedly connecting the client to the same server on another host.

A detailed analysis of why this problem occurs can be found here.
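The race can be shown in a toy model (illustrative names, not kernel code): if the node tears down connection state as soon as the Reset arrives, data still queued is lost before the application reads it.

```python
class Endpoint:
    """Toy model of the buggy endpoint described above."""
    def __init__(self):
        self.state = "OPEN"
        self.rx_queue = []          # unread application data

    def deliver_data(self, chunk):
        self.rx_queue.append(chunk)

    def on_close_req(self):
        # Buggy behaviour: reply with Close immediately ...
        self.state = "CLOSING"
        return "Close"

    def on_reset(self):
        # ... so an early Reset wipes the queue before the app reads it.
        self.rx_queue.clear()
        self.state = "CLOSED"

client = Endpoint()
client.deliver_data(b"last records")   # data visible on the wire
client.on_close_req()                  # server asks client to close
client.on_reset()                      # Reset arrives before any read()
assert client.rx_queue == []           # unread data has been discarded
```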
3. Using intermediate states to distinguish active-close and passive-close

The host needs to be able to treat active-close and passive-close separately. This can be achieved in a variety of ways: status flags can be set on active-close, a host can test whether it received a CloseReq/Close while there is still unread data in the queue, etc. The following schema describes these possibilities in terms of two auxiliary states:
- PASSIVE_CLOSE, which is entered when a node (client or server) receives a Close packet;
- PASSIVE_CLOSEREQ, which is entered when a client encounters a CloseReq packet.

The client can also enter PASSIVE_CLOSE/PASSIVE_CLOSEREQ from PARTOPEN; for simplicity this is not shown in the diagram below. Note that these states are an implementation device: the macroscopic/normative behaviour still conforms to RFC 4340, 8.4.
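The two entry transitions can be sketched as follows (state names are taken from this document; the function is illustrative, not an API):

```python
def passive_close_state(current_state, packet):
    """Map an incoming termination request in OPEN to an auxiliary state."""
    if current_state == "OPEN" and packet == "Close":
        return "PASSIVE_CLOSE"      # either client or server
    if current_state == "OPEN" and packet == "CloseReq":
        return "PASSIVE_CLOSEREQ"   # clients only: server asked us to close
    return current_state            # all other cases unchanged here
```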
3.1 Passive close states

The PASSIVE_CLOSE and PASSIVE_CLOSEREQ states represent the two ways a passive-close can happen in DCCP. These states are required by the implementation to allow a receiver to process unread data before acknowledging the received connection-termination request (i.e. the Close/CloseReq).

To see why, suppose a node receives a Close and immediately proceeds to CLOSED state after sending the Reset packet. The application has no time to process data which may still sit unprocessed in its receive queue: when the socket transitions to CLOSED state, the input queue is erased and connection state removed. The application then exits with an error message.

A similar case arises when a client receives a CloseReq and immediately transitions to CLOSING state after sending the required Close packet: the server will respond with a Reset, after which the client proceeds to TIMEWAIT. Here again the client has no chance to process outstanding data in its receive queue; if the Reset arrives fast enough, the receive queue will be wiped out before the application (which may be suspended) has a chance to read the data.
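The remedy can be sketched as follows: in PASSIVE_CLOSE the confirmation (Reset) is held back until the application has drained the receive queue. This is an illustrative model; the kernel implements it with socket states, not a class.

```python
class Receiver:
    """Toy receiver that parks in PASSIVE_CLOSE until the queue is drained."""
    def __init__(self, data):
        self.state = "OPEN"
        self.rx_queue = list(data)
        self.sent = []

    def on_close(self):
        # Do not confirm yet -- park the socket in PASSIVE_CLOSE.
        self.state = "PASSIVE_CLOSE"
        self._maybe_confirm()

    def read(self):
        chunk = self.rx_queue.pop(0)
        self._maybe_confirm()
        return chunk

    def _maybe_confirm(self):
        # Only once the queue is empty is the Reset sent and state torn down.
        if self.state == "PASSIVE_CLOSE" and not self.rx_queue:
            self.sent.append("Reset")
            self.state = "CLOSED"

r = Receiver([b"a", b"b"])
r.on_close()
assert r.state == "PASSIVE_CLOSE" and r.sent == []   # data still unread
r.read(); r.read()
assert r.sent == ["Reset"] and r.state == "CLOSED"   # confirmed after drain
```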
3.2 Active close states

The active close states are CLOSEREQ and CLOSING when entered from the OPEN state. Not shown above are the retransmissions of CloseReq and Close packets in these states (required as per RFC 4340, 8.3).

A distinction is made in the diagram with regard to active server-close: the server can decide to hold TIMEWAIT state by sending a Close instead of a CloseReq. In the implementation, these cases can be distinguished via a setsockopt option.
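The server-side choice can be sketched in one line: holding TIMEWAIT means sending a Close; otherwise a CloseReq is sent and the client ends up holding TIMEWAIT. The flag name below is illustrative; the actual setsockopt option name is implementation-specific.

```python
def server_active_close(server_holds_timewait):
    """Pick the termination packet for a server performing active-close."""
    return "Close" if server_holds_timewait else "CloseReq"
```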
3.3 Reducing the number of states

It is tempting to exploit the server-timewait socket flag in order to merge the CLOSEREQ and CLOSING states (as is currently done), because, with some tricks, the host could distinguish these four cases:
- client sends active-Close (from OPEN);
- client had received passive-CloseReq (from PASSIVE_CLOSEREQ);
- server sends active-Close (from OPEN, timewait-flag set);
- server sends active-CloseReq (from OPEN, timewait-flag not set).

However, the state transitions become messier: from CLOSING one would need two different branches plus several if-statements to determine the preceding state. Therefore, it is cleaner to implement separate CLOSEREQ and CLOSING states.
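A sketch of why the merged state gets messy: with a single merged state, classifying the four cases needs a chain of flag checks (flag and function names below are hypothetical, for illustration only).

```python
def classify_merged_closing(is_server, prev_state, timewait_flag):
    """Disambiguate the four cases a merged CLOSING state would conflate."""
    if prev_state == "PASSIVE_CLOSEREQ":
        return "client answered a CloseReq"     # passive-close
    if not is_server:
        return "client sent active-Close"
    if timewait_flag:
        return "server sent active-Close"       # server holds TIMEWAIT
    return "server sent active-CloseReq"        # client holds TIMEWAIT
```

With separate CLOSEREQ and CLOSING states, none of these checks is needed: the state itself carries the information.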
4. Simultaneous-close

There are two ways a simultaneous-close can occur:
- two Closes crossing paths; or
- a Close crossing paths with a CloseReq.
4.1 Both ends performing active-close

Receiving a Close after just having sent one can happen when both client and server perform an active-close. In this case, each side sends a Close packet and enters CLOSING state. We then have a deadlock condition: the state will only be exited on receiving the final Reset, but that Reset will never be received, since each end continues to retransmit its Close. This leads to a ping-pong of Close packets being retransmitted in either direction. The only way to end this is the final timeout, which takes longer than 64 seconds (RFC 4340, 8.3) to happen.
The message exchange is furthermore futile: by performing an active-close, each end indicates that it is done with the data transfer and is only waiting for the other end to acknowledge that the termination phase is over.

To avoid futile and long-lasting retransmissions of Close packets it is useful to employ a tie-breaker: e.g. let a client that receives a Close packet in CLOSING state send a Reset packet in response. This allows the server to progress to TIMEWAIT and breaks the deadlock condition of waiting for a Reset that will never arrive. The successful use of a tie-breaker is shown in the packet trace below.

In this example the server listens on port 3456, to which the client connects from its ephemeral port 49404. Visible is the simultaneous-close condition: both ends send a Close (packets 6/7). The client acts as tie-breaker and sends the terminating Reset, allowing either end to terminate.
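The tie-breaker can be sketched as a single rule (illustrative model, not kernel code): a client in CLOSING that receives a Close answers with a Reset and may tear down, since per RFC 4340 it is the endpoint receiving the Reset (here the server) that holds TIMEWAIT.

```python
def on_close_in_closing(is_client):
    """Reaction of a node in CLOSING state to an incoming (crossing) Close.

    Returns (packet_to_send, next_state).
    """
    if is_client:
        # Tie-breaker: send the terminating Reset; the server will move to
        # TIMEWAIT on receiving it, so the client can tear down.
        return ("Reset", "CLOSED")
    # The server keeps retransmitting and waits for the client's Reset.
    return (None, "CLOSING")
```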
4.2 Close crossing paths with a CloseReq

When a Close from a client crosses paths with a CloseReq from the server, the client has already entered CLOSING state at the time it receives the CloseReq. RFC 4340, 8.3 here requires that the client send another Close: "The receiver of a valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet."

Thus, when this second Close arrives at the server, the server may already have torn down its connection state and entered CLOSED (LISTEN) state. As a consequence, the server will reply with a Reset packet, Code 3 ("No Connection", cf. RFC 4340, 5.6). The receipt of this second Reset is ignored at the client: per RFC 4340, 8.5, nodes in TIMEWAIT state ignore all incoming Reset packets.
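The endgame of this crossing can be sketched with two rules (illustrative model only; function names are not an API):

```python
def server_in_closed(packet):
    """A server without connection state answers any packet with a Reset.

    Returns (packet_type, reset_code); Code 3 is "No Connection"
    (RFC 4340, 5.6).
    """
    return ("Reset", 3)

def client_in_timewait(packet):
    """Nodes in TIMEWAIT ignore all incoming Resets (RFC 4340, 8.5)."""
    return None if packet == "Reset" else packet
```

Applying both rules to the second Close shows why the exchange terminates cleanly: the stray Reset, Code 3 is simply dropped by the client.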