Implementing shutdown(2)

This memo summarises notes on the implementation of shutdown() for DCCP. The shutdown() system call originated from TCP applications. Since DCCP connections can, like TCP, have data flowing in either direction, it makes sense to reconsider the use of this function for the benefit of DCCP.

1. Background and motivation

Multimedia streaming can in principle be considered as a one-directional stream of data packets sent by the streaming source to the consumer of the media packets. DCCP, however, is a very generic protocol and allows data traffic to flow in both directions of the same connection.

For many applications this is unnecessary overhead: by shutting down one direction of the full-duplex connection, better performance can be achieved, since the processing costs per packet are reduced. This implies better responsiveness, scalability and use of computing resources.

1.1 DCCP half-connections

A DCCP connection splits into two separate half-connections, each possibly having a different congestion control ID (CCID). This is illustrated in the following schema.

The situation is more complicated than with TCP, which has only read()/write() for each end. The functions used for the active end of a half-connection are:
On the receiving end of the half-connection we have an active and a passive function:
The point of using shutdown() in this context is to reduce processing complexity. Without shutdown(), each received packet is processed twice: by the TX CCID via tx_packet_recv and by the RX CCID via rx_packet_recv.

1.2 The shutdown() function

This function was originally developed for TCP's full-duplex service; it allows data transfer to be shut down in one or both directions.
One subtlety distinguishes DCCP from TCP: DCCP has no half-closed state [RFC 4340, 4.6]. Hence the semantics of shutdown() are not exactly identical, but quite similar. Making this similarity precise is the purpose of this memo.

A classical example of the shutdown function can be found in Stevens' volume 1, section 18.5:

2. Using shutdown() for DCCP

In DCCP the same example as above is not possible since the signalling (sending FIN) is missing. The original meaning of FIN in RFC 793 was "No more data from sender". But the lack of such signalling to the peer is no disadvantage, since shutdown can still be used locally to reduce processing costs.

Furthermore, as described in section 11.7 of RFC 4340, Data Dropped options can be used to signal that data packets have not reached the application. This option is sent on packets which carry an acknowledgement number (hence it cannot be used on Request or pure Data packets). A dropped packet is indicated by a set high-order bit, and the reason for dropping it is encoded in the 3-bit drop code that follows. The relevant one in this context is Drop Code 1, "Application not listening"; it is described in section 11.7.2 of RFC 4340.
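
The bit layout described above can be sketched in C as follows. The helper names are hypothetical and not taken from any existing implementation. The 4-bit run-length field in the low-order bits is taken from RFC 4340, section 11.7, which should be consulted for its exact interpretation.

```c
#include <assert.h>
#include <stdint.h>

/* Drop Code 1, "Application not listening" (RFC 4340, section 11.7.2). */
#define DROP_CODE_APP_NOT_LISTENING 1u

/*
 * Hypothetical helper: build one drop block of a Data Dropped option.
 * Layout: bit 7 = 1 (drop), bits 6..4 = drop code, bits 3..0 = run length.
 */
uint8_t drop_block(unsigned int drop_code, unsigned int run_length)
{
    assert(drop_code <= 7u);    /* must fit in 3 bits */
    assert(run_length <= 15u);  /* must fit in 4 bits */
    return (uint8_t)(0x80u | (drop_code << 4) | run_length);
}

/* Non-zero if the high-order bit marks this block as a drop block. */
int is_drop_block(uint8_t block)
{
    return (block & 0x80u) != 0;
}

/* Extract the 3-bit drop code; only meaningful for drop blocks. */
unsigned int block_drop_code(uint8_t block)
{
    return (block >> 4) & 0x7u;
}
```

With these helpers, drop_block(DROP_CODE_APP_NOT_LISTENING, 0) yields the byte 0x90.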

2.1 Basic concept

The basic use of shutdown() which suggests itself in this context is:
Lastly, shutdown(SHUT_RDWR) exists but is a bit pointless. To maintain compatibility with TCP (see e.g. tcp_poll() in net/ipv4/tcp.c), it should nevertheless be supported.
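
As a minimal sketch of how an application might select its direction, the helper below wraps shutdown(2). The role names and the helper itself are hypothetical. The mapping (a pure sender shuts down reading, a pure receiver shuts down writing) is an assumption drawn from the streaming scenario in section 1, and whether a DCCP socket reduces its processing accordingly depends on the kernel implementing the semantics discussed in this memo.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical roles for an endpoint that uses only one direction. */
enum stream_role { ROLE_SENDER_ONLY, ROLE_RECEIVER_ONLY };

/*
 * Shut down the direction this endpoint does not use.
 * Returns 0 on success, -1 on error with errno set, as shutdown(2) does.
 */
int make_unidirectional(int fd, enum stream_role role)
{
    int how;

    assert(role == ROLE_SENDER_ONLY || role == ROLE_RECEIVER_ONLY);

    /* Assumed mapping: a sender never reads, a receiver never writes. */
    how = (role == ROLE_SENDER_ONLY) ? SHUT_RD : SHUT_WR;

    return shutdown(fd, how);
}
```

A streaming source would call make_unidirectional(fd, ROLE_SENDER_ONLY) once the connection is established, and the consumer would call it with ROLE_RECEIVER_ONLY.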

2.2 Subtleties

The reading side (RX CCID) is straightforward: if SHUT_RD is set, no further input will be accepted, hence no packets are delivered to the RX CCID. Should data packets still arrive after the local end of the half-connection has issued shutdown(SHUT_RD), an Ack with a Data Dropped option carrying Drop Code 1, "Application not listening", is sent.
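
The receive-side rule can be modelled as a small decision function. This is only a sketch: struct rx_state and rx_data_packet() are hypothetical names, the counters stand in for the real rx_packet_recv call and for building the Ack, and nothing here reflects actual kernel data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Drop Code 1, "Application not listening" (RFC 4340, section 11.7.2). */
#define DROP_CODE_APP_NOT_LISTENING 1

/* Hypothetical per-socket state; only what this rule needs is modelled. */
struct rx_state {
    bool shut_rd;             /* set once shutdown(SHUT_RD) was called */
    unsigned long delivered;  /* packets handed to the RX CCID */
    unsigned long refused;    /* packets answered with Data Dropped */
};

/*
 * Handle one incoming data packet. Returns the drop code to place on the
 * answering Ack, or -1 if the packet was passed on to the RX CCID.
 */
int rx_data_packet(struct rx_state *rx)
{
    assert(rx != NULL);

    if (rx->shut_rd) {
        rx->refused++;
        return DROP_CODE_APP_NOT_LISTENING;
    }
    rx->delivered++;  /* stands in for the rx_packet_recv call */
    return -1;
}
```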

The writing side (TX CCID) needs a bit more sophistication. The situation is clear when the TX queue is empty at the time shutdown(SHUT_WR) is called. In this case, all further attempts to write to the socket will be caught by the socket API and lead to a write error, as intended. Furthermore, since no more packets are going to be sent, congestion control is no longer needed, and we can close the input leading to tx_packet_recv.

The situation is different if the TX ring buffer still contains packets at the time shutdown is called. In this case, we must keep the input for tx_packet_recv open, since closing it would cut off the feedback traffic that congestion control needs for the packets still pending. The proposed solution is to set SHUT_WR on the socket so that no more packets enter the TX queue, but to defer closing the feedback input (tx_packet_recv) until the last packet has left the TX queue.
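
The deferred close can be illustrated with a small state model. This is a sketch under stated assumptions: struct tx_state and all function names are hypothetical, the queue is reduced to a packet counter, and "feedback_open" stands for whether received packets are still passed to tx_packet_recv.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the TX side of a socket. */
struct tx_state {
    unsigned int queued;  /* packets still in the TX queue */
    bool shut_wr;         /* shutdown(SHUT_WR) has been called */
    bool feedback_open;   /* tx_packet_recv still receives feedback */
};

void tx_init(struct tx_state *tx)
{
    tx->queued = 0;
    tx->shut_wr = false;
    tx->feedback_open = true;
}

/* Close the feedback input only once nothing is left to send. */
static void tx_maybe_close_feedback(struct tx_state *tx)
{
    if (tx->shut_wr && tx->queued == 0)
        tx->feedback_open = false;
}

/* Application write: refused after SHUT_WR, as the socket API would do. */
bool tx_enqueue(struct tx_state *tx)
{
    if (tx->shut_wr)
        return false;
    tx->queued++;
    return true;
}

void tx_shutdown_wr(struct tx_state *tx)
{
    tx->shut_wr = true;
    tx_maybe_close_feedback(tx);  /* takes effect at once if the queue is empty */
}

/* One packet leaves the TX queue. */
void tx_packet_sent(struct tx_state *tx)
{
    assert(tx->queued > 0);
    tx->queued--;
    tx_maybe_close_feedback(tx);  /* deferred close after the last packet */
}
```

Both cases from the text use the same check: with an empty queue the feedback input closes inside tx_shutdown_wr(), otherwise it closes in tx_packet_sent() when the counter reaches zero.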

3. Further work

The use of Data Dropped options is not currently supported in Linux DCCP, but work is under way to support it. Until then, the signalling proposed above should be marked with a FIXME.