A checksum is a simple error-detection scheme in which each transmitted message is accompanied by a numerical value computed from the values of the bytes in the message. The sender places the calculated value in the message (usually in the message header) and sends it with the message. The receiver applies the same formula to each received message and checks that the result matches the accompanying value. If not, the receiver can assume that the message has been corrupted in transmission.
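The simplest form of checksum, which adds up the bytes in the data to form a sum value, cannot detect a number of types of errors. In particular, such a checksum value is not changed by reordering the bytes in the message, by inserting or deleting zero-valued bytes, or by multiple errors that increase and decrease the sum by equal amounts.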
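These blind spots can be illustrated with a minimal Python sketch. The function name `simple_sum_checksum` and the choice to keep only the low 8 bits of the sum are assumptions made for illustration; they are not taken from any particular protocol.

```python
def simple_sum_checksum(data: bytes) -> int:
    """Add up all byte values, keeping only the low 8 bits (illustrative choice)."""
    return sum(data) % 256


original = b"\x01\x02\x03\x04"
reordered = b"\x04\x03\x02\x01"          # same bytes in a different order
zero_inserted = b"\x01\x00\x02\x03\x04"  # one extra zero-valued byte
offsetting = b"\x02\x02\x02\x04"         # first byte +1, third byte -1

for corrupted in (reordered, zero_inserted, offsetting):
    # Each corrupted variant produces the same sum as the original,
    # so the corruption goes undetected.
    print(simple_sum_checksum(corrupted) == simple_sum_checksum(original))
```

All three comparisons print `True`, because each change leaves the total of the byte values untouched.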
More sophisticated types of redundancy check, including cyclic redundancy checks (CRCs), are typically used at the link layer. They are designed to address these weaknesses by considering not only the value of each byte but also the order of the values. The price of detecting more types of error is the increased cost of computing the check value. Packet corruption is not caused only by errors introduced at the physical layer. It is also occasionally caused by bugs in host and router hardware and software. Even if every link implemented strong error detection in the form of frame CRCs, it would still be essential for the receiving end host to verify end-to-end checksums at and above the IP level.
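The sensitivity of a CRC to byte order can be sketched in Python using `zlib.crc32` from the standard library. CRC-32 is used here only as a convenient, readily available CRC; the helper `simple_sum` and the sample bytes are illustrative assumptions.

```python
import zlib


def simple_sum(data: bytes) -> int:
    """Order-insensitive byte sum, truncated to 16 bits (illustrative)."""
    return sum(data) & 0xFFFF


original = b"\x01\x02\x03\x04"
reordered = original[::-1]  # the same four bytes, reversed

# The plain sum cannot tell the two messages apart...
print(simple_sum(original) == simple_sum(reordered))
# ...whereas the CRC depends on the position of each byte.
print(zlib.crc32(original) == zlib.crc32(reordered))
```

The first comparison prints `True` and the second prints `False`: two different messages of 32 bits or fewer can never share a CRC-32 value.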
The Internet Protocol (IP) and most higher-layer protocols of the Internet Protocol Suite (ICMP, IGMP, UDP, UDP-Lite, TCP) use a common checksum algorithm to validate the integrity of the packets that they exchange. The IP (IPv4) header checksum protects only the IPv4 header, while the TCP, DCCP, ICMP, IGMP, and UDP checksums provide end-to-end error detection for both the transport header (including network and transport layer information) and the transport payload data. Protection of the data is optional for applications using UDP [RFC768] for IPv4, but is required for IPv6.
When used above the IP level (e.g. in the UDP, TCP, and DCCP transport protocols), the checksum algorithm covers both the data bytes in the protocol data unit and some additional bytes, known as a pseudo-header, built from information present in the IP network layer header.
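As a rough illustration, the IPv4 pseudo-header used by UDP (source address, destination address, a zero octet, the protocol number, and the UDP length, as laid out in RFC 768) could be assembled as follows. The function name `ipv4_pseudo_header` and the example addresses are assumptions; the IPv6 pseudo-header has a different layout and is not covered by this sketch.

```python
import socket
import struct


def ipv4_pseudo_header(src_ip: str, dst_ip: str, protocol: int, length: int) -> bytes:
    """Build the 12-octet IPv4 pseudo-header that is prepended for checksumming.

    `length` is the length of the transport header plus payload, in octets.
    """
    return struct.pack(
        "!4s4sBBH",                # network byte order: two addresses, zero, protocol, length
        socket.inet_aton(src_ip),  # 32-bit source address
        socket.inet_aton(dst_ip),  # 32-bit destination address
        0,                         # zero octet
        protocol,                  # e.g. 17 for UDP
        length,                    # 16-bit transport length
    )


# Hypothetical example: a 12-octet UDP datagram between two documentation addresses.
pseudo = ipv4_pseudo_header("192.0.2.1", "192.0.2.2", socket.IPPROTO_UDP, 12)
print(pseudo.hex())  # c0000201c00002020011000c
```

The pseudo-header is never transmitted; both ends reconstruct it from the IP header, so a datagram delivered to the wrong address fails the checksum test.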
The 16-bit checksum field in the header is zeroed prior to checksum calculation. The calculation is made 16 bits (i.e. 2 octets) at a time: the one's complement sum of the 16-bit words is formed, and the one's complement of that sum is placed in the checksum field. If the datagram contains an odd number of octets, a zero octet is virtually added at the end, so that 16-bit maths can be used throughout. For UDP, if the computed checksum is 0, it is transmitted as all ones (the equivalent in one's complement arithmetic).
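The procedure can be sketched in Python as shown here. This is a minimal illustration rather than a tuned implementation; the function name `internet_checksum` is an assumption, and the sample octets are the worked example from RFC 1071.

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # virtual zero octet for odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        # End-around carry: add any overflow from bit 15 back into the low bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)
print(hex(checksum))  # 0x220d

# Receiver side: summing the data together with the transmitted checksum
# yields all ones, so the complemented result is zero for an intact message.
print(internet_checksum(sample + checksum.to_bytes(2, "big")))  # 0
```

A UDP sender would additionally replace a computed value of 0 with 0xFFFF before transmission, for example with `checksum or 0xFFFF`.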
To ease implementation, the addition operations in the checksum can be performed using 32-bit maths (see below for an example for UDP processing). This is a natural size of word in many modern processors. However, most processors only provide a 32-bit add instruction, and do not provide an instruction that independently adds two 16-bit quantities contained in one 32-bit word. This therefore requires the algorithm to be modified slightly to take into consideration the carry that may result at bit 15.
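One common way to handle the carry, sketched here in Python, is to accumulate the 16-bit words in a 32-bit register and fold the upper half back into the lower half at the end. The function name `checksum_32bit_accumulator` is an assumption, and the `& 0xFFFFFFFF` mask only mimics a 32-bit register, since Python integers are unbounded. The sketch assumes inputs of at most 65535 octets, for which the 32-bit accumulator cannot overflow.

```python
def checksum_32bit_accumulator(data: bytes) -> int:
    """Internet checksum using a (simulated) 32-bit accumulator with deferred carries."""
    if len(data) % 2:
        data += b"\x00"  # virtual zero octet for odd-length input
    acc = 0
    for i in range(0, len(data), 2):
        word = (data[i] << 8) | data[i + 1]
        # Carries out of bit 15 collect in the upper 16 bits of the accumulator.
        acc = (acc + word) & 0xFFFFFFFF
    # Fold the deferred carries back into the low 16 bits until none remain.
    while acc >> 16:
        acc = (acc & 0xFFFF) + (acc >> 16)
    return ~acc & 0xFFFF


print(hex(checksum_32bit_accumulator(bytes.fromhex("0001f203f4f5f6f7"))))  # 0x220d
```

Deferring the fold keeps the inner loop to a single addition per word, which is the main attraction of using the wider register.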
Standards Documents:
Postel, J., User Datagram Protocol, RFC 768
Postel, J., Internet Protocol, RFC 791
Postel, J., Transmission Control Protocol, RFC 793
Braden, R., Borman, D., and C. Partridge, Computing the Internet Checksum, RFC 1071