Computers operate using numbers. There therefore needs to be a way to convert letters (and other "characters") to and from numbers, so that they can be stored and manipulated by the computer. A standard set of codes, known as "ASCII" (American Standard Code for Information Interchange), is used. These codes were originally developed for tasks such as sending documents to printers, and many of the commands make sense in this context.
Each letter is assigned a value according to its position within the ASCII table: every letter, number, punctuation mark, etc. (known as a character) is given a unique code. Note that there is a difference between the 8-bit binary representation of the number zero (00000000) and the corresponding ASCII printing character '0' (00110000, i.e. 0x30).
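This distinction can be seen directly in C; the following is a minimal sketch (the variable names are invented for the example):

    #include <stdio.h>

    int main(void)
    {
        char ascii_zero  = '0'; /* the printing character '0', ASCII code 0x30 */
        char number_zero = 0;   /* the number zero, binary 00000000 */

        printf("ASCII '0' has code 0x%02X\n", ascii_zero);
        printf("numeric zero has value 0x%02X\n", number_zero);
        return 0;
    }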
                 MS 3 bits
    LS 4 bits    0    1    2    3    4    5    6    7
        0       NUL  DLE  SP   0    @    P    `    p
        1       SOH  DC1  !    1    A    Q    a    q
        2       STX  DC2  "    2    B    R    b    r
        3       ETX  DC3  #    3    C    S    c    s
        4       EOT  DC4  $    4    D    T    d    t
        5       ENQ  NAK  %    5    E    U    e    u
        6       ACK  SYN  &    6    F    V    f    v
        7       BEL  ETB  '    7    G    W    g    w
        8       BS   CAN  (    8    H    X    h    x
        9       HT   EM   )    9    I    Y    i    y
        A       LF   SUB  *    :    J    Z    j    z
        B       VT   ESC  +    ;    K    [    k    {
        C       FF   FS   ,    <    L    \    l    |
        D       CR   GS   -    =    M    ]    m    }
        E       SO   RS   .    >    N    ^    n    ~
        F       SI   US   /    ?    O    _    o    DEL
Conversion of numbers
As an example of the use of ASCII, consider the problem of printing the result of a numerical calculation on a screen or printer.
Suppose the number to be printed is (in binary) 01101100;
The first step is to convert this into decimal;
The answer is 108;
Each digit may be represented by the BCD codes of 0001, 0000, and 1000
(The Hex values of course are 0x1, 0x0, and 0x8)
Each of these digits needs to be converted to its own ASCII character code;
They are (in Hex) 0x31, 0x30 and 0x38:
In binary, 00110001, 00110000 and 00111000.
These are the codes which are sent to the printer.
The printer will have been preprogrammed to recognise and print these codes as "108".
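The steps above can be sketched in C. This is a minimal illustration, not the only way of doing it (library routines such as sprintf() perform the same job); the variable names are invented for the example:

    #include <stdio.h>

    int main(void)
    {
        unsigned char value = 0x6C; /* 01101100 binary = 108 decimal */
        char digits[4];             /* three ASCII digits plus a terminator */

        /* extract each decimal digit, then add 0x30 ('0') to form its ASCII code */
        digits[0] = '0' + (value / 100);        /* 1 -> 0x31 */
        digits[1] = '0' + ((value / 10) % 10);  /* 0 -> 0x30 */
        digits[2] = '0' + (value % 10);         /* 8 -> 0x38 */
        digits[3] = '\0';

        printf("%s\n", digits);     /* the printer/screen receives "108" */
        return 0;
    }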
If text is being stored in a computer, it is usually stored as a string (a series of ASCII characters, each of which is stored as one byte). Formatting characters such as space, carriage return and line feed may be included in the string.
Some method is needed for indicating the length of the string, or where the end of the string is. There are two main methods:
1. Store a count of the number of characters along with the string itself.
2. Mark the end of the string with a special terminating character, conventionally the NUL character (code 0x00).
The programmer must of course know the convention being used. There is nothing to distinguish bits which mean numbers from bits which mean letters and characters; you have to know what the bits are supposed to mean before you can do anything with them.
The second way is the one most commonly used in C programs (see also the "pig" example). Note that if you are using a more sophisticated method of storing text (say, with a word processing program) where you also want to store details such as the font or the size of the characters, you need other information as well; but the actual text itself will still usually be stored as ASCII characters.
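The two conventions can be sketched in C as follows (illustrative only; the array names are invented for this example, and the count-byte form is the style used by languages such as Pascal):

    #include <stdio.h>

    int main(void)
    {
        /* Method 1: a count of the characters is stored first */
        unsigned char counted[] = { 3, 'p', 'i', 'g' };

        /* Method 2: the characters are followed by the NUL character (0x00),
           the convention used by C itself; these are the same bytes as "pig" */
        char terminated[] = { 'p', 'i', 'g', '\0' };

        printf("counted:    %.*s\n", counted[0], (char *)(counted + 1));
        printf("terminated: %s\n", terminated);
        return 0;
    }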
The actual input to a computer program is usually a set of strings; a high-level language like C not only has many functions for handling strings like this (e.g. strcat(), strcpy(), strlen()), but the C compiler itself uses such functions when it reads in a program, which is presented to it as a series of ASCII characters. Some microprocessors and computer chips have special instructions to handle strings of characters efficiently.
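For example, a minimal sketch using the standard C string library:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buffer[32];

        strcpy(buffer, "Hello, ");  /* copy a string into the buffer */
        strcat(buffer, "world");    /* append a second string */
        printf("\"%s\" has length %zu\n", buffer, strlen(buffer));
        return 0;
    }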
Author: Gorry Fairhurst (Email: G.Fairhurst@eng.abdn.ac.uk)