The magazine of the Melbourne PC User Group

Printable Characters: How Many?
Major Keary
 


From time to time one sees references to "the standard visible characters" or "printable characters". The terms refer to the keyboard characters that produce something visible on the screen; they exclude characters produced by multi-key combinations such as alt + keypad nnn in Microsoft Windows applications.

Real ASCII is seven-bit, which produces 2^7 (128) code positions. However, in computerspeak we begin counting at 0 (zero), so the range is 0-127 (00-7Fh). The thirty-two ASCII code positions between decimal 00 (00h) and 31 (1Fh) are reserved for control characters (the mysterious ACK, BS, and so on) and 127 (7Fh) is conventionally 'blank' in ISO character sets. In seven-bit ASCII (ISO 646) it is shown as DEL, but that is not the delete key - DEL is one of the original telegraphic control codes. It is common for position 7Fh to be used for a character in TrueType and Adobe Type 1 fonts.
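
For readers who like to check such figures, here is a minimal Python sketch of the layout just described. The names TOTAL, control and remaining are illustrative only; the code simply follows the ranges given above.

```python
# Seven bits give 2**7 = 128 code positions, numbered 0-127 (00h-7Fh).
TOTAL = 2 ** 7

# Control characters: positions 0-31 (00h-1Fh), plus DEL at 127 (7Fh).
control = list(range(0x00, 0x20)) + [0x7F]

# Whatever is left over once the control codes are set aside.
remaining = [code for code in range(TOTAL) if code not in control]

print(TOTAL)                                   # 128
print(len(control))                            # 33
print(len(remaining))                          # 95
print(hex(remaining[0]), hex(remaining[-1]))   # 0x20 0x7e
```

The positions left over run from 20h (space) to 7Eh (~).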

It is important to distinguish between real ASCII and eight-bit ASCII, which extends the available code positions to 256. The characters with decimal values between 128 (80h) and 255 (FFh) (remember, we are starting the count at zero) require special key strokes - in Windows alt + keypad nnn. Seven-bit ASCII was originally specified in ANSI X3.4 and was soon after adopted in ISO 646-1973. Eight-bit 'ASCII' was a later development, but that's another story.

The printable characters are those used in what is called 'plain text' or ASCII, and (though not necessarily all of them) in writing program source code. There are ninety-five printable, or visible, characters: a-z, A-Z, 0-9, the thirty-two symbols !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ and the space.
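
The tally can be checked with a short Python sketch. It relies on the constants in Python's standard string module, whose punctuation constant happens to hold the same thirty-two symbols listed above; the dictionary name counts is illustrative only.

```python
import string

# One entry per group of printable characters named in the text.
counts = {
    "a-z": len(string.ascii_lowercase),
    "A-Z": len(string.ascii_uppercase),
    "0-9": len(string.digits),
    "symbols": len(string.punctuation),
    "space": len(" "),
}

total = sum(counts.values())

for group, number in counts.items():
    print(group, number)
print("total", total)   # total 95
```

Leave out the "space" entry and the total drops to ninety-four.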

That adds up to ninety-five characters, but the figure is sometimes given as ninety-three. There are two reasons why some people arrive at the lower figure. One is that they forget the space character; it might not be visible in the sense that it has no glyph, but its presence can be observed. It is an important character in programming and without the space plain text would be very difficult to read.

The other reason is one of arithmetic: a simple subtraction, 126-33, produces the erroneous ninety-three (code position 126 (7Eh) is where the characters stop in real ASCII). The code positions are numbered from zero, but in real-world arithmetic we start counting at 1. If we adjust the range to fit everyday arithmetic it would be 1-128, and the printable characters would lie between 33 (space) and 127 (~). The subtraction should be 127-32=95. Put another way, a range that includes both of its end points holds last minus first plus one items; with the zero-based positions that is 126-32+1=95. Not sure about that? You can always type out the keyboard characters and count them.
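
The same arithmetic, written out as a small Python sketch. The variable names are hypothetical and the figures are those used in the paragraph above.

```python
FIRST = 32    # space, counting code positions from zero (20h)
LAST = 126    # tilde, counting code positions from zero (7Eh)

# The erroneous subtraction: it starts after the space and
# forgets that both end points belong to the range.
naive = LAST - 33

# Counting both end points of the zero-based range.
inclusive = LAST - FIRST + 1

# The everyday version: positions renumbered 1-128, so the
# printable characters occupy 33 to 127 and 127 - 32 counts them.
everyday = (LAST + 1) - FIRST

# Counting the positions one by one gives the same answer.
counted = len(range(FIRST, LAST + 1))

print(naive, inclusive, everyday, counted)   # 93 95 95 95
```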

The space can cause confusion in code examples, especially if one falls at a line break. One way to overcome that is to use a special symbol to represent the space. In his books on TeX, Donald Knuth uses the square cup to represent a space (Unicode U+2294, and keyed to t in the Lucida Math Extension character set).
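
As an illustration of the idea, the Python sketch below swaps each space for a visible marker before a line of code is displayed. The function name show_spaces is hypothetical, and the default marker, U+2423 (OPEN BOX), is an assumption chosen for the example; the square cup or any other visible symbol can be passed instead.

```python
VISIBLE_SPACE = "\u2423"   # OPEN BOX; an illustrative choice of marker

def show_spaces(text: str, marker: str = VISIBLE_SPACE) -> str:
    """Return text with every ASCII space replaced by a visible marker."""
    return text.replace(" ", marker)

# A trailing space, easily lost at a line break, becomes visible.
print(show_spaces("x = 1 "))

# The same line using the square cup (U+2294) as the marker.
print(show_spaces("x = 1 ", "\u2294"))
```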

One of the features of the 'standard' printable characters is that at least thirty-seven national variants of ISO 646 exist. Some countries have more than one standard: the U.K. and Switzerland each have three. The variations occur in the twelve symbols #$@[\]^`{|}~ with the most common substitutions being £ for #, ¤ for $, and § for @.

It may all seem a bit esoteric, useful for trivial pursuit and little else. However, for a number of purposes, such as programming and writing scripts (in JavaScript, for example), standard plain text is essential.

For those who want to count the characters, here they are:
 
space!"#$%&'()*+,-./0123456789:;<=>?@
ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

Reprinted from the July 2003 issue of PC Update, the magazine of Melbourne PC User Group, Australia
