The magazine of the Melbourne PC User Group
Printable Characters: How Many?
Major Keary
From time to time one sees references to "the standard visible characters", or
"printable characters". The terms refer to the keyboard characters that produce
something visible on the screen; they exclude the characters produced by
multi-key combinations such as alt + keypad nnn in Microsoft Windows
applications.
Real ASCII is seven-bit, which produces 2^7 (128) code positions. However, in
computerspeak we begin counting at 0 (zero), so the range is 0-127 (00-7Fh). The
thirty-two ASCII code positions between decimal 00 (00h) and 31 (1Fh) are
reserved for control characters (the mysterious ACK, BS, and so on) and 127
(7Fh) is conventionally 'blank' in ISO character sets. In seven-bit ASCII (ISO
646) it is shown as DEL, but that is not the delete key - DEL is one of the
original telegraphic control codes. It is common for position 7Fh to be used for
a character in TrueType and Adobe Type 1 fonts.
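
The layout just described can be sketched in a few lines of Python. This is only an illustration; the names TOTAL, CONTROL, DEL and classify are made up for the example and are not part of any standard.

```python
# Seven bits give 2**7 code positions, numbered from zero.
TOTAL = 2 ** 7               # 128 positions: 0-127 (00-7Fh)
CONTROL = range(0x00, 0x20)  # 0-31: control characters (ACK, BS, and so on)
DEL = 0x7F                   # 127: shown as DEL in seven-bit ASCII


def classify(code):
    """Return a rough label for a seven-bit code position (illustrative only)."""
    if not 0 <= code < TOTAL:
        raise ValueError("not a seven-bit code position")
    if code in CONTROL:
        return "control"
    if code == DEL:
        return "DEL"
    return "printable"


print(TOTAL, len(CONTROL))   # 128 32
print(classify(0x06))        # control (ACK)
print(classify(ord("A")))    # printable
print(classify(0x7F))        # DEL
```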
It is important to distinguish between real ASCII and eight-bit ASCII, which
extends the available code positions to 256. The characters with decimal values
between 128 (80h) and 255 (FFh) (remember, we are starting the count at zero)
require special key strokes - in Windows alt + keypad nnn. Seven-bit ASCII was
originally specified in ANSI X3.4 and was soon after adopted in ISO 646-1973.
Eight-bit 'ASCII' was a later development, but that's another story.
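
As a rough illustration of the distinction, the following Python sketch tests whether a run of bytes stays within the seven-bit range. The function name is_seven_bit and the sample data are assumptions made for the example.

```python
SEVEN_BIT = 2 ** 7   # 128 positions, 0-127 (00-7Fh)
EIGHT_BIT = 2 ** 8   # 256 positions, 0-255 (00-FFh)


def is_seven_bit(data: bytes) -> bool:
    """True if every byte is a real (seven-bit) ASCII code position."""
    return all(b < SEVEN_BIT for b in data)


plain = b"plain text"
extended = bytes([0x63, 0x61, 0x66, 0xE9])  # last byte, E9h (233), is above 7Fh

print(EIGHT_BIT - SEVEN_BIT)   # 128 extra positions, 80h-FFh
print(is_seven_bit(plain))     # True
print(is_seven_bit(extended))  # False
```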
The printable characters are used in what is called 'plain text' or ASCII and
are used (but not necessarily all of them) in writing program source code. There
are ninety-five printable, or visible, characters: a-z, A-Z, 0-9, and the
thirty-two symbols: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ and space.
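
For readers who would rather let the computer do the counting, here is a small Python sketch that uses the standard string module. The variable names are chosen for the example only.

```python
import string

letters = string.ascii_lowercase + string.ascii_uppercase  # a-z, A-Z
digits = string.digits                                      # 0-9
symbols = string.punctuation                                # the thirty-two symbols
printable = letters + digits + symbols + " "                # plus the space

print(len(letters), len(digits), len(symbols))  # 52 10 32
print(len(printable))                           # 95
```

Note that Python's own string.printable is a different list: it also contains tab, newline and other whitespace control characters, so its length is 100 rather than 95.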
That adds up to ninety-five characters, but the figure is sometimes given as
ninety-three. There are two reasons why some people arrive at the lower figure.
One is that they forget the space character; it might not be visible in the
sense that it has no glyph, but its presence can be observed. It is an important
character in programming and without the space plain text would be very
difficult to read.
The other reason is one of arithmetic; a simple subtraction: 126-33 produces the
erroneous ninety-three (code position 126 (7Eh) is where the characters stop in
real ASCII). That sum leaves out the space at position 32 and also forgets that
both ends of a range have to be counted: the printable characters occupy code
positions 32 (space) to 126 (~) inclusive, which is 126-32+1=95. Put another
way, the code positions are counted from zero, but in real-world arithmetic we
start counting at 1. If we adjust the range to fit every-day arithmetic it would
be 1-128, and the printable characters would be the 33rd (space) to the 127th
(~). The subtraction should be 127-32=95. Not sure about that? You can always
type out the keyboard characters and count them.
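
The arithmetic can be checked directly. The sketch below is illustrative only; FIRST and LAST are names invented for the example.

```python
FIRST = 32   # space (20h)
LAST = 126   # ~ (7Eh)

print(LAST - 33)                    # 93: the erroneous figure
print(LAST - FIRST + 1)             # 95: both ends of the range are counted
print(len(range(FIRST, LAST + 1)))  # 95: the same count, done by enumeration
print(127 - 32)                     # 95: the one-based version of the sum
```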
The space can cause confusion in code examples, especially if one falls at a
line break. One method used to overcome that is the use of a special symbol to
represent the space. In his books on TeX, Donald Knuth uses the square cup to
represent a space (Unicode U+2294, keyed to t in the Lucida Math Extension
character set).
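
The same idea is easy to apply when preparing code examples for print. The following Python sketch swaps each space for a visible marker; the function name show_spaces is hypothetical, and the marker is simply the code point mentioned above (U+2423, OPEN BOX, is another symbol sometimes used for the purpose).

```python
SQUARE_CUP = "\u2294"  # the code point named in the text


def show_spaces(code_line: str, marker: str = SQUARE_CUP) -> str:
    """Replace every space with a visible marker so none can be overlooked."""
    return code_line.replace(" ", marker)


# The trailing space, easily lost at a line break, becomes visible.
print(show_spaces("if x == 1: "))
```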
One of the features of the 'standard' printable characters is that at least
thirty-seven national variants of ISO 646 exist. Some countries have more than
one standard: the U.K. and Switzerland each have three. The variations occur in
the twelve symbols #$@[\]^`{|}~ with the most common substitutions being
£ for #, ¤ for $, and § for @.
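
To see what that means in practice, the Python sketch below shows how the same text might look on a system using a variant character set, and which characters in a line of code are at risk. The table VARIANT is hypothetical: it lumps together the three common substitutions mentioned above and does not represent any one national standard.

```python
# Hypothetical variant: the three common substitutions, combined for illustration.
VARIANT = str.maketrans({"#": "£", "$": "¤", "@": "§"})

# The twelve code positions that differ between national variants.
VARIABLE_POSITIONS = "#$@[\\]^`{|}~"


def render_in_variant(text: str) -> str:
    """Show how text would appear under the hypothetical variant."""
    return text.translate(VARIANT)


def at_risk(text: str) -> list:
    """List the characters in text that sit in a variable code position."""
    return sorted(set(text) & set(VARIABLE_POSITIONS))


print(len(VARIABLE_POSITIONS))                      # 12
print(render_in_variant("cost: $5 #1 me@example"))  # cost: ¤5 £1 me§example
print(at_risk("a[i] = {x | y}"))                    # ['[', ']', '{', '|', '}']
```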
It may all seem a bit esoteric, useful for trivial pursuit and little else.
However, for a number of purposes, such as programming and writing scripts
(JavaScript, for example), standard plain text is essential.
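
A common way for non-standard characters to creep into a script is a word processor replacing straight quotes with curly ones. The Python sketch below reports every character outside the printable range 32-126; the function name non_plain_characters and the sample script are assumptions made for the example.

```python
def non_plain_characters(source: str) -> list:
    """Return (line, column, character) for each character outside 32-126.

    Lines are split on the newline character only, so tabs and carriage
    returns are reported as well.
    """
    problems = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if not 32 <= ord(ch) <= 126:
                problems.append((line_no, col_no, ch))
    return problems


# First line uses curly quotes (U+201C, U+201D); second line uses plain ones.
script = 'alert(\u201chello\u201d);\nalert("hello");'
print(non_plain_characters(script))  # [(1, 7, '“'), (1, 13, '”')]
```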
For those who want to count the characters, here they are:
space!"#$%&'()*+,-./0123456789:;<=>?@
ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
Reprinted from the July 2003 issue of PC Update, the magazine of Melbourne PC User Group, Australia