Encodings Before ASCII From a Mechanical Perspective

April 25, 2014 at 5:27 pm · Filed under Technical

While various flag and signal based encoding systems have existed since pre-history (signal fires, smoke signals), true digital encodings were not needed until attempts to communicate via electrical signals in the form of the electromagnetic telegraph. The early telegraph systems in the 1830s and 1840s were extremely limited in terms of electric components, with basically only batteries and electromagnets available. In another running theme throughout technology, there was also a desire to make the systems usable without the operators having to learn complex codes, and this placed further constraints on the systems. With these components, three basic systems were invented.

Dial/Letter-Printing Telegraphs

The most complex and user-friendly systems were the dial telegraph, the self-rotating telegraph, and the letter-printing telegraph, which all used essentially the same underlying mechanism. In these systems, a wheel carrying each of the letters/symbols to transmit rotated on the sending side, switching the signal between 0 and 1 with each letter that passed. This signal was used on the receiving side to advance an identical wheel, thus keeping the two synchronized. The operator on the sending side would momentarily stop the wheel at the letter they wanted to send. The signal stopped alternating while the wheel was held, and thus the receiving wheel also stopped rotating. An operator could read the letter off the dial, or in the case of the letter-printing system, the cessation of motion would trigger the printing mechanism to stamp the letter on a piece of paper. Details on how this kind of device worked are given here in a rundown of the mechanisms in the House printing telegraph from the 1840s.

Note that this is not really a character encoding as we would think of one today, since there is no fixed representation of any letter. For example, if we wanted to encode D as the first letter of the transmission, we would send “1010000000”, since we need to rotate the dial around from the starting position. However, if the D came after a B, then we would only need to send “10000000”, since we start at the B. In the example linked above, there are 28 letters on the wheel, and so we need to send on average 14 bits per character, plus we need to keep the signal paused while the letter is printed. Although this system is highly machine-compatible, it is thus not very efficient. It also suffers from robustness issues. Any noise on the line that results in a bit being lost puts the sender and receiver out of sync, and the rest of the message becomes scrambled. This is not a problem for the other two systems.
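The relative nature of this encoding is easy to see in a small simulation. The following is only a sketch under stated assumptions: the wheel layout (a rest position, the 26 letters, and a full stop, for 28 positions), the six-bit hold time, and the function name `dial_encode` are all invented for illustration and are not taken from the House device.

```python
import string

# Hypothetical 28-position wheel: a rest position, 26 letters, and a full stop.
WHEEL = ["_"] + list(string.ascii_uppercase) + ["."]

def dial_encode(message, hold=6):
    """Return the line signal for a dial telegraph as a string of 0s and 1s."""
    position = 0   # both wheels start at the rest position
    level = 0      # current line level; it toggles once per wheel step
    out = []
    for ch in message:
        target = WHEEL.index(ch)
        steps = (target - position) % len(WHEEL)
        if steps == 0:
            steps = len(WHEEL)   # assumption: a repeated letter needs a full turn
        for _ in range(steps):
            level ^= 1
            out.append(str(level))
        # The wheel is held at the letter, so the line stops alternating.
        out.extend(str(level) * hold)
        position = target
    return "".join(out)

print(dial_encode("D"))        # D as the first letter
print(dial_encode("BD")[8:])   # the same D when it follows a B
```

With these assumptions the first call reproduces the “1010000000” from the example above, and the second reproduces the shorter “10000000”. The same letter costs a different number of bits depending on where the wheel already is.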

Needle-Pointing Telegraphs

The next type of system was the Wheatstone needle-pointing system. This was based on having several needles which could be tilted left or right. The signals for each needle were transmitted in parallel. Thus a 2-needle system requires 2 lines for transmission. Represented in more modern terms, we could say that it was a ternary system allowing three values to be transmitted: negative voltage, no connection, and positive voltage (giving logic values of -1, 0, and 1). The original telegraph patented by Wheatstone employed 5 needles, but actually required 6 signal wires for communication, and used a circuit-like signaling system unlike anything we would use for communications today. (I’ve run through the details on how the device worked here.) Although we would expect a 6-wire ternary system to give 3^6 = 729 different code combinations (or 728 different non-null combinations), the arrangement employed by Wheatstone was only capable of 30 different code combinations. Unlike the letter-printing telegraph above, this system was not particularly automation-compatible, and could not be attached to a printing apparatus. The need for multiple wires to send a message was also inefficient, especially on low volume transmission paths, and eventually the system devolved into a single needle system where the two codes of left and right were treated as dots and dashes in Morse code. This gives us Morse code but with slightly different signaling.
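The gap between 729 and 30 can be checked with a few lines of arithmetic. The sketch below assumes, as one plausible reading of the “circuit-like” signaling, that every signal drives exactly one wire positive and a different wire negative so that the current has a return path. That model is my assumption, not a description taken from the patent.

```python
from itertools import permutations

WIRES = 6

# Every wire independently at -1, 0, or +1.
ternary_states = 3 ** WIRES

# Assumed model: one wire carries current out, a different wire carries it back.
circuits = list(permutations(range(WIRES), 2))

print(ternary_states)       # all ternary combinations, including the null one
print(ternary_states - 1)   # non-null combinations
print(len(circuits))        # ordered (out, return) wire pairs
```

Under that assumption the count of usable signals is 6 × 5 = 30, which agrees with the figure quoted above.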

Morse Telegraph

Finally, we have the most well-known system, Morse code. Interestingly, the physical device design of the system originally proposed by Morse was much more complicated than the code key/buzzing register that finally emerged. (This patent shows the details of the mechanical encoder originally envisioned by Morse.) As most people know, the code consists of dots and dashes. The code is extremely flexible in terms of usage without advanced technology. All you need is a single line alternating between 0 and 1, a buzzer, and a code key. With that, it can be used with just about any means of communication. However, Morse code is not particularly machine-compatible. Although we can easily represent the code in binary by treating a dot as “1”, a dash as “111”, and the space between dots and dashes as “0” (which gives approximately the correct timing of hand-coded Morse code), there was no easy way to mechanically take a Morse code signal off a wire and convert it into binary code, especially in the 1800s. One of the salient features of Morse code is that it is a variable-length encoding, varying in length from the letter E (dot = “1”) to the number 0 (dash-dash-dash-dash-dash = “1110111011101110111”). More frequently used letters have shorter codes, making it very efficient in terms of line usage, much better than the letter-printing telegraph above.
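The binary representation described above is simple to write down. This is a minimal sketch covering only a handful of International Morse symbols; the table `MORSE` and the function name `morse_to_bits` are illustrative choices.

```python
# A small excerpt of International Morse code, enough to show the length spread.
MORSE = {
    "E": ".",
    "T": "-",
    "A": ".-",
    "S": "...",
    "O": "---",
    "0": "-----",
}

def morse_to_bits(symbol):
    """Dot -> 1, dash -> 111, with a single 0 between the elements of a symbol."""
    marks = {".": "1", "-": "111"}
    return "0".join(marks[m] for m in MORSE[symbol])

for symbol in MORSE:
    bits = morse_to_bits(symbol)
    print(symbol, bits, len(bits))
```

E comes out as a single bit while the digit 0 needs 19, which is exactly the spread that makes the code efficient on the wire and awkward for a machine that has to decide where one character ends.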

And the Winner Is…?

You might think that from among the various different types of telegraphic encodings, the one that was technologically superior would win out and become the de facto standard. However, life is more complicated than that. Initially, dial type telegraphs dominated in Europe where the dial-type telegraph was patented by the German Siemens. In England where Wheatstone first patented his needle telegraphs, needle telegraphs ruled. In America, Morse was the first person to patent a telegraph, and Morse code ruled. However, Morse was not the only game in town. Morse code was efficient but required trained operators, making it ideal for telegraph companies. The letter-printing telegraph, however, was much more user friendly, and was often used in private telegraph networks, such as for communication between bank branches where line congestion was not an issue. Ultimately, though, Morse code was the most efficient encoding for the equipment of the time, and was used extensively throughout the world even into the 1900s.

Multiplexing and Automation – Drivers For New Encodings

Not unlike the Internet of today, use of the telegraphic networks grew rapidly and congestion became an issue. The expensive solution to congestion is to simply build more lines. The inventor solution to congestion, however, is to multiplex a single line (that is, to allow multiple transmissions to run simultaneously down a single wire). A lot of different schemes were proposed through the 1850s, 1860s, and 1870s, and these were predominantly time-division systems. Frequency-division systems came later, and are part of the story of the invention of the telephone (and so I’m not going to discuss them here).

Time-Division Multiplexing

Now, time-division multiplexing is quite easy to understand. Let’s say you have a wire you want to share among 4 pairs of people for telephone communication. What you do is attach time-synchronized multiplexers on the ends of the wire. First, the first pair of people are connected, then after some fixed time the first pair are disconnected and the second pair are connected, and so on. Each pair thus gets to use the line for a quarter of the time. Now, a key factor here is the frequency of the switching. Let’s say you assign each pair 1 second. That means that their line is repeatedly connected for 1 second and disconnected for 3 seconds. It’s going to take a lot of effort for the people to use the line. In fact it’s going to be a nightmare. For a computer, however, this is not a problem. This kind of synchronous time-division multiplexing is thus suitable for digital communication but not for analog communication.
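The round-robin sharing described above can be sketched in a few lines. This is an idealized model that assumes perfectly synchronized multiplexers and one symbol per timeslot; the function names are made up for the example.

```python
def multiplex(channels):
    """Interleave equal-length channels onto one line, one symbol per timeslot."""
    return [symbol for frame in zip(*channels) for symbol in frame]

def demultiplex(line, n_channels):
    """Give every n-th timeslot back to the channel that owns it."""
    return [line[i::n_channels] for i in range(n_channels)]

channels = ["AAAA", "BBBB", "CCCC", "DDDD"]   # four pairs sharing one wire
line = multiplex(channels)
print("".join(line))
print(["".join(ch) for ch in demultiplex(line, 4)])
```

Each channel sees the wire for a quarter of the time, and the receiving end only has to know the slot order to put the four streams back together.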

Asynchronous Time-Division Multiplexing

Now, let’s speed up the multiplexers and assign each pair 0.000025 s. Now you’re free to talk without regard to the multiplexing rate. This is great for analog communication, and is how telephones work. This is also great for our telegraphic encodings, where the timing is basically determined by the analog operator. Here is an example of an interesting quadruplex system of this kind that used tuning forks to synchronize the multiplexers on either side of the transmission line. (The multiplexer ran at 72 Hz, sufficient to multiplex human-keyed Morse code, which maxes out at around 12 Hz.) The problem, of course, is that you need to have some kind of buffer or latch circuit to hold the previous value of a virtual circuit while the other circuits are using the line. Without this, the system isn’t going to work very well. Worse still, the signaling used by Morse code is not compatible with the most basic buffer arrangement, and so the early attempts at this kind of multiplexing were not particularly successful.
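The role of the latch is easiest to see by comparing a receiver with and without one. The sketch below is a toy model with two channels and one sample per timeslot; the `receive` function and its parameters are hypothetical and do not model any particular historical apparatus.

```python
def receive(line, n_channels, channel, latch=True):
    """Reconstruct one channel's output at every tick of the shared line."""
    out, held = [], 0
    for tick, level in enumerate(line):
        if tick % n_channels == channel:
            held = level        # our timeslot: take the level from the line
            out.append(level)
        else:
            # Someone else's timeslot: repeat the held level, or drop to 0.
            out.append(held if latch else 0)
    return out

# Channel 0 holds its key down (steady 1); channel 1 is idle (steady 0).
line = [1, 0, 1, 0, 1, 0]
print(receive(line, 2, 0, latch=True))    # a steady key-down, as sent
print(receive(line, 2, 0, latch=False))   # chatters at the multiplexing rate
```

Without the latch, the receiver on channel 0 sees its long key-down chopped up by the other channel's timeslots, which is the failure described above.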

Synchronous Time-Division Multiplexing

Where things get interesting, however, is synchronous multiplexing systems. Here is an example of a system for multiplexing 4 Morse code circuits onto a single wire. This system attempts to send one single character per multiplexing timeslot. It employs a keyboard which has a binary representation of the Morse code alphabet, and requires the operator to press the keys in synchronization with the multiplexing speed. It’s interesting because it is so close to being a digital communication system but doesn’t quite make it. While the transmitter sent out the binary representation of the Morse characters, they were received in an analog manner. Here, we can see the problem with Morse. The binary representation of Morse required 15 bits for the longest letter, and thus 15 bits were sent on each multiplexer cycle. This was not only wasteful of bandwidth, but it also meant that automation of the receiver would require decoding of 15-bit long binary numbers, something far beyond the capabilities of the largely mechanical devices of the time.

Baudot

The big breakthrough into the digital age was made by the Frenchman Baudot with a device he patented in 1882. It is an extremely interesting device, and I have detailed how it worked here. Baudot is most famous for the code that his device used, the Baudot code. This is a 5-bit fixed length binary encoding. Like the device in the previous section, Baudot employed a synchronous multiplexing system. However, because the code was only 5 bits long, it was within the realm of being decoded by a purely mechanical system, and the system he used was quite ingenious. Once the Baudot device had demonstrated the utility of fixed length binary encodings, there was no going back.

Control Codes 1 – Shift States

Baudot’s original code was limited to 32 characters. However, this was soon near-doubled by the addition of shift codes – codes that switched the character set. They were called shift codes because they literally shifted the printing mechanism between two parallel type wheels. Looking at Baudot’s original device, we can see that he already had all the mechanisms necessary for compare-and-actuate style operation. Using two special codes to shift the type wheel was a fairly trivial addition. With 2 codes allocated to shifting in each 32 code set, the resulting encoding offered 60 printable character codes.
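A shifted 5-bit code is a tiny state machine, which the following sketch illustrates. The code values and the two character tables here are hypothetical placeholders chosen for readability; they are not Baudot's actual assignments.

```python
import string

# Hypothetical shift code values, NOT the real Baudot assignments.
LTRS, FIGS = 31, 30

# Hypothetical character tables: the same code means different things per shift.
LETTERS = dict(enumerate(string.ascii_uppercase))   # codes 0..25
FIGURES = dict(enumerate("0123456789.,:;?!()-/"))   # codes 0..19

def decode(codes):
    """Decode 5-bit codes, switching tables whenever a shift code arrives."""
    table, out = LETTERS, []
    for code in codes:
        if code == LTRS:
            table = LETTERS
        elif code == FIGS:
            table = FIGURES
        else:
            out.append(table.get(code, "?"))   # "?" marks an unassigned code
    return "".join(out)

print(decode([0, 1, FIGS, 0, 1, LTRS, 2]))
print(2 * (2 ** 5 - 2))   # printable codes across both shifts
```

The decoder has to remember which shift it is in, so the meaning of a code depends on everything that came before it. The arithmetic on the last line is the 60 printable codes mentioned above.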

Control Codes 2 – Page Printing

Despite the level of automation achieved by Baudot, his device still printed a single long strip of paper. The next development was to produce a more typewriter style of output. This was achieved by Murray, who created his own 5-bit code which also added the carriage return (CR) and line feed (LF) control codes. Now operators could type on something that looked like a typewriter, and get printed output the same as would be produced by a typewriter. This was the teletype. The Baudot device had shown the need for punched tape instead of having operators working synchronized with the machine, and so Murray also added the DEL character – which indicated a character that had been erased by punching out all of the holes in the paper tape.

Control Code Chaos

After this point, quite a number of different 5-bit encodings were used by different manufacturers with different equipment. A large part of the difference was the different control codes. These included things such as BEL (ring the bell on the receiver), WRU (who are you – for querying the remote device), as well as codes that would activate or deactivate the motor on the receiver, etc. depending on the application.

7-bit ASCII

Eventually, the 1960s arrived and it was time for a new standard. More modern electronics had removed the technological barriers that had kept 5-bit codes in use. Plus, computers had arrived and a shifted encoding like Baudot or Murray was not particularly computer-friendly. ASCII was thus intended to provide a standardized unified encoding for both telegraphy and computers. Seven bits means 128 codes. A large block of 32 codes was assigned for control codes to accommodate the various control codes sought by the different stakeholders. The special values 0 and 127 were assigned NUL and DEL for compatibility with punched tape. On top of this, international language support was added through the use of the BS (backspace) control character and specially crafted punctuation marks. (For example, an o with an umlaut (ö) could be printed by printing o, then BS, then the double quotation mark (").) ASCII was thus primarily a telegraphy code which could also be used for computing.
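These properties of ASCII can be checked directly, since Python's `chr` and `ord` agree with ASCII over the 0 to 127 range. The variable names below are just for illustration.

```python
NUL, BS, DEL = 0x00, 0x08, 0x7F

total_codes = 2 ** 7
control_block = list(range(32))   # the 32-code block reserved for control codes

# On 7-track tape, NUL is blank tape and DEL is every hole punched,
# so any mispunched character can be overpunched into DEL.
print(format(NUL, "07b"), format(DEL, "07b"))

# Overstriking on a printing terminal: o, backspace, then the double quote.
overstrike = "o" + chr(BS) + '"'
print(overstrike.encode("ascii"))
print(total_codes, len(control_block))
```

The overstrike sequence is three separate ASCII codes on the wire; it only looks like an accented letter once a printer has struck both characters in the same position.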

ASCII’s competitor was EBCDIC, which was developed by IBM around the same time. EBCDIC was only designed for computers, and was not a telegraphic encoding. Given that EBCDIC was specialized for computers where ASCII was more for telegraphy, you might expect EBCDIC to win this battle. However, the telegraphy market was huge whereas the computer market was tiny. Printers and teletypewriters that talked ASCII were everywhere. On top of this, ASCII was a standard developed through consultation with many manufacturers, whereas EBCDIC was IBM’s proprietary encoding designed to solve their specific needs. As we all know, ASCII won. (If you are interested in all of the gritty detail of the 5-bit encodings and the development of 7-bit ASCII, there is a detailed explanation here.) There are also several books on the topic:

1 Comment

F2 said,

April 28, 2014 @ 11:53 am

“There are also several books on the topic:”

The column suggested your post was truncated before what sounds like an interesting bibliography?

Thanks for the great writeup! It was a fun Sunday read.

Best -F
