Wednesday, November 19, 2008
Wednesday Math, Vol. 48: Error correcting codes
Today's story revolves around Richard Hamming, a pioneer in computer science who was one of the many young geniuses who worked on the Manhattan Project. After the war, he went to work at Bell Labs and stayed there for several decades, until he left to become the head of the computer science department at the Naval Postgraduate School in Monterey, CA in 1976.
While I am considered an O.G.P. (Original Game Programmer), Hamming was an honest to Lenny O.P. (Original Programmer). Programming has always been a tricky business, since you can make typographical errors, which the computer will catch unless you misspell a word and the misspelling is another legal word, or logical errors, which will tell the computer to do something legal, but not the thing you wanted it to do. (To kids who want to program, here's a little tip: "less than" is NOT "less than or equal to". Make that mistake once in a few thousand lines of code and your life can be misery for weeks.)
Hamming had a problem for most of his career I only had to deal with as a college student.
The problem with punch cards is something most 21st Century people can readily understand through bitter experience. Chads. Hanging chads, pregnant chads, all kinds of damn chads that should have fallen away but didn't, and so the machine that reads them misunderstands you when you didn't really make an error. For example, the letter 'H' turns into the seven bit pattern 100 1000, where 1 is a hole punched in the card and 0 means no hole punched. If one of the chads doesn't fall, the punch card reader will think you typed '@', which reads 100 0000, or possibly the symbol for back space, which is 000 1000.
(Note: while punch cards are a thing of the past, garbled data transmission of ones and zeros is not. Hamming's methods are still in use in some applications.)
Hamming's idea was this. Add extra bits, sometimes called the checksum bits, but only have so many legal patterns, and the legal patterns spaced far enough apart that if there is one mistake at one position, the reader of the code will see that a mistake was made and an illegal code was sent, but every illegal code is exactly one mistake away from a legal code, and the reader will correct the error.
The method uses binary matrix multiplication and null spaces of linear transformations and a lot of other stuff I won't go into detail about, but it is simple enough that it is taught in the first course of linear algebra that I am currently teaching at Berkeley City College. My readers will have to take my word when I say it was a very clever and elegant solution to a very serious problem in early computer science and the general method still has applications today. Hamming even came up with methods that could correct more than a single mistake per code word. His method was robust and could spot when the mistake was an 1 turned into a 0 or vice versa, though in the original application there was no need to worry about a 0 turning into a 1 by mistake, which would be a hole accidentally punched when the typist didn't intend it.
Nice thinking, Dr. Hamming!