After several weeks of applying a few promising image processing algorithms, and variations on them, it became apparent that none of them, alone or in combination, would solve this problem. Luminance-based calculations, edge detection, line detection, and noise filtering, applied in many different ways, produced decent results, but nowhere near what a readable image requires.
After realizing that most, if not all, image processing algorithms were not going to solve this problem sufficiently, I began to think about applying color theory instead. Even though the luminance-based calculations were insufficient, at this point they were the best we had. With this in mind, I spent a great deal of time examining scanned images, looking for trends common to the background pixels and to the ink pixels.
I observed a trend in the background pixels that has some basis in color theory. The types of paper we considered, white and yellow, and the light red and light blue lines printed on them, had high values in either the red and green components or the green and blue components. Although at first this seems strange, it coincides with color theory quite well. Luminance values are based on the sensitivity of the "average" human eye, and the weights in the luminance equation show that the eye is most sensitive to green, then red, and least of all blue. For quick reference, the CCIR 601 luminance equation is:
LUMINANCE = 0.299 * RED + 0.587 * GREEN + 0.114 * BLUE
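As a minimal sketch of the equation above, assuming 8-bit color components in the range 0 to 255 (the function name `luminance` is only illustrative and is not taken from our code), the weights can be applied like this:

```python
def luminance(red, green, blue):
    """CCIR 601 luminance of one pixel from its red, green, and blue components."""
    return 0.299 * red + 0.587 * green + 0.114 * blue


# Pure primaries at full intensity: green contributes the most, blue the least.
for name, rgb in [("red", (255, 0, 0)), ("green", (0, 255, 0)), ("blue", (0, 0, 255))]:
    print(name, round(luminance(*rgb), 1))
```

Because the three weights sum to 1, a pure white pixel keeps its full value of 255.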
Similarly, the most common paper colors (white and yellow) and ink colors (blue, black and red) correspond to those colors that are most easily distinguishable by the human eye. That is, the colors on the paper are designed to be easily distinguishable from the colors of ink. Probably the only reasonable exception to this is blue ink, although a normal blue ink pen will probably still work well with our solution.
With this in mind, it is understandable that red, black and blue are the most popular ink colors, and white and yellow are the most popular paper colors. Yellow paper has high red and green pixel values, and white paper has high red, green, and blue pixel values. In contrast, red ink has a high red value, but lower green and blue pixel values. Blue ink has a high blue value and lower red and green pixel values. Similarly, black ink has low red, green, and blue pixel values.
Simple as these observations are, they lead to a simple algorithm that removes nearly all background pixels (an estimated average of more than 99%). The following pseudo-code summarizes the above ideas and is the major algorithm that solved this problem:
for (every pixel in the image)
    if (((current pixel's red color component > 200) and
         (current pixel's green color component > 200)) or
        ((current pixel's green color component > 200) and
         (current pixel's blue color component > 200)))
        set the current pixel to background (white)
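The pseudo-code above can be sketched in Python as follows. This is only an illustration, not our actual implementation: it assumes the image is held as a list of rows of (red, green, blue) tuples with 8-bit components, the names `is_background` and `remove_background` are hypothetical, and the sample pixel values are made up to resemble the paper and ink colors described earlier.

```python
WHITE = (255, 255, 255)
THRESHOLD = 200  # cutoff used in the pseudo-code


def is_background(pixel, threshold=THRESHOLD):
    """True when red and green, or green and blue, are both above the threshold."""
    red, green, blue = pixel
    return (red > threshold and green > threshold) or (
        green > threshold and blue > threshold
    )


def remove_background(image, threshold=THRESHOLD):
    """Return a copy of the image with every background pixel set to white."""
    return [
        [WHITE if is_background(pixel, threshold) else pixel for pixel in row]
        for row in image
    ]


# Made-up sample pixels: yellow paper, a light blue ruled line, red ink, black ink.
sample = [[(250, 240, 120), (150, 220, 240), (200, 40, 50), (20, 20, 20)]]
print(remove_background(sample))
```

In this sample the yellow paper and the light blue line both pass the test and become white, while the red and black ink pixels are left untouched.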
Now that we had a good way to remove the background from our scanned images, we had to resize each image to fit an 8 1/2 by 11 inch format. After the background pixels were removed, the images needed to be reduced to half their size. Our input images were 1275 by 2100 pixel tif files scanned at 150 DPI, and our output images were 72 DPI gif files at roughly half those dimensions. We accomplished this resizing by applying the following B-spline filter at every other input pixel row and column, which halves the image:
| 1 | 2 | 1 |
| 2 | 4 | 2 |
| 1 | 2 | 1 |
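A minimal sketch of halving with the filter above is given below. It makes several assumptions that are not stated in our write-up: the weights are normalized by their sum (16) so that overall brightness is preserved, coordinates that fall off the edge are clamped to the nearest valid pixel, and the filter is applied to one color channel at a time, stored as a list of rows of integers. The function name `halve` is hypothetical.

```python
KERNEL = [
    [1, 2, 1],
    [2, 4, 2],
    [1, 2, 1],
]
KERNEL_SUM = 16  # sum of all weights, used to normalize the result


def halve(channel):
    """Filter one color channel at every other row and column, halving its size."""
    height, width = len(channel), len(channel[0])
    output = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    # Clamp neighbor coordinates at the image border (an assumption).
                    sy = min(max(y + ky - 1, 0), height - 1)
                    sx = min(max(x + kx - 1, 0), width - 1)
                    total += KERNEL[ky][kx] * channel[sy][sx]
            # Integer division with rounding to the nearest value.
            row.append((total + KERNEL_SUM // 2) // KERNEL_SUM)
        output.append(row)
    return output


# A 4 by 4 white channel with a single black pixel at row 2, column 2.
channel = [[255] * 4 for _ in range(4)]
channel[2][2] = 0
print(halve(channel))
```

The single black pixel comes out as a gray value rather than pure black or pure white, which is the antialiasing effect the filter produces on ink strokes.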
After removing the background, we had a mostly white image with some text in black or red. Applying the resizing filter to this image produced some very nice antialiasing effects. As Professor Gleicher noted, these antialiasing effects would be even better if we had used a larger filter size.
Printing was another concern, and it caused many problems. First of all, most printers' grayscale printing is very poor (at least for our uses). Grayscaling in laser printers seems to be based on luminance, and though this works correctly, we are working with handwritten text, not printed fonts. The biggest problem is making the antialiased pixels printable. With luminance-based grayscale printing, we get spotty antialiased pixels which, when viewed, make the text appear blurred. I tried several methods of using pre-defined color values for antialiased pixels, and also simply darkening the antialiased ink pixels, but neither approach produced very good results.
Our note processing algorithms work quite well, and with a few exceptions produce high quality output images that can be used for web content or general viewing. For the exact solution we used, from scanning to automated web site creation, please see the section titled "Our Solution".