Having found a method that accurately removes the background pixels from the image, resized the image using a BSpline filter, and used a C extension to make the program more efficient, we have achieved our initial goal. We can now automate the conversion of scanned note images into a complete web site, using our cleaned and resized images as content.
Although the solution is not exact, most of the test images produced clean, readable output. In the average case, accuracy was above 99 percent: an estimated fewer than 200 pixels per image were incorrectly identified as background, out of an average of 490,000 total pixels in the final image.
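To make the accuracy figure concrete, the short sketch below works through the arithmetic using only the two numbers quoted above. The function name `background_error_rate` is hypothetical, and the sketch assumes that accuracy is simply one minus the fraction of pixels wrongly identified as background; other kinds of error are not counted.

```python
def background_error_rate(misclassified_pixels, total_pixels):
    """Return the fraction of pixels wrongly labelled as background.

    Hypothetical helper for illustration; not part of the program itself.
    """
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    return misclassified_pixels / total_pixels


# Figures quoted in the text: an upper estimate of 200 misclassified pixels
# out of an average of 490,000 pixels in the final image.
error = background_error_rate(200, 490_000)

# Assumption: accuracy is one minus this single error rate.
accuracy = 1.0 - error

print(f"error rate: {error:.4%}")    # error rate: 0.0408%
print(f"accuracy:   {accuracy:.4%}")  # accuracy:   99.9592%
```

Under these assumptions the estimated bound of 200 pixels corresponds to an error rate of roughly 0.04 percent, which is consistent with the "above 99 percent" figure.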
Of course, there are exceptions. The type and color of the ink, the quality of the scanner, how long the scanner lamp has been warming up, the handwriting style, and the quality of the paper all affect the final output images. However, white or yellow paper with light blue and/or light red lines, written on with standard red, black, and blue inks, will produce the results stated above. The one exception is that certain blue inks do not work well with our algorithms and will not produce comparable results. The best results are obtained using yellow or white paper with light lines and black and red inks.