Polled OCR (Optical Character Recognition) Engine

Annapolis Technologies' Document Management Solutions include a highly accurate OCR solution. The OCR Power Server™ (OPS) is the world's most accurate device for recognition of English-language, machine-produced characters. The system employs advanced statistical processing capabilities to achieve unprecedented levels of OCR accuracy.

The OPS is intended for high-volume, high-accuracy applications, either for full-text conversion or for forms processing. Input images should be scanned at 300 dpi for the most accurate output.

In full-text conversion, the OPS is used to convert text from hardcopy documents into electronic form. The text itself may be used as a database in a text-retrieval environment, or it may be used as a robust index to images in an electronic document management system. Typical full-text applications include correspondence tracking, litigation support, and conversion of technical manuals.

Editing of OPS output is streamlined by confidence markers. In full-text conversion, the markers can be used to highlight words to be checked in a post-OCR editing process. The OPS produces highly reliable estimates of recognition confidence, typically 5 times more reliable than those of competing products, eliminating false positives, where data is marked as accurate although it is incorrect. Although the raw OPS accuracy is acceptable without editing for a wide range of applications, the labor for existing OPS applications that do require editing is typically one-third of that required with competing OCR systems. If human-verified accuracy is required, Annapolis Technologies offers this service as well.
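The role of confidence markers in post-OCR editing can be illustrated with a minimal sketch. It assumes the recognizer's output is available as (word, confidence) pairs with confidence between 0 and 1; the actual OPS output format is not described here, so the function name `mark_for_review`, the `[?...?]` marker style, the 0.90 threshold, and the confidence values in the sample are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def mark_for_review(words: List[Tuple[str, float]], threshold: float = 0.90) -> str:
    """Return the text with low-confidence words wrapped in [?...?] markers.

    `words` is a list of (text, confidence) pairs. The pair format and the
    marker style are assumptions made for this sketch, not the OPS format.
    """
    marked = []
    for text, confidence in words:
        if confidence < threshold:
            # Below the threshold: flag the word so an editor checks it.
            marked.append(f"[?{text}?]")
        else:
            # At or above the threshold: pass the word through unchanged.
            marked.append(text)
    return " ".join(marked)


# Invented confidence values, for illustration only.
sample = [("By", 0.99), ("J.", 0.97), ("WT.", 0.41), ("POWELL,", 0.98), ("DIRECTOR.", 0.99)]
print(mark_for_review(sample))  # prints: By J. [?WT.?] POWELL, DIRECTOR.
```

In a workflow like this, an editor reviews only the flagged words, so the reliability of the confidence estimates determines how much text must be checked by hand.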

A US Geological Survey book from 1890 was used for the sample on the back of this page. The non-standard fonts of old typesetting often produce poor OCR results. As the sample shows, the OCR output is nearly perfect even without human cleanup. Please provide us with a sample so that we can demonstrate the accuracy of the OCR on your own data.

US Geological Survey Book
Second Annual Report (1889-1890)

OCR IMAGE OCR TEXT
IRRIGATION SURVEY-SECOND ANNUAL REPORT.

By J. WT. POWELL, DIRECTOR.

HYDROGRAPHY.

SCOPE OF WORK.

Any discussion of the extent to which the lands of the arid region can be redeemed by irrigation, or of the engineering problems involved, requires a comprehensive knowledge of the distribution of water in each catchment basin and the amount available at various points in that basin. As this amount is never the same from day to day and varies greatly from year to year, it becomes necessary to continue investigation