Tesseract Open Source OCR Engine

Tesseract is a free optical character recognition (OCR) engine originally developed at Hewlett-Packard and now developed by Google. It is a raw OCR engine: it has no document layout analysis, no output formatting, and no graphical user interface. It takes a TIFF or BMP image of a single text column, which may be binary, greyscale, or color, and produces text from it. It can distinguish fixed-pitch from proportional text. The engine ranked in the top three for character accuracy in 1995.

Tesseract can process English, French, Italian, German, Spanish, Brazilian Portuguese, and Dutch, and it can be trained to work with other languages as well.
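
As a rough illustration of how the image-in, text-out workflow and the language list fit together, the sketch below builds a tesseract command line for one image. It assumes the usual `tesseract IMAGE OUTPUT_BASE -l LANG` invocation and the three-letter language codes used by the 3.x language data files. The helper names `lang_code` and `ocr_command` and the file names `page.tif` and `page` are hypothetical and are not part of this package.

```shell
#!/bin/sh
# Map a language name to the three-letter code assumed for
# Tesseract 3.x language data (for example eng.traineddata).
lang_code() {
    case "$1" in
        English)    echo eng ;;
        French)     echo fra ;;
        Italian)    echo ita ;;
        German)     echo deu ;;
        Spanish)    echo spa ;;
        Portuguese) echo por ;;
        Dutch)      echo nld ;;
        *)          echo "unknown language: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the command line for a single image.
# Usage: ocr_command IMAGE OUTPUT_BASE LANGUAGE
ocr_command() {
    code=$(lang_code "$3") || return 1
    echo "tesseract $1 $2 -l $code"
}

# Hypothetical file names; tesseract appends .txt to the output base.
ocr_command page.tif page German
```

The sketch only prints the command so that it can be inspected first; running the printed line requires the tesseract binary and the matching language data to be installed.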

Sources inherited from project openSUSE:13.2
Checkout Package
osc -A https://api.opensuse.org checkout openSUSE:13.2:Update/tesseract && cd $_

Source Files

Filename	Size	Changed
tesseract-ocr-3.02.02-doc-html.tar.gz	10.1 MB (10635901 bytes)	over 13 years ago
tesseract-ocr-3.02.02.tar.gz	3.71 MB (3890393 bytes)	over 13 years ago
tesseract.changes	4.66 KB (4775 bytes)	over 12 years ago
tesseract.spec	3.54 KB (3627 bytes)	over 12 years ago
