
Tesseract Open Source OCR Engine

Tesseract is a free optical character recognition (OCR) engine originally
developed at Hewlett-Packard and now developed by Google. It is a raw OCR
engine: it has no document layout analysis, no output formatting, and no
graphical user interface. It takes a TIFF or BMP image of a single column of
text, which may be binary, greyscale or color, and produces plain text from it.
It can distinguish fixed-pitch from proportional text. In 1995 the engine was
among the top three in terms of character accuracy.

Tesseract can process English, French, Italian, German, Spanish, Brazilian
Portuguese and Dutch and can be trained to work in other languages as well.
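
Because the engine has no graphical interface, it is normally driven from the
command line. The following is a minimal sketch, assuming the `tesseract`
binary from this package is on the PATH and accepts the usual
`tesseract <image> <outputbase> -l <lang>` form, writing its result to
`<outputbase>.txt`. The helper names `build_tesseract_command` and `run_ocr`
and the file name `page.tif` are hypothetical and used only for illustration.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for one tesseract run.

    Assumes the command-line form: tesseract <image> <outputbase> -l <lang>.
    "eng" matches the English language data shipped alongside the sources.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run tesseract on a single-column image and return the recognized text.

    Assumes tesseract writes its output to <output_base>.txt.
    Raises subprocess.CalledProcessError if tesseract exits with an error.
    """
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical input file; replace with a real single-column TIFF image.
    print(run_ocr("page.tif", "page"))
```

Keeping the command construction separate from the subprocess call makes it
easy to inspect the arguments without having tesseract installed.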

     Ray Smith <theraysmith@users.sourceforge.net>

Source Files

Filename                               Size     Changed
tesseract-ocr-3.02.02-doc-html.tar.gz  10.1 MB  almost 6 years ago
tesseract-ocr-3.02.02.tar.gz           3.71 MB  almost 6 years ago
tesseract-ocr-3.02.eng.tar.gz          12.1 MB  over 5 years ago
tesseract-ocr-3.02.equ.tar.gz          809 KB   over 5 years ago
tesseract-package-creator2.pl          1.93 KB  over 5 years ago
tesseract.changes                      2.91 KB  over 5 years ago
tesseract.spec                         3.22 KB  over 5 years ago
tesseract.spec.in                      2.2 KB   over 5 years ago
