Text and Metadata Extraction Tool and Library

http://texterize.org

Texterize is a text and metadata extraction tool and library for quickly
getting the text content of a file. It currently supports file formats such as
PDF, Excel, PowerPoint, Word, RTF, WordPerfect, MP3, Ogg, and all OpenDocument
file formats. The output of texterize is either text or XML. It is designed to
work with Unicode input and output, and the default output character set is
UTF-8. Texterize also has a recursive mode, so whole directories (or whole
filesystems) can be converted to text. The recursion also descends into
archives and compressed files such as zip, tar, and gz files.
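
The recursive mode described above can be illustrated with a small Python
sketch. This is not Texterize's implementation or API; it only shows the
general idea of walking a directory tree and descending into zip, tar, and gz
containers. The function names walk_path and walk_bytes, the "!" separator in
logical names, and detection by magic bytes are all assumptions made for this
example.

```python
import gzip
import io
import os
import tarfile
import zipfile


def walk_bytes(name, data):
    """Yield (logical_name, raw_bytes) for every leaf file inside `data`.

    Hypothetical helper: containers are detected by magic bytes and unpacked
    in memory. OpenDocument and newer Office files are themselves zip
    containers, so a real extractor would recognise those formats before
    falling back to generic archive handling.
    """
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        yield from walk_bytes(f"{name}!(gunzipped)", gzip.decompress(data))
    elif data[257:262] == b"ustar":  # tar magic at offset 257
        with tarfile.open(fileobj=io.BytesIO(data)) as tf:
            for member in tf:
                if member.isfile():
                    payload = tf.extractfile(member).read()
                    yield from walk_bytes(f"{name}!{member.name}", payload)
    elif zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield from walk_bytes(f"{name}!{info.filename}", zf.read(info))
    else:
        # Not a recognised container: treat it as a leaf document.
        yield name, data


def walk_path(root):
    """Walk a directory tree and yield every leaf, including archive members."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as fh:
                yield from walk_bytes(path, fh.read())
```

Each yielded leaf would then be handed to a format-specific text extractor.
Reading whole files into memory keeps the sketch short; a tool meant for whole
filesystems would more likely stream large files.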

Source Files
Filename                  Size (bytes)  Size
texterize-0.1.3.tar.bz2   726938        710 KB
texterize-fixes.patch     1232          1.2 KB
texterize.spec            3959          3.87 KB