Text and Metadata Extraction Tool and Library
http://texterize.org
Texterize is a text and metadata extraction tool and library for quickly
getting the text content of a file. It currently supports formats including
PDF, Excel, PowerPoint, Word, RTF, WordPerfect, MP3, Ogg, and all
OpenDocument file formats. Texterize outputs either text or XML. It is
designed to work with Unicode input and output, and the default output
character set is UTF-8. Texterize also has a recursive mode that converts
whole directories (or whole filesystems) to text. This recursion also
descends into archive and compressed files such as zip, tar, and gz files.
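The recursive behaviour described above can be illustrated with a minimal Python sketch. This is not Texterize's own implementation or API: the function names `iter_leaf_files` and `walk` are hypothetical, containers are recognised by file extension only, and the sketch stops at yielding raw bytes where a real tool would run a format-specific text extractor.

```python
import gzip
import io
import os
import tarfile
import zipfile


def iter_leaf_files(name, data):
    """Yield (logical_name, bytes) pairs, descending into zip, tar, and gz containers.

    Hypothetical helper for illustration; detection is by extension only.
    """
    lower = name.lower()
    if lower.endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for info in zf.infolist():
                if not info.is_dir():
                    yield from iter_leaf_files(f"{name}!/{info.filename}", zf.read(info))
    elif lower.endswith((".tar", ".tar.gz", ".tgz")):
        # "r:*" lets tarfile detect the compression transparently.
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tf:
            for member in tf:
                if member.isfile():
                    content = tf.extractfile(member).read()
                    yield from iter_leaf_files(f"{name}!/{member.name}", content)
    elif lower.endswith(".gz"):
        # A bare .gz wraps a single file; strip the suffix and recurse.
        yield from iter_leaf_files(name[:-3], gzip.decompress(data))
    else:
        # Leaf file: a real extractor would convert these bytes to text here.
        yield name, data


def walk(root):
    """Recurse through a directory tree, expanding containers along the way."""
    for dirpath, _dirs, files in os.walk(root):
        for filename in sorted(files):
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as fh:
                yield from iter_leaf_files(path, fh.read())
```

Calling `walk("some/directory")` yields one pair per leaf file, with names such as `bundle.zip!/docs/inner.tar.gz!/a.txt` recording the chain of containers that was opened. Each container is read fully into memory, which keeps the sketch short but would not suit very large archives.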
Checkout Package
osc -A https://api.opensuse.org checkout server:search/texterize && cd $_
Source Files
| Filename | Size | Changed |
|---|---|---|
| texterize-0.1.3.tar.bz2 | 710 KB (726938 bytes) | |
| texterize-fixes.patch | 1.2 KB (1232 bytes) | |
| texterize.spec | 3.87 KB (3959 bytes) | |