A CRF-based word segmenter in Java. Supports Arabic and Chinese.

Some languages require extensive pre-processing of raw text before it can be split into tokens; this step is usually called segmentation.
The Stanford Word Segmenter currently supports Arabic and Chinese. The provided segmentation schemes have been found to work well for a variety of applications.

Source Files

Filename                          Size (bytes)   Size
stanford-segmenter-3.2.0.tar.gz   258877056      247 MB
stanford-segmenter.spec           1414           1.38 KB