Parsing and extracting information from (possibly malformed) HTML/XML documents

TagSoup is a library for parsing HTML/XML. It supports the HTML 5
specification and can parse either well-formed XML or unstructured,
malformed HTML from the web. The library also provides functions for
extracting information from an HTML document, making it well suited
to screen-scraping.

Users should start from the Text.HTML.TagSoup module.
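
As a starting point, the following minimal sketch parses a deliberately malformed fragment and pulls out its links and text. It relies only on parseTags, isTagOpenName, fromAttrib and innerText from Text.HTML.TagSoup. The sample input and the helper names extractLinks and extractText are illustrative assumptions, not part of the library.

```haskell
import Text.HTML.TagSoup

-- Hypothetical sample input: the <p> and <b> tags are never closed.
sampleHtml :: String
sampleHtml =
  "<html><body><p>Hello <b>world<a href=\"https://example.com\">link</a></body></html>"

-- Collect the href attribute of every opening <a> tag.
-- (extractLinks is an illustrative helper, not a TagSoup function.)
extractLinks :: String -> [String]
extractLinks html =
  [ fromAttrib "href" tag
  | tag <- parseTags html
  , isTagOpenName "a" tag
  ]

-- Concatenate all text nodes, ignoring the markup around them.
-- (extractText is likewise an illustrative helper.)
extractText :: String -> String
extractText = innerText . parseTags

main :: IO ()
main = do
  print (extractLinks sampleHtml)
  putStrLn (extractText sampleHtml)
```

Because parseTags produces a flat list of tags rather than a tree, the unclosed tags in the sample do not cause a parse failure. The extraction works by filtering that list.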

Source Files
Filename                 Size       Changed
ghc-tagsoup.changes      2.97 KB    over 3 years
ghc-tagsoup.spec         2.42 KB    over 3 years
tagsoup-0.14.1.tar.gz    43 KB      over 3 years