Parsing and extracting information from (possibly malformed) HTML/XML documents
TagSoup is a library for parsing HTML/XML. It supports the HTML 5 specification and can parse either well-formed XML or unstructured, malformed HTML from the web. The library also provides functions for extracting information from an HTML document, which makes it well suited to screen-scraping. Users should start from the Text.HTML.TagSoup module.
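As a rough illustration of the screen-scraping use case, the sketch below parses a deliberately malformed fragment (an unclosed `<b>` tag) and pulls out link targets and plain text. It uses `parseTags`, `isTagOpenName`, `fromAttrib` and `innerText` from Text.HTML.TagSoup; the helper names `extractLinks` and `pageText` and the sample markup are assumptions made for this example and are not part of the library.

```haskell
module Main where

import Text.HTML.TagSoup (fromAttrib, innerText, isTagOpenName, parseTags)

-- Hypothetical helper: collect the href attribute of every <a> tag.
-- fromAttrib yields an empty string when the attribute is missing,
-- so a caller may want to filter those out.
extractLinks :: String -> [String]
extractLinks html =
  [ fromAttrib "href" tag
  | tag <- parseTags html
  , isTagOpenName "a" tag
  ]

-- Hypothetical helper: concatenate all text nodes, ignoring markup.
pageText :: String -> String
pageText = innerText . parseTags

main :: IO ()
main = do
  -- Sample input with an unclosed <b>; parseTags does not require
  -- well-formed markup and simply returns a flat list of tags.
  let html = "<p>See <a href=\"https://example.org\">this <b>site</a></p>"
  mapM_ putStrLn (extractLinks html)
  putStrLn (pageText html)
```

Because `parseTags` returns a flat list rather than a tree, ordinary list functions and comprehensions are enough for simple extraction tasks like this one.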
Source Files
|Filename|Size|Changed|
|_link|124 Bytes|about 2 months ago|
|ghc-tagsoup.changes|4.03 KB|2 months ago|
|ghc-tagsoup.spec|2.56 KB|2 months ago|
|tagsoup-0.14.7.tar.gz|42.9 KB|2 months ago|