HTML parser based on the WHATWG Web Applications

html5lib is an HTML parser designed to follow the WHATWG HTML5 specification. It handles all
flavours of HTML and parses invalid documents using well-defined error-handling rules compatible
with the behaviour of major desktop web browsers.
The parser outputs a tree structure; the current release supports ElementTree (including
cElementTree and lxml.etree), minidom, and a custom simpletree format.
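
As a rough illustration of the tree output described above, the sketch below parses the same malformed markup into an ElementTree tree and a minidom document. The sample markup and variable names are made up for this example, and it assumes html5lib is installed; treat it as a minimal sketch rather than a complete guide to the API.

```python
import html5lib

# Deliberately invalid markup: no <html>/<head>/<body>, unclosed <p> and <b>.
broken = "<title>Demo</title><p>First<p>Second <b>bold"

# Build an xml.etree.ElementTree tree. namespaceHTMLElements=False keeps
# tag names plain ("p" rather than "{http://www.w3.org/1999/xhtml}p").
root = html5lib.parse(broken, treebuilder="etree", namespaceHTMLElements=False)
paragraphs = root.findall(".//p")

# Build an xml.dom.minidom document from the same input.
dom = html5lib.parse(broken, treebuilder="dom")
dom_paragraphs = dom.getElementsByTagName("p")

print(root.tag, len(paragraphs), len(dom_paragraphs))
```

The missing html, head and body elements are supplied by the parser's error handling, and the second unclosed paragraph becomes a sibling of the first instead of a child.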
html5lib also includes an HTML sanitizer, "treewalkers" for converting various tree formats into
streams, and filters and serializers that operate on those streams.
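
The pieces named above can be chained into a small pipeline: a treewalker turns a parsed tree into a token stream, the sanitizer filter rewrites that stream, and a serializer renders it back to markup. The following is a hedged sketch assuming html5lib is installed; the input string is invented for the example, and newer releases may emit a deprecation warning for the bundled sanitizer.

```python
import html5lib
from html5lib.filters import sanitizer
from html5lib.serializer import HTMLSerializer

# Untrusted fragment containing an element the sanitizer does not allow.
untrusted = "<p>Hello <b>world</b><script>alert(1)</script></p>"

# Parse as a fragment so no html/head/body wrapper elements are added.
fragment = html5lib.parseFragment(untrusted, treebuilder="etree")

# Treewalker: tree -> stream of tokens.
walker = html5lib.getTreeWalker("etree")
stream = walker(fragment)

# Filter: disallowed elements are turned into escaped text.
clean_stream = sanitizer.Filter(stream)

# Serializer: token stream -> markup string.
output = HTMLSerializer().render(clean_stream)
print(output)
```

Because each stage consumes and produces a stream, other filters can be slotted in between the treewalker and the serializer in the same way.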

James Graham

Source Files
Filename                   Size                      Changed
html5lib-1.0.1.tar.gz      247 KB (252959 bytes)     almost 2 years
python-html5lib.changes    5.09 KB (5213 bytes)      11 months
python-html5lib.spec       2.52 KB (2579 bytes)      11 months