HTML parser based on the WHAT-WG Web Applications

html5lib is an HTML parser designed to follow the WHATWG HTML5 specification. It handles all
flavours of HTML and parses invalid documents using well-defined error-handling rules compatible
with the behaviour of major desktop web browsers.
The parser outputs a tree structure; the current release supports ElementTree (including
cElementTree and lxml.etree), minidom, and a custom simpletree format.
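
As an illustration of the error handling and tree output described above, here is a minimal sketch that parses a deliberately malformed fragment into an ElementTree. It assumes the html5lib package is installed and uses its top-level `parse` function; the input string and variable names are made up for the example, and keyword arguments may differ between releases.

```python
import html5lib

# Deliberately malformed input: no <html>, <head> or <body>, and the
# <p> and <b> elements are never closed.
broken = "<p>Hello <b>world"

# treebuilder="etree" selects xml.etree.ElementTree output.
# namespaceHTMLElements=False keeps tag names free of the XHTML namespace,
# which makes the find() calls below simpler.
root = html5lib.parse(broken, treebuilder="etree", namespaceHTMLElements=False)

# The parser has supplied the missing structure and closed the open elements.
body = root.find("body")
paragraph = body.find("p")
bold = paragraph.find("b")

print(root.tag)        # html
print(paragraph.text)  # "Hello " (text before the <b> child)
print(bold.text)       # world
```

Passing `treebuilder="dom"` instead should yield a minidom document, and `treebuilder="lxml"` an lxml tree if lxml is available.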
html5lib also includes an HTML sanitizer, "treewalkers" for converting various tree formats into
streams, and filters and serializers that operate on those streams.
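
The treewalker and serializer pipeline can be sketched roughly as follows. This is a hedged example rather than a reference usage: it assumes html5lib is installed and relies on `html5lib.getTreeWalker` and `html5lib.serializer.HTMLSerializer`. The sanitizer is left out because its interface has changed between releases. The input and variable names are illustrative.

```python
import html5lib
from html5lib.serializer import HTMLSerializer

# Parse a malformed fragment into an ElementTree (the default tree builder).
tree = html5lib.parse("<p>Hello <b>world", treebuilder="etree")

# A treewalker turns a tree of a given format into a stream of tokens.
walker = html5lib.getTreeWalker("etree")
stream = walker(tree)

# Filters, if any, would wrap the stream at this point.
# The serializer then consumes the stream and produces markup.
# omit_optional_tags=False keeps tags such as </p> and <body> in the output.
serializer = HTMLSerializer(omit_optional_tags=False)
html_out = serializer.render(stream)

print(html_out)
```

Because each stage only consumes and produces a token stream, the same serializer can be reused for any tree format that has a treewalker.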

James Graham

Source Files

Filename                                  Size (bytes)   Size      Changed
coerce_comments_to_work_with_lxml.patch   2717           2.65 KB   almost 4 years
html5lib-0.9999999.tar.gz                 889312         868 KB    almost 4 years
python-html5lib.changes                   4143           4.05 KB   almost 4 years
python-html5lib.spec                      2771           2.71 KB   almost 4 years