A number of simple utilities for manipulating HTML and XML files

https://www.w3.org/Tools/HTML-XML-utils/

HTML-XML-utils provides a number of simple utilities for manipulating and
converting HTML and XML files in various ways. The suite consists of the
following tools:

asc2xml - convert from UTF-8 to &#nnn; entities
xml2asc - convert from &#nnn; entities to UTF-8
hxaddid - add IDs to selected elements
hxcite - replace bibliographic references by hyperlinks
hxcite-mkbib - expand references and create bibliography
hxclean - apply heuristics to correct an HTML file
hxcopy - copy an HTML file while preserving relative links
hxcount - count elements and attributes in HTML or XML files
hxextract - extract selected elements
hxincl - expand included HTML or XML files
hxindex - create an alphabetically sorted index
hxmkbib - create bibliography from a template
hxmultitoc - create a table of contents for a set of HTML files
hxname2id - move some ID= or NAME= from A elements to their parents
hxnormalize - pretty-print an HTML file
hxnsxml - convert output of hxxmlns back to normal XML
hxnum - number section headings in an HTML file
hxpipe - convert XML to a format easier to parse with Perl or AWK
hxprintlinks - number links & add table of URLs at end of an HTML file
hxprune - remove marked elements from an HTML file
hxref - generate cross-references
hxselect - extract elements that match a (CSS) selector
hxtoc - insert a table of contents in an HTML file
hxuncdata - replace CDATA sections by character entities
hxunent - replace HTML predefined character entities with UTF-8 characters
hxunpipe - convert output of hxpipe back to XML format
hxunxmlns - replace "global names" by XML Namespace prefixes
hxwls - list links in an HTML file
hxxmlns - replace XML Namespace prefixes by "global names"
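
To make the asc2xml and xml2asc descriptions above concrete, here is a minimal Python sketch of the kind of conversion they describe. It is not the tools' implementation: the function names `to_char_refs` and `from_char_refs` are hypothetical, only decimal references are handled, and leaving ASCII references such as &#60; untouched is an assumption made here so that markup-significant characters stay escaped.

```python
import re


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#nnn; reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_char_refs(text: str) -> str:
    """Replace decimal &#nnn; references above 127 with the character itself."""

    def decode(match):
        code_point = int(match.group(1))
        # Assumption: ASCII references (e.g. &#60;) and out-of-range values
        # are left as written rather than decoded.
        if 127 < code_point <= 0x10FFFF:
            return chr(code_point)
        return match.group(0)

    return re.sub(r"&#([0-9]+);", decode, text)


print(to_char_refs("café"))           # prints caf&#233;
print(from_char_refs("caf&#233;"))    # prints café
```

The round trip is lossless for text that contains no pre-existing numeric references, which is the property that makes this kind of conversion useful when a pipeline stage is not UTF-8 safe.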

Source Files
Filename                    Size (bytes)  Size
html-xml-utils-7.8.tar.gz   408201        399 KB
html-xml-utils.changes      2372          2.32 KB
html-xml-utils.spec         1560          1.52 KB
Revision 5 (latest revision is 11)
Dominique Leuenberger (dimstar_suse) accepted request 743106 from Sebastian Wagner (sebix) (revision 5)
- update to version 7.8:
 * textwrap.c, langinfo.c, hxnormalize.c: Added knowledge about
   languages that do not use spaces between words. In such languages,
   a newline should not be converted to a space in outc() in
   textwrap.c, but only to a break opportunity.
 * hxtoc.c: The element to group headings in HTML5 is called
   HGROUP, not HEADER. The heading of a section (SECTION, ARTICLE,
   etc.) need not be the first element; there may be non-header
   elements before it.
 * hxwls.c: Print "longdesc", "classid" or "codebase" in the second
   column for the corresponding attribute. Also recognize srcset
   (somewhat).
 * hxnormalize.c: Added option -X to indicate the input is XML
   instead of HTML. Handle conversion of CDATA elements to XML by
   escaping < and & instead of adding <![CDATA[. Added corresponding
   test normalize13.sh.
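
The last changelog entry says that, for XML output, the content of CDATA elements is escaped rather than wrapped in a <![CDATA[ section. The Python sketch below illustrates that idea only; it is not the code in hxnormalize.c, and the function name `escape_raw_text` and the sample script body are hypothetical.

```python
def escape_raw_text(text: str) -> str:
    """Escape & and < so raw text can appear as ordinary XML character data."""
    # Escape & first so the & introduced by &lt; is not escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;")


script_body = "if (a < b && c) run();"
print("<script>" + escape_raw_text(script_body) + "</script>")
# prints <script>if (a &lt; b &amp;&amp; c) run();</script>
```

Escaping keeps the output valid XML without relying on CDATA sections; only < and & need handling here, matching the two characters the changelog names.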