A number of simple utilities for manipulating HTML and XML files
HTML-XML-utils provides a number of simple utilities for manipulating and
converting HTML and XML files in various ways. The suite consists of the
following tools:
asc2xml - convert from UTF-8 to &#nnn; entities
xml2asc - convert from &#nnn; entities to UTF-8
hxaddid - add IDs to selected elements
hxcite - replace bibliographic references by hyperlinks
hxcite-mkbib - expand references and create bibliography
hxclean - apply heuristics to correct an HTML file
hxcopy - copy an HTML file while preserving relative links
hxcount - count elements and attributes in HTML or XML files
hxextract - extract selected elements
hxincl - expand included HTML or XML files
hxindex - create an alphabetically sorted index
hxmkbib - create bibliography from a template
hxmultitoc - create a table of contents for a set of HTML files
hxname2id - move some ID= or NAME= from A elements to their parents
hxnormalize - pretty-print an HTML file
hxnsxml - convert output of hxxmlns back to normal XML
hxnum - number section headings in an HTML file
hxpipe - convert XML to a format easier to parse with Perl or AWK
hxprintlinks - number links & add table of URLs at end of an HTML file
hxprune - remove marked elements from an HTML file
hxref - generate cross-references
hxselect - extract elements that match a (CSS) selector
hxtoc - insert a table of contents in an HTML file
hxuncdata - replace CDATA sections by character entities
hxunent - replace HTML predefined character entities by UTF-8
hxunpipe - convert output of hxpipe back to XML format
hxunxmlns - replace "global names" by XML Namespace prefixes
hxwls - list links in an HTML file
hxxmlns - replace XML Namespace prefixes by "global names"
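
The two character-reference conversions at the top of the list (UTF-8 to `&#nnn;` entities and back) can be illustrated with a minimal Python sketch. This is not the utilities' own code, which is written in C; the function names `to_numeric_entities` and `from_numeric_entities` are hypothetical and exist only for this example.

```python
import re


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#nnn; reference."""
    # The "xmlcharrefreplace" error handler emits &#nnn; for each
    # character that cannot be encoded as ASCII.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_numeric_entities(text: str) -> str:
    """Replace decimal &#nnn; references with the characters they denote."""
    return re.sub(r"&#([0-9]+);", lambda m: chr(int(m.group(1))), text)


sample = "café"
encoded = to_numeric_entities(sample)
print(encoded)                                    # caf&#233;
print(from_numeric_entities(encoded) == sample)   # True
```

Note that this sketch decodes every decimal reference, including ones such as `&#60;` that stand for markup characters, so it is only suitable for illustrating the idea and not for processing real documents.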
- Developed at utilities
- Sources inherited from project openSUSE:Factory
- 1 derived package
- Checkout Package
osc -A https://api.opensuse.org checkout openSUSE:Backports:SLE-15-SP4:FactoryCandidates/html-xml-utils && cd $_
Source Files
Filename | Size | Changed |
---|---|---|
html-xml-utils-7.8.tar.gz | 399 KB (408201 bytes) | |
html-xml-utils.changes | 2.32 KB (2372 bytes) | |
html-xml-utils.spec | 1.52 KB (1560 bytes) | |
Revision 5 (latest revision is 11)
- update to version 7.8:
  * textwrap.c, langinfo.c, hxnormalize.c: Added knowledge about languages that do not use spaces between words. In such languages, a newline should not be converted to a space in outc() in textwrap.c, but only to a break opportunity.
  * hxtoc.c: The element to group headings in HTML5 is called HGROUP, not HEADER. The heading of a section (SECTION, ARTICLE, etc.) need not be the first element, there may be non-header elements before it.
  * hxwls.c: Print "longdesc", "classid" or "codebase" in the second column for the corresponding attribute. Also recognize srcset (somewhat).
  * hxnormalize.c: Added option -X to indicate the input is XML instead of HTML. Handle conversion of CDATA elements to XML by escaping < and & instead of adding <![CDATA[. Added corresponding test normalize13.sh.
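
The idea of escaping `<` and `&` rather than wrapping text in `<![CDATA[`, which is also what the hxuncdata entry in the tool list describes, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (a regular expression in place of a real parser), not the code in hxnormalize.c or hxuncdata; the names `escape_text` and `uncdata` are hypothetical.

```python
import re

# Matches one CDATA section and captures its raw text content.
CDATA_SECTION = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def escape_text(text: str) -> str:
    """Escape the two characters that would otherwise start markup."""
    # "&" must be replaced first so the "&" in "&lt;" is not escaped again.
    return text.replace("&", "&amp;").replace("<", "&lt;")


def uncdata(markup: str) -> str:
    """Replace each CDATA section with its content, escaped as plain text."""
    return CDATA_SECTION.sub(lambda m: escape_text(m.group(1)), markup)


print(uncdata("<p><![CDATA[a < b && c]]></p>"))
# <p>a &lt; b &amp;&amp; c</p>
```

The escaped form carries the same character data as the CDATA section, so an XML parser reads both versions as identical text.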