Java library for working with HTML

http://jsoup.org/

jsoup is a Java library for working with HTML.
It provides an API for extracting and manipulating data,
using DOM, CSS, and jQuery-like methods.

jsoup implements the WHATWG HTML5 specification.

- scrapes and parses HTML from a URL, file, or string
- finds and extracts data, using DOM traversal or CSS selectors
- manipulates the HTML elements, attributes, and text
- cleans user-submitted content against a safe whitelist to prevent XSS attacks
- outputs tidied HTML
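
As a rough illustration of the parse, select, and manipulate steps listed above, here is a minimal sketch. It assumes the jsoup jar is on the classpath; the class name JsoupSketch, the sample markup, and the helper methods are hypothetical and exist only for this example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class JsoupSketch {
    // Hypothetical sample input, not taken from any real page.
    static final String HTML =
        "<html><body><div id=\"news\">"
        + "<a href=\"/one\">First</a> <a href=\"/two\">Second</a>"
        + "</div></body></html>";

    /** Returns the href of the first element matched by the CSS query, or "" if nothing matches. */
    static String firstHref(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        Element link = doc.select(cssQuery).first();
        return link == null ? "" : link.attr("href");
    }

    /** Collects the visible text of every element matched by the CSS query. */
    static List<String> texts(String html, String cssQuery) {
        List<String> result = new ArrayList<>();
        for (Element el : Jsoup.parse(html).select(cssQuery)) {
            result.add(el.text());
        }
        return result;
    }

    /** Sets rel="nofollow" on every link and returns the resulting body HTML. */
    static String addNofollow(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("a[href]").attr("rel", "nofollow");
        return doc.body().html();
    }

    public static void main(String[] args) {
        System.out.println(firstHref(HTML, "#news a"));
        System.out.println(texts(HTML, "#news a"));
        System.out.println(addNofollow(HTML));
    }
}
```

The same Document object could also be obtained from a file or a URL; the string input is used here only to keep the sketch self-contained.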

jsoup can deal with invalid HTML tag soup.
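
To make the tag soup handling concrete, the following sketch feeds jsoup a fragment with unclosed tags and inspects the resulting tree. It assumes the jsoup jar is on the classpath; the class name TagSoupSketch and the sample input are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class TagSoupSketch {
    // Hypothetical malformed input: no <html>/<body>, and neither <p> nor <b> is closed.
    static final String SOUP = "<p>One<p>Two <b>bold";

    /** Parses the input and returns how many elements match the CSS query. */
    static int count(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        return doc.select(cssQuery).size();
    }

    /** Parses the input and returns the combined text of the matched elements. */
    static String text(String html, String cssQuery) {
        return Jsoup.parse(html).select(cssQuery).text();
    }

    public static void main(String[] args) {
        // The parser builds a full document around the fragment and closes open tags.
        System.out.println(count(SOUP, "p"));
        System.out.println(text(SOUP, "b"));
        System.out.println(Jsoup.parse(SOUP).body().html());
    }
}
```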

Source Files

Filename             Size       Changed
generate-tarball.sh  480 Bytes  over 1 year ago
jsoup-1.11.3.tar.gz  237 KB     over 1 year ago
jsoup-build.xml      6.99 KB    over 1 year ago
jsoup.changes        578 Bytes  over 1 year ago
jsoup.spec           2.72 KB    over 1 year ago