Parser Class

A simple HTML parser used to extract blocks of HTML from a document.

Constructors

public Parser( String html )

Methods

getElementByAttributes( String tagName, String attributeName, String attributeValue ) returns javaxt.html.Element
Returns the first HTML Element found in the HTML document with the given tag name, attribute name, and attribute value. Returns null if no matching element was found.
getElementByID( String id ) returns javaxt.html.Element
Returns an HTML Element with a given id. Returns null if the element was not found.
getElementByTagName( String tagName ) returns javaxt.html.Element
Returns the first HTML Element found in the HTML document with the given tag name. Returns null if no matching element was found.
getElements( String tagName, String attributeName, String attributeValue ) returns javaxt.html.Element[]
Returns an array of HTML Elements found in the HTML document with the given tag name, attribute name, and attribute value (e.g. "div", "class", "hdr2").
getElementsByTagName( String tagName ) returns javaxt.html.Element[]
Returns an array of HTML Elements found in the HTML document with the given tag name.
getHTML( ) returns String
getImageLinks( ) returns String[]
Returns an array of links to images. The links may include relative paths. Use the getAbsolutePath method to resolve the relative paths to fully qualified URLs.
MapPath( String relPath, java.net.URL url ) returns String
Returns a fully qualified URL for a given path. Returns null if the function fails to resolve the path.
relPath: Relative path to a file (e.g. "../images/header.jpg")
url: URL that is sourcing the relPath (e.g. "http://acme.com/about/")
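
The kind of resolution MapPath describes can be illustrated with the standard library alone. The sketch below is not the javaxt implementation; the class PathResolutionSketch and its resolve method are hypothetical names, and it assumes that "failing to resolve" means the inputs cannot be parsed as URIs.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of javaxt: shows how a relative path
// can be resolved against the URL of the page that references it.
class PathResolutionSketch {

    // Returns a fully qualified URL for relPath, or null if either
    // argument cannot be parsed (an assumed failure condition).
    static String resolve(String relPath, String pageUrl) {
        try {
            URI base = new URI(pageUrl);
            // URI.resolve merges the paths and collapses "../" segments
            return base.resolve(new URI(relPath)).toString();
        } catch (URISyntaxException | NullPointerException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Uses the example values from the parameter descriptions above
        System.out.println(resolve("../images/header.jpg", "http://acme.com/about/"));
        // prints "http://acme.com/images/header.jpg"
    }
}
```

With the example values from the parameter list, the relative path climbs out of "/about/" and lands in "/images/", which is the behavior a caller would typically expect when resolving the results of getImageLinks.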
setHTML( String html ) returns void
Replaces the HTML that the parser operates on, resetting the "scope" of the parser.
stripHTMLTags( String html ) returns String
Removes any HTML tags from a block of text.
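
As a rough illustration of what tag stripping involves, here is a minimal sketch using a regular expression. It is an assumption for illustration only, not the library's actual code: the class TagStripSketch and its stripTags method are hypothetical, and the naive pattern does not handle comments, script or style bodies, a ">" inside an attribute value, or character entities.

```java
// Hypothetical helper, not part of javaxt: a naive regex-based tag stripper.
class TagStripSketch {

    // Removes anything that looks like <...> and leaves the text between tags.
    // Null input is passed through unchanged (an assumed convention).
    static String stripTags(String html) {
        if (html == null) return null;
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        System.out.println(stripTags("<p>Hello <b>world</b>!</p>"));
        // prints "Hello world!"
    }
}
```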