Package de.l3s.boilerpipe.sax

Classes for parsing HTML into Boilerpipe TextDocuments and for producing HTML from them.


Interface Summary
InputSourceable An InputSourceable can return an arbitrary number of new InputSources for a given document.
TagAction Defines an action that is to be performed whenever a particular tag occurs during HTML parsing.
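
The idea behind a tag action (run some behaviour whenever a particular tag occurs during parsing) can be illustrated with a minimal sketch that uses only the JDK's SAX parser. The names SimpleTagAction, BlockCollector, BLOCK and IGNORE are hypothetical and are not Boilerpipe's API; the real TagAction interface and TagActionMap may look different. The sketch also assumes well-formed XHTML input, because the JDK parser is an XML parser and does not tolerate arbitrary HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class TagActionSketch {

    /** Hypothetical stand-in for a tag action; not Boilerpipe's TagAction signature. */
    interface SimpleTagAction {
        void start(BlockCollector c);
        void end(BlockCollector c);
    }

    /** SAX handler that looks up an action per tag name and collects text blocks. */
    static class BlockCollector extends DefaultHandler {
        final Map<String, SimpleTagAction> actions;
        final List<String> blocks = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        int ignoreDepth = 0; // > 0 while inside an ignored element

        BlockCollector(Map<String, SimpleTagAction> actions) {
            this.actions = actions;
        }

        /** Closes the current text block, if it contains any non-whitespace text. */
        void flushBlock() {
            String s = text.toString().trim().replaceAll("\\s+", " ");
            if (!s.isEmpty()) {
                blocks.add(s);
            }
            text.setLength(0);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            SimpleTagAction a = actions.get(qName.toLowerCase());
            if (a != null) {
                a.start(this);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            SimpleTagAction a = actions.get(qName.toLowerCase());
            if (a != null) {
                a.end(this);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (ignoreDepth == 0) {
                text.append(ch, start, length);
            }
        }
    }

    /** Block-level tags start and end a text block. */
    static final SimpleTagAction BLOCK = new SimpleTagAction() {
        public void start(BlockCollector c) { c.flushBlock(); }
        public void end(BlockCollector c) { c.flushBlock(); }
    };

    /** Ignorable tags suppress all text nested inside them. */
    static final SimpleTagAction IGNORE = new SimpleTagAction() {
        public void start(BlockCollector c) { c.ignoreDepth++; }
        public void end(BlockCollector c) { c.ignoreDepth--; }
    };

    public static List<String> extract(String xhtml) throws Exception {
        Map<String, SimpleTagAction> actions = new HashMap<>();
        actions.put("p", BLOCK);
        actions.put("div", BLOCK);
        actions.put("script", IGNORE);
        BlockCollector handler = new BlockCollector(actions);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        handler.flushBlock(); // emit any trailing text
        return handler.blocks;
    }

    public static void main(String[] args) throws Exception {
        String doc = "<html><body><div>Menu</div><p>Hello <b>world</b></p>"
                + "<script>var x = 1;</script><p>Second   paragraph</p></body></html>";
        System.out.println(extract(doc)); // prints [Menu, Hello world, Second paragraph]
    }
}
```

Tags without an entry in the map (here the inline b element) are treated as plain inline content, so their text joins the surrounding block. Keeping the tag-to-action mapping in a map rather than in the handler itself is the design choice this sketch is meant to show: the set of actions can be changed without touching the parsing code.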
 

Class Summary
BoilerpipeHTMLContentHandler A simple SAX ContentHandler, used by BoilerpipeSAXInput.
BoilerpipeHTMLParser A simple SAX Parser, used by BoilerpipeSAXInput.
BoilerpipeSAXInput Parses an InputSource using SAX and returns a TextDocument.
CommonTagActions Defines commonly used TagActions that are performed whenever a particular tag occurs during HTML parsing.
CommonTagActions.BlockTagLabelAction A TagAction for block-level elements that triggers a LabelAction on the generated TextBlock.
CommonTagActions.Chained  
CommonTagActions.InlineTagLabelAction A TagAction for inline elements that triggers a LabelAction on the generated TextBlock.
DefaultTagActionMap Default TagActions.
HTMLDocument An InputSourceable for HTMLFetcher.
HTMLFetcher A very simple HTTP/HTML fetcher, really just for demo purposes.
HTMLHighlighter Highlights text blocks in an HTML document that have been marked as "content" in the corresponding TextDocument.
TagActionMap Base class for defining a set of TagActions to be used during HTML parsing.
 
