de.l3s.boilerpipe.sax
Class HTMLHighlighter

java.lang.Object
  extended by de.l3s.boilerpipe.sax.HTMLHighlighter

public final class HTMLHighlighter
extends java.lang.Object

Highlights text blocks in an HTML document that have been marked as "content" in the corresponding TextDocument.

Author:
Christian Kohlschütter

Constructor Summary
HTMLHighlighter(TextDocument doc, org.xml.sax.InputSource is)
          Prepares the HTMLHighlighter for the given TextDocument and the original HTML text (as an InputSource).
HTMLHighlighter(TextDocument doc, java.lang.String origHTML)
          Prepares the HTMLHighlighter for the given TextDocument and the original HTML text (as a String).
 
Method Summary
 java.lang.String getHTML()
          Returns the highlighted HTML code.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLHighlighter

public HTMLHighlighter(TextDocument doc,
                       java.lang.String origHTML)
                throws BoilerpipeProcessingException
Prepares the HTMLHighlighter for the given TextDocument and the original HTML text (as a String).

Parameters:
doc - The processed TextDocument, in which the content text blocks have already been marked.
origHTML - The original HTML document.
Throws:
BoilerpipeProcessingException

HTMLHighlighter

public HTMLHighlighter(TextDocument doc,
                       org.xml.sax.InputSource is)
                throws BoilerpipeProcessingException
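Prepares the HTMLHighlighter for the given TextDocument and the original HTML text (as an InputSource). If the same InputSource was already used to create the TextDocument, re-initialize it before passing it here (for example, by creating a new InputSource over the same HTML), because its underlying stream has already been read.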

Parameters:
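doc - The processed TextDocument, in which the content text blocks have already been marked.
is - The original HTML document, as an InputSource that has not yet been read.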
Throws:
BoilerpipeProcessingException
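
The note about re-initializing the InputSource can be illustrated with the standard library alone. The following is a minimal sketch, not part of this class: the class name InputSourceReuse and the helpers freshSource and drain are hypothetical, and it assumes the HTML is available as a String so that a new InputSource can be built on demand. It shows that an InputSource backed by a Reader yields nothing once it has been read to the end.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import org.xml.sax.InputSource;

public class InputSourceReuse {

    // Hypothetical helper: builds a new, unread InputSource over the same HTML text.
    public static InputSource freshSource(String html) {
        return new InputSource(new StringReader(html));
    }

    // Hypothetical helper: reads everything that is still available from the
    // InputSource's character stream, roughly as a parser would.
    public static String drain(InputSource is) {
        try {
            Reader reader = is.getCharacterStream();
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello</p></body></html>";

        InputSource is = freshSource(html);
        String firstRead = drain(is);   // stands in for creating the TextDocument
        String secondRead = drain(is);  // stands in for reusing the same InputSource

        System.out.println("first read matches html: " + firstRead.equals(html));
        System.out.println("second read is empty:    " + secondRead.isEmpty());

        // Re-initializing means building a new InputSource over the same HTML.
        String afterReinit = drain(freshSource(html));
        System.out.println("after re-init matches:   " + afterReinit.equals(html));
    }
}
```

When the HTML is already held as a String, the String-based constructor above avoids the issue, since no shared stream is involved on the caller's side.
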
Method Detail

getHTML

public java.lang.String getHTML()
Returns the highlighted HTML code.

Returns:
The highlighted HTML code.