de.l3s.boilerpipe.filters.english
Class IgnoreBlocksAfterContentFromEndFilter

java.lang.Object
  extended by de.l3s.boilerpipe.filters.english.IgnoreBlocksAfterContentFromEndFilter
All Implemented Interfaces:
BoilerpipeFilter

public final class IgnoreBlocksAfterContentFromEndFilter
extends java.lang.Object
implements BoilerpipeFilter

Marks as "non-content" all blocks that occur after a block labeled DefaultLabels.INDICATES_END_OF_TEXT and after any content block. This filter can be used downstream of a TerminatingBlocksFinder.
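
The following is a minimal sketch of one simplified reading of the description above: a backward scan that demotes blocks carrying the end-of-text marker until it reaches a content block without that marker. It is not the library's implementation. EndOfTextSketch, Block, and ignoreFromEnd are hypothetical stand-ins introduced for illustration, and the boolean endOfText field merely models the DefaultLabels.INDICATES_END_OF_TEXT label.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical stand-ins; these are NOT boilerpipe's TextBlock or TextDocument.
final class EndOfTextSketch {

    /** Minimal stand-in for a text block. */
    static final class Block {
        final String text;
        boolean content;          // models TextBlock's content flag
        final boolean endOfText;  // models the INDICATES_END_OF_TEXT label

        Block(String text, boolean content, boolean endOfText) {
            this.text = text;
            this.content = content;
            this.endOfText = endOfText;
        }
    }

    /**
     * Walks the blocks from last to first. Blocks carrying the end-of-text
     * marker are set to non-content; the walk stops at the first content
     * block that does not carry the marker.
     *
     * @return true if at least one block was changed
     */
    static boolean ignoreFromEnd(List<Block> blocks) {
        boolean changes = false;
        ListIterator<Block> it = blocks.listIterator(blocks.size());
        while (it.hasPrevious()) {
            Block b = it.previous();
            if (b.endOfText) {
                if (b.content) {
                    b.content = false;
                    changes = true;
                }
            } else if (b.content) {
                break; // reached real content; leave everything before it alone
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Article body", true, false));
        blocks.add(new Block("Comments (12)", true, true));
        blocks.add(new Block("Footer links", false, false));

        boolean changed = ignoreFromEnd(blocks);
        System.out.println("changed=" + changed);
        for (Block b : blocks) {
            System.out.println(b.content + "\t" + b.text);
        }
    }
}
```

In this model the scan is idempotent: a second call on the same list finds nothing left to demote and returns false.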

Author:
Christian Kohlschütter
See Also:
TerminatingBlocksFinder

Field Summary
static IgnoreBlocksAfterContentFromEndFilter INSTANCE
           
 
Method Summary
protected static int getNumFullTextWords(TextBlock tb)
           
protected static int getNumFullTextWords(TextBlock tb, float minTextDensity)
           
 boolean process(TextDocument doc)
          Processes the given document doc.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final IgnoreBlocksAfterContentFromEndFilter INSTANCE

Method Detail

process

public boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.

Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException
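
Because process reports whether it changed the document, several filters can be run in sequence and their results combined. The sketch below illustrates that pattern under stated assumptions: FilterChainSketch, Filter, and runAll are hypothetical names, the Filter interface only mirrors the boolean-returning shape of process, and the checked BoilerpipeProcessingException is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; Filter only mimics the shape of BoilerpipeFilter.process.
final class FilterChainSketch {

    /** Stand-in filter contract: return true if the document was changed. */
    interface Filter<D> {
        boolean process(D doc);
    }

    /** Runs every filter in order and reports whether any of them changed the document. */
    static <D> boolean runAll(D doc, List<Filter<D>> filters) {
        boolean changed = false;
        for (Filter<D> f : filters) {
            // |= always evaluates the right-hand side, so every filter runs
            // even after an earlier one has already reported a change.
            changed |= f.process(doc);
        }
        return changed;
    }

    public static void main(String[] args) {
        List<String> doc = new ArrayList<>(Arrays.asList("text", "", "more text"));

        List<Filter<List<String>>> filters = new ArrayList<>();
        filters.add(d -> d.removeIf(String::isEmpty)); // upstream step: drops empty entries
        filters.add(d -> false);                       // downstream step: no-op here

        System.out.println("changed=" + runAll(doc, filters));
        System.out.println(doc);
    }
}
```

With the real library, the analogous sequence would presumably call process on an upstream TerminatingBlocksFinder and then on IgnoreBlocksAfterContentFromEndFilter.INSTANCE for the same TextDocument.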

getNumFullTextWords

protected static int getNumFullTextWords(TextBlock tb)

getNumFullTextWords

protected static int getNumFullTextWords(TextBlock tb,
                                         float minTextDensity)