de.l3s.boilerpipe.filters.heuristics
Class KeepLargestBlockFilter

java.lang.Object
  extended by de.l3s.boilerpipe.filters.heuristics.KeepLargestBlockFilter
All Implemented Interfaces:
BoilerpipeFilter

public final class KeepLargestBlockFilter
extends java.lang.Object
implements BoilerpipeFilter

Keeps only the largest TextBlock, measured by its number of words. If several blocks share the same word count, the first one is chosen. All discarded blocks are marked "not content" and flagged as DefaultLabels.MIGHT_BE_CONTENT.
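
The selection rule described above can be sketched in isolation. This is a minimal illustration, not the library's implementation: the Block class, its content and mightBeContent fields, and the keepLargest method are hypothetical stand-ins for TextBlock, its content flag, and the DefaultLabels.MIGHT_BE_CONTENT label. The sketch also assumes that every block in the list is a candidate, which the description does not state.

```java
import java.util.Arrays;
import java.util.List;

public class KeepLargestBlockSketch {

    /** Hypothetical stand-in for a text block; not the library's TextBlock. */
    public static final class Block {
        public final int numWords;
        public boolean content = true;          // stands in for the "content" flag
        public boolean mightBeContent = false;  // stands in for the MIGHT_BE_CONTENT label

        public Block(int numWords) {
            this.numWords = numWords;
        }
    }

    /**
     * Keeps the first block with the highest word count and demotes all others.
     * Returns the index of the kept block, or -1 if the list is empty.
     */
    public static int keepLargest(List<Block> blocks) {
        int largest = -1;
        int maxWords = -1;
        for (int i = 0; i < blocks.size(); i++) {
            // Strict '>' means a later block of equal size never displaces an earlier one.
            if (blocks.get(i).numWords > maxWords) {
                maxWords = blocks.get(i).numWords;
                largest = i;
            }
        }
        for (int i = 0; i < blocks.size(); i++) {
            Block b = blocks.get(i);
            if (i == largest) {
                b.content = true;
            } else {
                // Discarded blocks are marked "not content" but remain flagged as candidates.
                b.content = false;
                b.mightBeContent = true;
            }
        }
        return largest;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
                new Block(4), new Block(12), new Block(12), new Block(7));
        System.out.println("kept block index: " + keepLargest(blocks));
    }
}
```

With the sample word counts 4, 12, 12 and 7, the two 12-word blocks tie and the sketch keeps the earlier one, at index 1.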

Author:
Christian Kohlschütter

Field Summary
static KeepLargestBlockFilter INSTANCE
 
Constructor Summary
KeepLargestBlockFilter()
 
Method Summary
 boolean process(TextDocument doc)
          Processes the given document doc.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final KeepLargestBlockFilter INSTANCE
Constructor Detail

KeepLargestBlockFilter

public KeepLargestBlockFilter()
Method Detail

process

public boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.

Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException