de.l3s.boilerpipe.filters.heuristics
Class BlockProximityFusion

java.lang.Object
  extended by de.l3s.boilerpipe.filters.heuristics.BlockProximityFusion
All Implemented Interfaces:
BoilerpipeFilter

public final class BlockProximityFusion
extends java.lang.Object
implements BoilerpipeFilter

Fuses adjacent blocks if their distance (measured in blocks) does not exceed a given limit. This probably makes sense only when an upstream filter has already removed some blocks.
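
The fusion rule described above can be illustrated with a minimal, self-contained sketch. It does not use the boilerpipe classes and is not the library's implementation: the Block type, its offset fields, and the fuse helper are hypothetical names introduced only for this example. The sketch also assumes that "distance" means the number of missing blocks between two neighbours, so directly adjacent blocks have distance 0. It shows why the filter is mainly useful after an upstream filter has removed blocks: only then do gaps appear between the remaining blocks.

```java
import java.util.ArrayList;
import java.util.List;

public class ProximityFusionSketch {

    /** Hypothetical stand-in for a text block: original position range plus text. */
    public static final class Block {
        public int startOffset;
        public int endOffset;
        public String text;

        public Block(int offset, String text) {
            this.startOffset = offset;
            this.endOffset = offset;
            this.text = text;
        }

        /** Appends the next block's text and extends this block's range. */
        void mergeNext(Block next) {
            text = text + "\n" + next.text;
            endOffset = next.endOffset;
        }
    }

    /**
     * Fuses each block into its predecessor when the number of missing
     * blocks between them is at most maxBlocksDistance (an assumed
     * definition of distance). Returns true if any fusion happened.
     */
    public static boolean fuse(List<Block> blocks, int maxBlocksDistance) {
        if (blocks.size() < 2) {
            return false;
        }
        boolean changed = false;
        List<Block> result = new ArrayList<>();
        Block prev = blocks.get(0);
        result.add(prev);
        for (int i = 1; i < blocks.size(); i++) {
            Block current = blocks.get(i);
            int gap = current.startOffset - prev.endOffset - 1;
            if (gap <= maxBlocksDistance) {
                prev.mergeNext(current); // current is absorbed into prev
                changed = true;
            } else {
                result.add(current);
                prev = current;
            }
        }
        blocks.clear();
        blocks.addAll(result);
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block(0, "First paragraph"));
        blocks.add(new Block(2, "Second paragraph")); // block 1 was removed upstream
        blocks.add(new Block(6, "Footer"));           // blocks 3 to 5 were removed upstream
        boolean changed = fuse(blocks, 1);
        System.out.println(changed + " " + blocks.size()); // prints "true 2"
    }
}
```

With a limit of 1, the first two blocks are fused because only one block is missing between them, while the footer stays separate because three blocks are missing. The boolean result mirrors the contract of process, which reports whether the document was changed.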

Author:
Christian Kohlschütter

Field Summary
static BlockProximityFusion MAX_DISTANCE_1
static BlockProximityFusion MAX_DISTANCE_1_CONTENT_ONLY
static BlockProximityFusion MAX_DISTANCE_1_CONTENT_ONLY_SAME_TAGLEVEL
static BlockProximityFusion MAX_DISTANCE_1_SAME_TAGLEVEL
 
Constructor Summary
BlockProximityFusion(int maxBlocksDistance, boolean contentOnly, boolean sameTagLevelOnly)
          Creates a new BlockProximityFusion instance.
 
Method Summary
 boolean process(TextDocument doc)
          Processes the given document doc.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MAX_DISTANCE_1

public static final BlockProximityFusion MAX_DISTANCE_1

MAX_DISTANCE_1_SAME_TAGLEVEL

public static final BlockProximityFusion MAX_DISTANCE_1_SAME_TAGLEVEL

MAX_DISTANCE_1_CONTENT_ONLY

public static final BlockProximityFusion MAX_DISTANCE_1_CONTENT_ONLY

MAX_DISTANCE_1_CONTENT_ONLY_SAME_TAGLEVEL

public static final BlockProximityFusion MAX_DISTANCE_1_CONTENT_ONLY_SAME_TAGLEVEL

Constructor Detail

BlockProximityFusion

public BlockProximityFusion(int maxBlocksDistance,
                            boolean contentOnly,
                            boolean sameTagLevelOnly)
Creates a new BlockProximityFusion instance.

Parameters:
maxBlocksDistance - The maximum distance in blocks.
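contentOnly - If true, only blocks marked as content are fused.
sameTagLevelOnly - If true, only blocks at the same tag level are fused.
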
Method Detail

process

public boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.

Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException