Package de.l3s.boilerpipe.filters.heuristics

The BoilerpipeFilters in this package are pure heuristics.


Class Summary
BlockProximityFusion Fuses adjacent blocks if their distance (in blocks) does not exceed a certain limit.
DocumentTitleMatchClassifier Marks TextBlocks that contain parts of the text of the HTML <TITLE> tag, using heuristics that are quite specific to the news domain.
ExpandTitleToContentFilter Marks as "content" all TextBlocks that lie between the headline and the part that has already been marked as content, provided they are labeled TextBlockLabel.MIGHT_BE_CONTENT.
KeepLargestBlockFilter Keeps only the largest TextBlock (measured by number of words).
SimpleBlockFusionProcessor Merges two consecutive blocks if their text densities are equal.
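
The following is a minimal, self-contained sketch of two of the heuristics summarized above: fusing blocks by proximity and keeping only the largest block. It does not use the boilerpipe API. The class HeuristicsSketch, its nested Block type, and the methods fuseByProximity and keepLargest are hypothetical stand-ins invented for illustration. The tie-breaking rule, the handling of the content flag on fused blocks, and the restriction to blocks already flagged as content are assumptions, not documented behavior of the classes in this package.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins only; none of these types are boilerpipe classes.
final class HeuristicsSketch {

    /** Hypothetical, simplified text block with block offsets and a content flag. */
    static final class Block {
        String text;
        int numWords;
        int start;          // index of the first original block covered
        int end;            // index of the last original block covered
        boolean isContent;

        Block(String text, int index, boolean isContent) {
            String t = text.trim();
            this.text = text;
            this.numWords = t.isEmpty() ? 0 : t.split("\\s+").length;
            this.start = index;
            this.end = index;
            this.isContent = isContent;
        }
    }

    /**
     * Proximity fusion: a block is merged into its predecessor when the distance
     * between them, measured in block indices, does not exceed maxDistance.
     * The merged block is flagged as content if either part was (an assumption).
     * Note: surviving Block objects from the input list are modified in place.
     */
    static List<Block> fuseByProximity(List<Block> blocks, int maxDistance) {
        List<Block> out = new ArrayList<>();
        for (Block b : blocks) {
            Block prev = out.isEmpty() ? null : out.get(out.size() - 1);
            if (prev != null && b.start - prev.end <= maxDistance) {
                prev.text = prev.text + "\n" + b.text;
                prev.numWords += b.numWords;
                prev.end = b.end;
                prev.isContent = prev.isContent || b.isContent;
            } else {
                out.add(b);
            }
        }
        return out;
    }

    /**
     * Keep-largest heuristic: among blocks flagged as content, only the one with
     * the most words stays flagged (ties go to the earliest block, an assumption).
     * Returns true if any flag changed.
     */
    static boolean keepLargest(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            if (b.isContent && (largest == null || b.numWords > largest.numWords)) {
                largest = b;
            }
        }
        boolean changed = false;
        for (Block b : blocks) {
            if (b.isContent && b != largest) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Breaking news headline", 0, true));
        blocks.add(new Block("The story continues with many more words here", 1, true));
        blocks.add(new Block("Copyright notice", 4, true));

        List<Block> fused = fuseByProximity(blocks, 1);
        keepLargest(fused);
        for (Block b : fused) {
            System.out.println(b.numWords + " words, content=" + b.isContent);
        }
    }
}
```

In this sketch the first two blocks are one index apart and are fused, while the third is three indices away and stays separate. The keep-largest step then leaves only the fused block flagged as content.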
 
