Class | Summary |
---|---|
BlockProximityFusion | Fuses adjacent blocks if their distance (in blocks) does not exceed a certain limit. |
DocumentTitleMatchClassifier | Marks TextBlocks that contain parts of the HTML <TITLE> tag, using heuristics that are quite specific to the news domain. |
ExpandTitleToContentFilter | Marks as "content" all TextBlocks that lie between the headline and the part that has already been marked content, provided they are labeled DefaultLabels.MIGHT_BE_CONTENT. |
KeepLargestBlockFilter | Keeps the largest TextBlock only (by the number of words). |
SimpleBlockFusionProcessor | Merges two subsequent blocks if their text densities are equal. |
The BoilerpipeFilters in this package are pure heuristics.
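
To make two of these heuristics concrete, the following is a minimal, self-contained sketch of the ideas behind KeepLargestBlockFilter and SimpleBlockFusionProcessor. It does not use the boilerpipe classes: `HeuristicsSketch`, `Block`, `keepLargest` and `fuseEqualDensity` are hypothetical names introduced only for illustration. It also assumes that only blocks already marked as content compete for "largest", and that a fused block keeps the shared density value; the actual filters may differ in both respects.

```java
import java.util.ArrayList;
import java.util.List;

final class HeuristicsSketch {

    /** Hypothetical stand-in for a text block; NOT boilerpipe's TextBlock. */
    static final class Block {
        final String text;
        final int numWords;
        final double textDensity;
        final boolean isContent;

        Block(String text, int numWords, double textDensity, boolean isContent) {
            this.text = text;
            this.numWords = numWords;
            this.textDensity = textDensity;
            this.isContent = isContent;
        }
    }

    /**
     * Keep-largest heuristic: among blocks marked as content (an assumption of
     * this sketch), only the one with the most words stays marked as content.
     */
    static List<Block> keepLargest(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            if (b.isContent && (largest == null || b.numWords > largest.numWords)) {
                largest = b;
            }
        }
        List<Block> out = new ArrayList<>();
        for (Block b : blocks) {
            out.add(new Block(b.text, b.numWords, b.textDensity, b == largest));
        }
        return out;
    }

    /**
     * Simple fusion heuristic: a block is merged into its predecessor when both
     * have exactly the same text density. The merged block keeps that density
     * and is content if either part was (both are assumptions of this sketch).
     */
    static List<Block> fuseEqualDensity(List<Block> blocks) {
        List<Block> out = new ArrayList<>();
        for (Block b : blocks) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).textDensity == b.textDensity) {
                Block p = out.get(last);
                out.set(last, new Block(p.text + "\n" + b.text,
                        p.numWords + b.numWords,
                        p.textDensity,
                        p.isContent || b.isContent));
            } else {
                out.add(b);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Home | About", 3, 1.0, false));
        blocks.add(new Block("First paragraph", 10, 5.0, true));
        blocks.add(new Block("Second paragraph", 20, 5.0, true));

        // The two paragraphs share a density, so they are fused into one block.
        for (Block b : fuseEqualDensity(blocks)) {
            System.out.println(b.numWords + " words, content=" + b.isContent);
        }
    }
}
```

Both methods return new lists rather than modifying their input, which keeps the sketch easy to test; whether the real filters work in place is not stated on this page.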