Class | Summary |
---|---|
ArticleExtractor | A full-text extractor which is tuned towards news articles. |
ArticleSentencesExtractor | A full-text extractor which is tuned towards extracting sentences from news articles. |
DefaultExtractor | A quite generic full-text extractor. |
ExtractorBase | The base class of Extractors. |
KeepEverythingExtractor | Marks everything as content. |
KeepEverythingWithMinKWordsExtractor | A full-text extractor which marks every block with at least k words as content. |
LargestContentExtractor | A full-text extractor which extracts the largest text component of a page. |
NumWordsRulesExtractor | A quite generic full-text extractor solely based upon the number of words per block (the current, the previous and the next block). |
This package contains some standard extractors, i.e., complete pipelines of BoilerpipeFilters that are applied one after another to a document.
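
The idea of an extractor as a fixed pipeline of filters can be illustrated with a minimal, self-contained sketch. Everything in it (`PipedExtractorSketch`, `Block`, `Filter`, `MARK_EVERYTHING`, `minWords`, `extract`) is a hypothetical stand-in written for this illustration and is not the library's actual API; the word-count rule is only assumed to resemble what a "keep everything with at least k words" extractor might do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PipedExtractorSketch {

    /** Hypothetical text block: a piece of page text plus a content flag. */
    public static final class Block {
        final String text;
        boolean isContent;

        Block(String text) {
            this.text = text;
        }

        /** Counts whitespace-separated tokens in the block. */
        int numWords() {
            String t = text.trim();
            return t.isEmpty() ? 0 : t.split("\\s+").length;
        }
    }

    /** Hypothetical filter: relabels blocks and reports whether anything changed. */
    public interface Filter {
        boolean process(List<Block> doc);
    }

    /** Marks every block as content. */
    public static final Filter MARK_EVERYTHING = doc -> {
        boolean changed = false;
        for (Block b : doc) {
            if (!b.isContent) {
                b.isContent = true;
                changed = true;
            }
        }
        return changed;
    };

    /** Unmarks content blocks that have fewer than k words. */
    public static Filter minWords(int k) {
        return doc -> {
            boolean changed = false;
            for (Block b : doc) {
                if (b.isContent && b.numWords() < k) {
                    b.isContent = false;
                    changed = true;
                }
            }
            return changed;
        };
    }

    /** An "extractor" in this sketch: run the filters in order, then join the content blocks. */
    public static String extract(List<String> rawBlocks, Filter... pipeline) {
        List<Block> doc = new ArrayList<>();
        for (String s : rawBlocks) {
            doc.add(new Block(s));
        }
        for (Filter f : pipeline) {
            f.process(doc);
        }
        StringBuilder sb = new StringBuilder();
        for (Block b : doc) {
            if (b.isContent) {
                sb.append(b.text).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> page = Arrays.asList(
            "Home About",
            "The committee approved the new budget after a long debate on Tuesday evening.",
            "Share this");
        // Pipeline: mark everything as content, then drop blocks with fewer than 5 words.
        System.out.print(extract(page, MARK_EVERYTHING, minWords(5)));
    }
}
```

With this pipeline, the two short navigation-like blocks are dropped and only the sentence block is printed. Passing `MARK_EVERYTHING` alone keeps all three blocks, which mirrors the difference between a keep-everything extractor and one with a minimum word count.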