de.l3s.boilerpipe
Interface BoilerpipeFilter

All Known Subinterfaces:
BoilerpipeExtractor
All Known Implementing Classes:
AddPrecedingLabelsFilter, ArticleExtractor, ArticleMetadataFilter, ArticleSentencesExtractor, BlockProximityFusion, BoilerplateBlockFilter, CanolaExtractor, ContentFusion, DefaultExtractor, DensityRulesClassifier, DocumentTitleMatchClassifier, ExpandTitleToContentFilter, ExtractorBase, IgnoreBlocksAfterContentFilter, IgnoreBlocksAfterContentFromEndFilter, InvertedFilter, KeepEverythingExtractor, KeepEverythingWithMinKWordsExtractor, KeepLargestBlockFilter, KeepLargestFulltextBlockFilter, LabelFusion, LabelToBoilerplateFilter, LabelToContentFilter, LargestContentExtractor, MarkEverythingContentFilter, MinClauseWordsFilter, MinFulltextWordsFilter, MinWordsFilter, NumWordsRulesClassifier, NumWordsRulesExtractor, SimpleBlockFusionProcessor, SplitParagraphBlocksFilter, SurroundingToContentFilter, TerminatingBlocksFinder

public interface BoilerpipeFilter

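A generic BoilerpipeFilter. An implementation takes a TextDocument and processes it in place; the implementing classes listed above suggest typical tasks such as classifying, fusing, or filtering the document's text blocks.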

Author:
Christian Kohlschütter

Method Summary
 boolean process(TextDocument doc)
          Processes the given document doc.
 

Method Detail

process

boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Processes the given document doc.

Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument, false otherwise.
Throws:
BoilerpipeProcessingException - if processing the document fails.
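
Example (sketch):
The contract above can be illustrated with a minimal, self-contained sketch. So that it compiles without boilerpipe on the classpath, it declares its own stand-ins for TextDocument, BoilerpipeFilter, and BoilerpipeProcessingException; these are simplified assumptions, not the library's real classes (the stand-in TextDocument is only a list of strings). DropShortBlocks and runAll are hypothetical names chosen for illustration. The sketch shows a filter that returns true only when it actually modifies the document, and one possible way to combine the results of several filters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterContractSketch {

    // Stand-in for de.l3s.boilerpipe.BoilerpipeProcessingException (assumption).
    static class BoilerpipeProcessingException extends Exception {
        BoilerpipeProcessingException(String message) {
            super(message);
        }
    }

    // Stand-in for the real TextDocument: here just a mutable list of text blocks.
    static class TextDocument {
        final List<String> blocks = new ArrayList<>();

        TextDocument(String... blocks) {
            this.blocks.addAll(Arrays.asList(blocks));
        }
    }

    // Same method shape as the documented interface.
    interface BoilerpipeFilter {
        boolean process(TextDocument doc) throws BoilerpipeProcessingException;
    }

    // Hypothetical filter: drops blocks with fewer than minWords words.
    static class DropShortBlocks implements BoilerpipeFilter {
        private final int minWords;

        DropShortBlocks(int minWords) {
            this.minWords = minWords;
        }

        @Override
        public boolean process(TextDocument doc) throws BoilerpipeProcessingException {
            if (doc == null) {
                throw new BoilerpipeProcessingException("doc must not be null");
            }
            // removeIf returns true only if at least one block was removed,
            // which matches the "true if changes have been made" contract.
            return doc.blocks.removeIf(b -> b.trim().split("\\s+").length < minWords);
        }
    }

    // Runs every filter in order and reports whether any of them changed the document.
    static boolean runAll(TextDocument doc, List<BoilerpipeFilter> filters)
            throws BoilerpipeProcessingException {
        boolean changed = false;
        for (BoilerpipeFilter filter : filters) {
            // Compound assignment always evaluates process(), so every filter runs.
            changed |= filter.process(doc);
        }
        return changed;
    }

    public static void main(String[] args) throws BoilerpipeProcessingException {
        TextDocument doc = new TextDocument("Home", "A longer paragraph of article text");
        List<BoilerpipeFilter> filters = Arrays.asList(new DropShortBlocks(3));
        System.out.println(runAll(doc, filters)); // first pass removes "Home"
        System.out.println(runAll(doc, filters)); // second pass finds nothing to remove
    }
}
```

Because process reports whether it changed anything, a caller can detect when a second pass over the same document is a no-op, as the two calls in main do.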