de.l3s.boilerpipe.extractors
Class ArticleExtractor

java.lang.Object
  extended by de.l3s.boilerpipe.extractors.ExtractorBase
      extended by de.l3s.boilerpipe.extractors.ArticleExtractor
All Implemented Interfaces:
BoilerpipeExtractor, BoilerpipeFilter

public final class ArticleExtractor
extends ExtractorBase

A full-text extractor tuned towards news articles. On news-article input it achieves higher accuracy than DefaultExtractor.

Author:
Christian Kohlschütter

Field Summary
static ArticleExtractor INSTANCE
           
 
Constructor Summary
ArticleExtractor()
           
 
Method Summary
static ArticleExtractor getInstance()
          Returns the singleton instance for ArticleExtractor.
 boolean process(TextDocument doc)
          Processes the given document doc.
 
Methods inherited from class de.l3s.boilerpipe.extractors.ExtractorBase
getText, getText, getText, getText, getText
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final ArticleExtractor INSTANCE

Constructor Detail

ArticleExtractor

public ArticleExtractor()

Method Detail

getInstance

public static ArticleExtractor getInstance()
Returns the singleton instance for ArticleExtractor.
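
The singleton access described above can be illustrated with a minimal, self-contained sketch. ArticleExtractorSketch is a hypothetical stand-in, not the library class. The sketch assumes that getInstance() simply returns the shared INSTANCE field, which this page suggests but does not state explicitly.

```java
// Hypothetical stand-in for illustration; not the boilerpipe class itself.
final class ArticleExtractorSketch {
    // Shared, eagerly created instance, mirroring the documented INSTANCE field.
    static final ArticleExtractorSketch INSTANCE = new ArticleExtractorSketch();

    // Left accessible because the documented class also exposes a constructor.
    ArticleExtractorSketch() {}

    // Assumption: the accessor hands back the shared instance.
    static ArticleExtractorSketch getInstance() {
        return INSTANCE;
    }

    public static void main(String[] args) {
        ArticleExtractorSketch a = ArticleExtractorSketch.getInstance();
        ArticleExtractorSketch b = ArticleExtractorSketch.INSTANCE;
        System.out.println(a == b); // prints "true": both names refer to one object
    }
}
```

In real code the shared instance would typically be handed to one of the inherited getText overloads listed under ExtractorBase; their signatures are not shown on this page.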


process

public boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.

Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException
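
The return-value contract above (true only when the document was changed) can be sketched with stand-in types. Doc, Filter, and DropShortBlocks are hypothetical names introduced for illustration; they are not boilerpipe's TextDocument or BoilerpipeFilter. The filtering rule, which drops blocks below a word threshold, is an assumption chosen to keep the example small and is not the heuristic ArticleExtractor applies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a document made of text blocks.
final class Doc {
    final List<String> blocks = new ArrayList<>();
}

// Hypothetical stand-in for a filter that follows the documented contract.
interface Filter {
    // Returns true if and only if the document was changed.
    boolean process(Doc doc);
}

// Illustrative rule: remove blocks with fewer than minWords words.
final class DropShortBlocks implements Filter {
    private final int minWords;

    DropShortBlocks(int minWords) {
        this.minWords = minWords;
    }

    @Override
    public boolean process(Doc doc) {
        // removeIf reports whether at least one element was removed.
        return doc.blocks.removeIf(b -> b.trim().split("\\s+").length < minWords);
    }
}

final class ProcessContractDemo {
    public static void main(String[] args) {
        Doc doc = new Doc();
        doc.blocks.add("Login");
        doc.blocks.add("The council approved the new budget on Tuesday.");

        Filter filter = new DropShortBlocks(4);
        boolean firstPass = filter.process(doc);  // the short block is removed
        boolean secondPass = filter.process(doc); // nothing is left to remove
        System.out.println(firstPass + " " + secondPass);
    }
}
```

A caller can use the boolean to decide whether a further pass is worthwhile: a second call on an already filtered document reports no changes.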