de.l3s.boilerpipe.document
Class TextDocumentStatistics

java.lang.Object
  extended by de.l3s.boilerpipe.document.TextDocumentStatistics

public final class TextDocumentStatistics
extends java.lang.Object

Provides shallow statistics on a given TextDocument.

Author:
Christian Kohlschuetter

Constructor Summary
TextDocumentStatistics(TextDocument doc, boolean contentOnly)
          Computes statistics on a given TextDocument.
 
Method Summary
 float avgNumWords()
          Returns the average number of words at block-level (= overall number of words divided by the number of blocks).
 int getNumWords()
          Returns the overall number of words in all blocks.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextDocumentStatistics

public TextDocumentStatistics(TextDocument doc,
                              boolean contentOnly)
Computes statistics on a given TextDocument.

Parameters:
doc - The TextDocument.
contentOnly - if true then o

Method Detail

avgNumWords

public float avgNumWords()
Returns the average number of words at block-level (= overall number of words divided by the number of blocks).

Returns:
The average number of words per block.

getNumWords

public int getNumWords()
Returns the overall number of words in all blocks.

Returns:
The overall number of words in all blocks.
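
Example (illustrative sketch):
The relationship between the two methods (average = overall number of words divided by the number of blocks) can be sketched with a small stand-in class. BlockWordStats and its list-of-counts constructor are hypothetical and not part of boilerpipe; the sketch only mirrors the arithmetic described above and makes no claim about how TextDocumentStatistics is implemented or how contentOnly filters blocks.

```java
import java.util.List;

/**
 * Hypothetical stand-in used only to illustrate the documented arithmetic.
 * It is NOT the boilerpipe class and takes plain word counts instead of a
 * TextDocument.
 */
final class BlockWordStats {
    private int numWords = 0;
    private int numBlocks = 0;

    /** @param wordsPerBlock the word count of each block (illustrative input) */
    BlockWordStats(List<Integer> wordsPerBlock) {
        for (int words : wordsPerBlock) {
            numWords += words;   // running total over all blocks
            numBlocks++;         // number of blocks seen
        }
    }

    /** Overall number of words in all blocks. */
    int getNumWords() {
        return numWords;
    }

    /**
     * Overall number of words divided by the number of blocks.
     * In this sketch, zero blocks yields NaN (0 / 0.0f).
     */
    float avgNumWords() {
        return numWords / (float) numBlocks;
    }

    public static void main(String[] args) {
        // Three blocks with 12, 3 and 45 words.
        BlockWordStats stats = new BlockWordStats(List.of(12, 3, 45));
        System.out.println(stats.getNumWords()); // prints 60
        System.out.println(stats.avgNumWords()); // prints 20.0
    }
}
```

The cast to float before dividing avoids integer division, which would otherwise truncate averages such as 7 words over 2 blocks to 3.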