uk.ac.shef.dcs.oak.jate.util.control
Class Normalizer

java.lang.Object
  extended by uk.ac.shef.dcs.oak.jate.util.control.Normalizer
Direct Known Subclasses:
Lemmatizer

public abstract class Normalizer
extends java.lang.Object

Normalizer returns text units to their canonical forms.


Constructor Summary
Normalizer()
           
 
Method Summary
abstract  java.lang.String normalize(java.lang.String unit)
          Normalise only the right-hand-side (RHS) head word, i.e. the rightmost word, of the input text unit.
abstract  java.lang.String normalizeContent(java.lang.String content)
          Normalise every token found in the input content, assuming tokens are delimited by a whitespace character.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Normalizer

public Normalizer()

Method Detail

normalize

public abstract java.lang.String normalize(java.lang.String unit)
Normalise only the right-hand-side (RHS) head word, i.e. the rightmost word, of the input text unit.

Parameters:
unit - the variant form of a single text unit, e.g., a word or a phrase
Returns:
the normalised canonical form of the input
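
A minimal sketch of the head-word contract described above, under stated assumptions. HeadWordNormalizerSketch and canonicalWord are hypothetical names; the class does not extend the real Normalizer, so it compiles without JATE on the classpath. The plural-stripping rule is a deliberately naive placeholder and is not the behaviour of Lemmatizer. The sketch also assumes that the words before the head are returned unchanged, which the documentation above does not state explicitly.

```java
import java.util.Locale;

// Hypothetical stand-in for a concrete Normalizer subclass.
class HeadWordNormalizerSketch {

    // Toy canonicalisation rule (an assumption, not JATE's lemmatisation):
    // lower-case the word and drop one trailing "s" from longer words.
    static String canonicalWord(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Normalise only the rightmost word of the unit; earlier words are kept as-is.
    public String normalize(String unit) {
        String trimmed = unit.trim();
        int lastSpace = trimmed.lastIndexOf(' ');
        if (lastSpace < 0) {
            // Single-word unit: the word is its own head.
            return canonicalWord(trimmed);
        }
        String modifiers = trimmed.substring(0, lastSpace);
        String head = trimmed.substring(lastSpace + 1);
        return modifiers + " " + canonicalWord(head);
    }
}
```

With this toy rule, normalize("Hidden Markov Models") yields "Hidden Markov model": only the final word is changed.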

normalizeContent

public abstract java.lang.String normalizeContent(java.lang.String content)
Normalise every token found in the input content, assuming tokens are delimited by a whitespace character.

Parameters:
content - the text to normalise, with tokens delimited by whitespace
Returns:
the content with every token normalised
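
A minimal sketch of token-by-token normalisation as described for normalizeContent, under stated assumptions. ContentNormalizerSketch and canonicalWord are hypothetical names; the class does not extend the real Normalizer, so it compiles on its own. The per-token rule is a naive placeholder, and joining the output with single spaces is a choice made for this sketch rather than documented behaviour.

```java
import java.util.Locale;
import java.util.StringJoiner;

// Hypothetical stand-in for a concrete Normalizer subclass.
class ContentNormalizerSketch {

    // Toy per-token rule (an assumption, not JATE's lemmatisation):
    // lower-case the token and drop one trailing "s" from longer tokens.
    static String canonicalWord(String token) {
        String w = token.toLowerCase(Locale.ROOT);
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    // Split on runs of whitespace, normalise every token, and rejoin
    // with single spaces (so repeated whitespace is collapsed).
    public String normalizeContent(String content) {
        StringJoiner out = new StringJoiner(" ");
        for (String token : content.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(canonicalWord(token));
            }
        }
        return out.toString();
    }
}
```

Unlike normalize, every token is rewritten here: normalizeContent("Terms and  Units") yields "term and unit" under this toy rule.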