uk.ac.shef.dcs.oak.jate.core.algorithm
Class TermExFeatureWrapper
java.lang.Object
uk.ac.shef.dcs.oak.jate.core.algorithm.AbstractFeatureWrapper
uk.ac.shef.dcs.oak.jate.core.algorithm.TermExFeatureWrapper
public class TermExFeatureWrapper
- extends AbstractFeatureWrapper
TermExFeatureWrapper wraps three features: an instance of FeatureDocumentTermFrequency, which gives a candidate term's frequency
across the corpus, its frequency in each document, and the documents in which it occurs;
an instance of FeatureCorpusTermFrequency, which gives the frequency of individual words across the corpus;
and an instance of FeatureRefCorpusTermFrequency, which gives the frequency of individual words in a reference corpus.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
TermExFeatureWrapper
public TermExFeatureWrapper(FeatureDocumentTermFrequency termfreq,
FeatureCorpusTermFrequency wordfreq,
FeatureRefCorpusTermFrequency ref)
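- Constructs a wrapper around the three given features.
- Parameters:
termfreq - the FeatureDocumentTermFrequency holding candidate term frequencies over the corpus and per document
wordfreq - the FeatureCorpusTermFrequency holding individual word frequencies over the corpus
ref - the FeatureRefCorpusTermFrequency holding individual word frequencies in the reference corpus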
getTotalCorpusTermFreq
public int getTotalCorpusTermFreq()
- Returns:
- total number of occurrences of terms in the corpus
getTermFreq
public int getTermFreq(java.lang.String term)
- Parameters:
term - the candidate term
- Returns:
- the number of occurrences of a candidate term in the corpus
getTermFreqInDoc
public int getTermFreqInDoc(java.lang.String term,
int d)
- Parameters:
term - the candidate term
d - the document id
- Returns:
- the term's frequency in the document with id=d
getTermAppear
public int[] getTermAppear(java.lang.String t)
- Parameters:
t - the candidate term
- Returns:
- the ids of documents in which term t is found
getSumTermFreqInDocs
public int getSumTermFreqInDocs(java.lang.String term)
- Parameters:
term - the candidate term
- Returns:
- total number of occurrences of a term in the documents in which it is found
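
This page does not say how the total relates to getTermAppear and getTermFreqInDoc. One plausible reading is that it is the sum of the per-document frequencies over the documents in which the term appears. The sketch below illustrates that reading with a hypothetical in-memory class, DocTermCountsSketch, which is not part of JATE; its method names and the sorted order of the returned ids are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for per-document term counts; NOT a JATE class.
final class DocTermCountsSketch {
    // term -> (document id -> occurrences of the term in that document)
    private final Map<String, Map<Integer, Integer>> counts = new HashMap<>();

    // Records freq further occurrences of term in document docId.
    void add(String term, int docId, int freq) {
        counts.computeIfAbsent(term, k -> new HashMap<>()).merge(docId, freq, Integer::sum);
    }

    private Map<Integer, Integer> docsOf(String term) {
        return counts.getOrDefault(term, Collections.<Integer, Integer>emptyMap());
    }

    // Analogue of getTermAppear: ids of documents containing the term.
    // Sorting is an assumption made here for deterministic output.
    int[] termAppear(String term) {
        return docsOf(term).keySet().stream().mapToInt(Integer::intValue).sorted().toArray();
    }

    // Analogue of getTermFreqInDoc: 0 when the term is absent from the document.
    int termFreqInDoc(String term, int docId) {
        return docsOf(term).getOrDefault(docId, 0);
    }

    // Analogue of getSumTermFreqInDocs under the reading described above.
    int sumTermFreqInDocs(String term) {
        int sum = 0;
        for (int d : termAppear(term)) {
            sum += termFreqInDoc(term, d);
        }
        return sum;
    }

    public static void main(String[] args) {
        DocTermCountsSketch c = new DocTermCountsSketch();
        c.add("term extraction", 0, 2);
        c.add("term extraction", 3, 5);
        System.out.println(c.sumTermFreqInDocs("term extraction"));
    }
}
```

Under this reading the sum equals the term's corpus frequency whenever every occurrence is attributed to some document; whether JATE guarantees that is not stated here.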
getNormFreqInDoc
public double getNormFreqInDoc(java.lang.String t,
int d)
- Parameters:
t - the candidate term
d - the document id
- Returns:
- the normalised frequency of term t in the document with id=d, equal to the frequency of t in d divided by
the total term frequency in d.
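
The ratio described above can be sketched as follows. NormFreqSketch and its parameters are hypothetical names used only for illustration, and returning 0.0 for a document with no terms is an assumption; this page does not say how JATE handles that case.

```java
// Hypothetical illustration of the normalisation; NOT a JATE class.
final class NormFreqSketch {
    // termFreqInDoc: occurrences of term t in document d.
    // totalTermFreqInDoc: occurrences of all terms in document d.
    static double normFreqInDoc(int termFreqInDoc, int totalTermFreqInDoc) {
        if (totalTermFreqInDoc == 0) {
            return 0.0; // assumption: avoid division by zero for an empty document
        }
        // Cast before dividing; int / int would truncate towards zero.
        return (double) termFreqInDoc / totalTermFreqInDoc;
    }

    public static void main(String[] args) {
        System.out.println(normFreqInDoc(3, 12));
    }
}
```

The cast matters because the frequency getters on this page return int, so dividing their results directly would use integer division.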
getWordFreq
public int getWordFreq(java.lang.String word)
- Parameters:
word - the word
- Returns:
- the number of occurrences of a word in the corpus
getRefWordFreqNorm
public double getRefWordFreqNorm(java.lang.String word)
- Parameters:
word - the word
- Returns:
- the normalised frequency of a word in the reference corpus, equal to the frequency of the word divided by
the total frequency of all words in the reference corpus
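
As a rough illustration of this normalisation, the sketch below computes the ratio from a plain map of reference-corpus counts. RefFreqSketch and the map layout are hypothetical and not part of JATE. Treating a word missing from the reference corpus as frequency 0 is an assumption; the library may handle unseen words differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of reference-corpus normalisation; NOT a JATE class.
final class RefFreqSketch {
    // refCounts: word -> number of occurrences in the reference corpus.
    static double refWordFreqNorm(Map<String, Integer> refCounts, String word) {
        long total = 0;
        for (int count : refCounts.values()) {
            total += count;
        }
        if (total == 0) {
            return 0.0; // assumption: an empty reference corpus gives 0
        }
        // Assumption: a word unseen in the reference corpus counts as 0.
        return (double) refCounts.getOrDefault(word, 0) / total;
    }

    public static void main(String[] args) {
        Map<String, Integer> ref = new HashMap<>();
        ref.put("the", 6);
        ref.put("term", 2);
        System.out.println(refWordFreqNorm(ref, "term"));
    }
}
```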
getTerms
public java.util.Set<java.lang.String> getTerms()
- Specified by:
getTerms
in class AbstractFeatureWrapper
- Returns:
- set of candidate term strings