net.sf.eos.hadoop.mapred.cooccurrence
Class DictionaryBasedEntityIdKeyGenerator
java.lang.Object
net.sf.eos.config.Configured
net.sf.eos.hadoop.mapred.cooccurrence.DictionaryBasedEntityIdKeyGenerator
- All Implemented Interfaces:
- Configurable
public class DictionaryBasedEntityIdKeyGenerator
extends Configured
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DictionaryBasedEntityIdKeyGenerator
public DictionaryBasedEntityIdKeyGenerator()
createKeysForDocument
public Map<Text,EosDocument> createKeysForDocument(EosDocument doc)
throws EosException
- Throws:
EosException
getDictionaryBasedEntityRecognizerForText
protected DictionaryBasedEntityRecognizer getDictionaryBasedEntityRecognizerForText(CharSequence text)
- Creates a new DictionaryBasedEntityRecognizer for the given text. Uses the factory method AbstractDictionaryBasedEntityRecognizer.newInstance(net.sf.eos.analyzer.Tokenizer, Configuration) to create the instance, with getTokenizer() supplying the token source.
- Parameters:
text - the text to tokenize
- Returns:
- a new DictionaryBasedEntityRecognizer instance
getTokenizer
protected ResettableTokenizer getTokenizer()
throws TokenizerException
- Returns a Tokenizer as the source for the recognizer.
- Returns:
- the source for the recognizer
- Throws:
TokenizerException - if an error occurs
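The relationship between getDictionaryBasedEntityRecognizerForText and getTokenizer described above (a factory call whose token source comes from an overridable protected method) can be illustrated with a minimal, self-contained sketch. Every type in it (SketchTokenizer, SketchRecognizer, SketchKeyGenerator, LowerCasingKeyGenerator) is a hypothetical stand-in written for this illustration and is not part of the net.sf.eos API; the real tokenizer, recognizer and Configuration types may behave differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class KeyGeneratorSketch {

    /** Hypothetical stand-in for a tokenizer; not the real net.sf.eos type. */
    interface SketchTokenizer {
        List<String> tokenize(CharSequence text);
    }

    /** Hypothetical stand-in for a recognizer that is built from a tokenizer. */
    static final class SketchRecognizer {
        private final List<String> tokens;

        private SketchRecognizer(List<String> tokens) {
            this.tokens = tokens;
        }

        /** Static factory, mirroring the idea of a newInstance(...) call. */
        static SketchRecognizer newInstance(SketchTokenizer tokenizer, CharSequence text) {
            return new SketchRecognizer(tokenizer.tokenize(text));
        }

        List<String> tokens() {
            return tokens;
        }
    }

    /** Hypothetical generator: the recognizer factory asks getTokenizer() for its source. */
    static class SketchKeyGenerator {

        /** Overridable hook that supplies the token source (whitespace splitting here). */
        protected SketchTokenizer getTokenizer() {
            return text -> {
                List<String> out = new ArrayList<>();
                for (String token : text.toString().trim().split("\\s+")) {
                    if (!token.isEmpty()) {
                        out.add(token);
                    }
                }
                return out;
            };
        }

        /** Builds a recognizer for the text through the factory method. */
        protected SketchRecognizer getRecognizerForText(CharSequence text) {
            return SketchRecognizer.newInstance(getTokenizer(), text);
        }
    }

    /** Subclass that changes only the token source, not the factory logic. */
    static class LowerCasingKeyGenerator extends SketchKeyGenerator {
        @Override
        protected SketchTokenizer getTokenizer() {
            SketchTokenizer base = super.getTokenizer();
            return text -> base.tokenize(text.toString().toLowerCase(Locale.ROOT));
        }
    }

    /** Convenience entry point: the tokens the generator's recognizer sees for a text. */
    public static List<String> tokensFor(SketchKeyGenerator generator, CharSequence text) {
        return generator.getRecognizerForText(text).tokens();
    }

    public static void main(String[] args) {
        System.out.println(tokensFor(new SketchKeyGenerator(), "Aspirin Dose"));
        System.out.println(tokensFor(new LowerCasingKeyGenerator(), "Aspirin Dose"));
    }
}
```

In this sketch a subclass customizes tokenization by overriding only the protected hook, which is presumably why both methods are declared protected in the documented class; that reading is an inference from the signatures, not something the documentation states.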
getTrie
public Trie<CharSequence,Set<CharSequence>> getTrie()
setTrie
public void setTrie(Trie<CharSequence,Set<CharSequence>> trie)
Copyright © 2008. All Rights Reserved.