java.lang.Object
  net.sf.eos.analyzer.TokenFilter
      net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer
public abstract class AbstractDictionaryBasedEntityRecognizer
An implementation of an EntityRecognizer that identifies entities
in a text. An entity may be represented by an ID. The ID is a bracket around
a collection of literal entity terms or phrases. The ID is represented by the
value of a Map entry. The entity literal is the key of that entry.
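The map layout described above can be sketched as follows. This is a minimal illustration that uses only `java.util`; the class name `EntityMapSketch`, the helper `addLiteral`, and the literals and IDs are hypothetical and do not come from the library.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class EntityMapSketch {

    /** Registers one literal for an ID; a literal may belong to several IDs. */
    public static void addLiteral(Map<CharSequence, Set<CharSequence>> entities,
                                  String literal, String id) {
        entities.computeIfAbsent(literal, k -> new LinkedHashSet<>()).add(id);
    }

    /** Builds a small example map: key = entity literal, value = entity IDs. */
    public static Map<CharSequence, Set<CharSequence>> buildExample() {
        Map<CharSequence, Set<CharSequence>> entities = new HashMap<>();
        // Two literals bracketed by the same (hypothetical) ID.
        addLiteral(entities, "New York", "ID-1");
        addLiteral(entities, "NYC", "ID-1");
        // One ambiguous literal that carries two (hypothetical) IDs.
        addLiteral(entities, "Jaguar", "ID-2");
        addLiteral(entities, "Jaguar", "ID-3");
        return entities;
    }

    public static void main(String[] args) {
        Map<CharSequence, Set<CharSequence>> entities = buildExample();
        System.out.println(entities.get("NYC"));
        System.out.println(entities.get("Jaguar"));
    }
}
```

The keys here are all `String` instances. `CharSequence` itself does not define `equals` or `hashCode`, so mixing different `CharSequence` implementations as keys of one hash map is likely to produce missed lookups.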
Field Summary | |
---|---|
static String | ABSTRACT_DICTIONARY_BASED_ENTITY_RECOGNIZER_IMPL_CONFIG_NAME: The configuration key name for the classname of the factory. |
static String | MAX_TOKEN_CONFIG_NAME: Key for the maximum token count. |
Fields inherited from interface net.sf.eos.entity.DictionaryBasedEntityRecognizer |
---|
ENTITY_ID_KEY |
Constructor Summary | |
---|---|
AbstractDictionaryBasedEntityRecognizer(Tokenizer source)
|
Method Summary | |
---|---|
void | configure(Configuration config): Set the configuration to be used by this object. |
protected Configuration | getConfiguration(): Returns the configuration. |
Map<CharSequence,Set<CharSequence>> | getEntityMap(): Return the entity map. |
int | getMaxToken() |
TextBuilder | getTextBuilder(): Returns the builder that has been set. |
static DictionaryBasedEntityRecognizer | newInstance(Tokenizer source): Creates a new instance of the recognizer. |
static DictionaryBasedEntityRecognizer | newInstance(Tokenizer source, Configuration config): Creates a new instance of the recognizer. |
void | setEntityMap(Map<CharSequence,Set<CharSequence>> entities): Set the entity map. |
void | setMaxToken(int maxToken) |
void | setTextBuilder(TextBuilder builder): Sets a builder. |
Methods inherited from class net.sf.eos.analyzer.TokenFilter |
---|
getSource, next |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
@ConfigurationKey(type=CLASSNAME, description="Implementations of a EntityRecognizer to identify entities in a text.")
public static final String ABSTRACT_DICTIONARY_BASED_ENTITY_RECOGNIZER_IMPL_CONFIG_NAME
See Also: newInstance(Tokenizer, Configuration), newInstance(Tokenizer), Constant Field Values
@ConfigurationKey(type=INTEGER, defaultValue="5", description="The maximum token count for indentifying.")
public static final String MAX_TOKEN_CONFIG_NAME
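This page does not describe how the maximum token count is applied. One plausible reading, suggested by the name of the default implementation (SimpleLongestMatchDictionaryBasedEntityRecognizer), is a longest-match lookup that tries phrases of at most `maxToken` tokens. The sketch below illustrates that technique under this assumption. It is not the library's code: `LongestMatchSketch`, `recognize`, the space-joined phrase building, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LongestMatchSketch {

    /**
     * Scans tokens left to right. At each position it tries the longest
     * window first (at most maxToken tokens) and shrinks the window until
     * the space-joined phrase is a key of the entity map.
     */
    public static List<String> recognize(List<String> tokens,
                                         Map<String, Set<String>> entities,
                                         int maxToken) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            int limit = Math.min(maxToken, tokens.size() - i);
            for (int len = limit; len >= 1; len--) {
                String phrase = String.join(" ", tokens.subList(i, i + len));
                Set<String> ids = entities.get(phrase);
                if (ids != null) {
                    result.add(phrase + " -> " + ids);
                    matched = len;
                    break;
                }
            }
            // Skip the matched tokens, or advance by one if nothing matched.
            i += Math.max(matched, 1);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> entities = new HashMap<>();
        entities.put("New York", Set.of("ID-1"));
        entities.put("New York City", Set.of("ID-2"));
        List<String> tokens = List.of("I", "love", "New", "York", "City");
        System.out.println(recognize(tokens, entities, 5)); // [New York City -> [ID-2]]
        System.out.println(recognize(tokens, entities, 2)); // [New York -> [ID-1]]
    }
}
```

With a window of 5 the three-token phrase wins. With a window of 2 only the shorter literal can match, which shows why the limit affects both recall and the cost of each lookup.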
Constructor Detail |
---|
public AbstractDictionaryBasedEntityRecognizer(Tokenizer source)
Method Detail |
---|
public void setEntityMap(Map<CharSequence,Set<CharSequence>> entities)
Description copied from interface: DictionaryBasedEntityRecognizer
Set the entity map.
Specified by: setEntityMap in interface DictionaryBasedEntityRecognizer
Parameters: entities - the entity map
See Also: Trie
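public Map<CharSequence,Set<CharSequence>> getEntityMap()
Description copied from interface: DictionaryBasedEntityRecognizer
Return the entity map.
Specified by: getEntityMap in interface DictionaryBasedEntityRecognizer
Returns: the entity map, or null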
public void setTextBuilder(TextBuilder builder)
Description copied from interface: DictionaryBasedEntityRecognizer
Sets a builder. TextBuilder.SPACE_BUILDER is set at construction time.
Specified by: setTextBuilder in interface DictionaryBasedEntityRecognizer
Parameters: builder - a builder to set, or null
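public TextBuilder getTextBuilder()
Description copied from interface: DictionaryBasedEntityRecognizer
Returns the builder that has been set.
Specified by: getTextBuilder in interface DictionaryBasedEntityRecognizer
Returns: the builder, or null.
public int getMaxToken()
Specified by: getMaxToken in interface DictionaryBasedEntityRecognizer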
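public void setMaxToken(int maxToken)
Specified by: setMaxToken in interface DictionaryBasedEntityRecognizer
Parameters: maxToken - the maxToken to set
public void configure(Configuration config)
Description copied from interface: Configurable
Set the configuration to be used by this object.
Specified by: configure in interface Configurable
Parameters: config - the configuration
protected final Configuration getConfiguration()
Returns the configuration.
Returns: the configuration, or null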
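The page documents a default of "5" for MAX_TOKEN_CONFIG_NAME but does not show how configure(Configuration) reads it. The sketch below shows one way such an integer setting could be resolved with a fallback. It is an assumption-laden illustration: a plain `Map<String, String>` stands in for Configuration, and `MaxTokenConfigSketch`, `readMaxToken`, and the key string `example.maxToken` are hypothetical (the real constant value is not shown on this page).

```java
import java.util.HashMap;
import java.util.Map;

public class MaxTokenConfigSketch {

    /** Hypothetical key string; the real value of MAX_TOKEN_CONFIG_NAME is not shown here. */
    public static final String MAX_TOKEN_KEY = "example.maxToken";

    /** Default taken from the @ConfigurationKey annotation (defaultValue="5"). */
    public static final int DEFAULT_MAX_TOKEN = 5;

    /** Returns the configured maximum token count, or the default if the key is absent. */
    public static int readMaxToken(Map<String, String> config) {
        String raw = config.get(MAX_TOKEN_KEY);
        if (raw == null) {
            return DEFAULT_MAX_TOKEN;
        }
        int value = Integer.parseInt(raw.trim());
        if (value < 1) {
            throw new IllegalArgumentException("maxToken must be at least 1: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        System.out.println(readMaxToken(config)); // 5
        config.put(MAX_TOKEN_KEY, "3");
        System.out.println(readMaxToken(config)); // 3
    }
}
```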
@FactoryMethod(key="net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer.impl", implementation=SimpleLongestMatchDictionaryBasedEntityRecognizer.class)
public static final DictionaryBasedEntityRecognizer newInstance(Tokenizer source) throws EosException
Creates a new instance of the recognizer. The default implementation is SimpleLongestMatchDictionaryBasedEntityRecognizer.
Parameters: source - a source tokenizer
Throws: EosException - if it is not possible to instantiate an instance
@FactoryMethod(key="net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer.impl", implementation=SimpleLongestMatchDictionaryBasedEntityRecognizer.class)
public static final DictionaryBasedEntityRecognizer newInstance(Tokenizer source, Configuration config) throws EosException
Creates a new instance of the recognizer. If the Configuration contains the key ABSTRACT_DICTIONARY_BASED_ENTITY_RECOGNIZER_IMPL_CONFIG_NAME, the class named in its value is instantiated. SimpleLongestMatchDictionaryBasedEntityRecognizer is instantiated if no value is set.
Parameters: source - a source tokenizer
config - the configuration
Throws: EosException - if it is not possible to instantiate an instance
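The lookup-with-fallback behavior described for newInstance(Tokenizer, Configuration) can be illustrated with a small reflective factory. This is a sketch under stated assumptions, not the library's implementation: a `Map<String, String>` stands in for Configuration, the nested types `Recognizer`, `DefaultRecognizer`, and `CustomRecognizer` are hypothetical stand-ins, the Tokenizer argument is omitted, and `IllegalStateException` replaces EosException. Only the key string is taken from the @FactoryMethod annotation shown above.

```java
import java.util.HashMap;
import java.util.Map;

public class FactorySketch {

    /** Stand-in for the recognizer interface (hypothetical). */
    public interface Recognizer { }

    /** Stand-in for the default implementation (hypothetical). */
    public static class DefaultRecognizer implements Recognizer { }

    /** Stand-in for a custom implementation named in the configuration (hypothetical). */
    public static class CustomRecognizer implements Recognizer { }

    /** Key string as it appears in the @FactoryMethod annotation. */
    public static final String IMPL_KEY =
            "net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer.impl";

    /** Instantiates the configured class, or the default if no value is set. */
    public static Recognizer newInstance(Map<String, String> config) {
        String className =
                config.getOrDefault(IMPL_KEY, DefaultRecognizer.class.getName());
        try {
            Class<?> clazz = Class.forName(className);
            return (Recognizer) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // The real method reports failures as EosException.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        System.out.println(newInstance(config).getClass().getSimpleName());
        config.put(IMPL_KEY, CustomRecognizer.class.getName());
        System.out.println(newInstance(config).getClass().getSimpleName());
    }
}
```

Keeping the class name in configuration lets a deployment swap the recognizer without recompiling callers, at the cost of moving type errors from compile time to instantiation time.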