net.sf.eos.entity
Class SimpleLongestMatchDictionaryBasedEntityRecognizer

java.lang.Object
  extended by net.sf.eos.analyzer.TokenFilter
      extended by net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer
          extended by net.sf.eos.entity.SimpleLongestMatchDictionaryBasedEntityRecognizer
All Implemented Interfaces:
Tokenizer, Configurable, DictionaryBasedEntityRecognizer, EntityRecognizer

public class SimpleLongestMatchDictionaryBasedEntityRecognizer
extends AbstractDictionaryBasedEntityRecognizer

A simple longest-match recognizer for named entities. The implementation passes Token sequences of up to a defined maximum length (see getMaxToken()) through the recognizer. If a token combination matches a key in the entity map, a new Token of type EntityRecognizer.ENTITY_TYPE is created and returned by next().
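
The longest-match idea described above can be illustrated with a minimal, self-contained sketch. It is not this class's actual implementation and does not use the net.sf.eos API: the class name LongestMatchSketch, the method recognize, the plain String tokens, the Map<String, String> dictionary, the single-space join, and the "ENTITY[...]" output format are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and types are hypothetical, not part of net.sf.eos. */
public class LongestMatchSketch {

    /**
     * Scans the tokens left to right. At each position it tries the longest
     * window first (at most maxToken tokens) and shrinks it until the joined
     * text is a key in entityMap. Unmatched tokens are passed through as-is.
     */
    public static List<String> recognize(List<String> tokens,
                                         Map<String, String> entityMap,
                                         int maxToken) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matchedLength = 0;
            String entityId = null;
            int longest = Math.min(maxToken, tokens.size() - i);
            for (int len = longest; len >= 1; len--) {
                // Assumption: token texts are joined with a single space.
                String candidate = String.join(" ", tokens.subList(i, i + len));
                String id = entityMap.get(candidate);
                if (id != null) {
                    matchedLength = len;
                    entityId = id;
                    break; // the first hit is the longest one
                }
            }
            if (matchedLength > 0) {
                String text = String.join(" ", tokens.subList(i, i + matchedLength));
                out.add("ENTITY[" + text + " -> " + entityId + "]");
                i += matchedLength; // skip the tokens consumed by the entity
            } else {
                out.add(tokens.get(i)); // no match: emit the token unchanged
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> entityMap = new HashMap<>();
        entityMap.put("New York", "city-1");
        entityMap.put("New York Times", "paper-1");
        List<String> tokens =
            Arrays.asList("I", "read", "the", "New", "York", "Times", "today");
        // prints [I, read, the, ENTITY[New York Times -> paper-1], today]
        System.out.println(recognize(tokens, entityMap, 3));
    }
}
```

With a maximum of 3 tokens, "New York Times" wins over the shorter key "New York"; with a maximum of 2, the sketch would emit the "New York" entity followed by the plain token "Times".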

Author:
Sascha Kohlmann

Field Summary
 
Fields inherited from class net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer
ABSTRACT_DICTIONARY_BASED_ENTITY_RECOGNIZER_IMPL_CONFIG_NAME, MAX_TOKEN_CONFIG_NAME
 
Fields inherited from interface net.sf.eos.entity.DictionaryBasedEntityRecognizer
ENTITY_ID_KEY
 
Constructor Summary
SimpleLongestMatchDictionaryBasedEntityRecognizer(Tokenizer source)
          Creates a new instance.
 
Method Summary
 Token next()
          The returned Token may be of type EntityRecognizer.ENTITY_TYPE or of any other type.
 
Methods inherited from class net.sf.eos.entity.AbstractDictionaryBasedEntityRecognizer
configure, getConfiguration, getEntityMap, getMaxToken, getTextBuilder, newInstance, newInstance, setEntityMap, setMaxToken, setTextBuilder
 
Methods inherited from class net.sf.eos.analyzer.TokenFilter
getSource
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleLongestMatchDictionaryBasedEntityRecognizer

public SimpleLongestMatchDictionaryBasedEntityRecognizer(Tokenizer source)
Creates a new instance.

Parameters:
source - the source tokenizer
Method Detail

next

public Token next()
           throws TokenizerException
The returned Token may be of type EntityRecognizer.ENTITY_TYPE or of any other type.

Specified by:
next in interface Tokenizer
Specified by:
next in class TokenFilter
Returns:
the next token or null
Throws:
IllegalStateException - if AbstractDictionaryBasedEntityRecognizer.getEntityMap() returns null
TokenizerException - if an error occurs
See Also:
Tokenizer.next()
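
Because next() returns null once no further token is available, callers typically read in a loop until null appears. The sketch below shows that pattern with a hypothetical stand-in interface; SketchTokenizer, fromList and drain are illustrative names and are not part of net.sf.eos, and String stands in for the library's Token type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Illustrative sketch only; the types here are stand-ins, not the net.sf.eos API. */
public class DrainLoopSketch {

    /** Hypothetical stand-in for a tokenizer whose next() yields null at end of stream. */
    public interface SketchTokenizer {
        String next() throws Exception;
    }

    /** Builds a stand-in tokenizer over a fixed list of words. */
    public static SketchTokenizer fromList(List<String> words) {
        Iterator<String> it = words.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Reads tokens until next() returns null and collects them in order. */
    public static List<String> drain(SketchTokenizer tokenizer) throws Exception {
        List<String> out = new ArrayList<>();
        String token;
        while ((token = tokenizer.next()) != null) {
            out.add(token);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // prints [New, York, Times]
        System.out.println(drain(fromList(Arrays.asList("New", "York", "Times"))));
    }
}
```

With the real class, the entity map would have to be set before the first call to next(), since next() throws IllegalStateException when getEntityMap() returns null.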


Copyright © 2008. All Rights Reserved.