Interface Summary | |
---|---|
PatriciaTrie.KeyAnalyzer<K> | Defines the interface to analyze Trie keys on a bit level. |
Trie<K,V> | Defines the interface for a prefix tree, an ordered tree data structure. |
Trie.Cursor<K,V> | An interface used by a Trie. |
TrieLoader<K,V> | Implementations create new tries. |
TrieSource | |
TrieSource.TrieEntryListener | |
Class Summary | |
---|---|
AbstractTrieLoader<K,V> | |
ByteArrayKeyAnalyzer | |
CharSequenceKeyAnalyzer | Analyzes CharSequence keys with case sensitivity. |
EmptyIterator | Provides an unmodifiable empty iterator. |
PatriciaTrie<K,V> | A PATRICIA Trie. |
TrieHandler | |
TrieSource.TrieEntry | Represents an entry in the Trie. |
TrieSource.TrieEntryEvent | |
TrieUtils | Miscellaneous utilities for Tries. |
UnmodifiableIterator<E> | A convenience class to aid in developing iterators that cannot be modified. |
XmlTrieLoader | The builder creates a trie from a simple XML file. |
Enum Summary | |
---|---|
Trie.Cursor.SelectStatus | The mode during selection. |
Contains the base structure for memory-based entity recognition. The trie is based on a PATRICIA implementation from the Limewire project. The implementation comes under the terms of version 3 of the GNU General Public License (GPL).
The main benefit of a memory-based implementation for entity recognition stems from the cluster structure of the Hadoop system. In such a system, a central instance for entity recognition is counterproductive: it always becomes the bottleneck when it is under load from a few hundred cluster nodes, each with X running instances. A PATRICIA trie structure consumes less main memory than other implementations.
To work with the trie in a cluster environment, use the service offered by AbstractTrieLoader. The default serialization format is defined in XmlTrieLoader. At this time the trie's key structure is based on CharSequences. This implementation is not as memory-optimized as the byte array implementation. The byte-array-oriented key analyzer may use CharSequences transformed into UTF-8 bytes. This saves memory for Latin-based languages.
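The memory argument for UTF-8 keys can be illustrated with a small, self-contained sketch. It uses only the Java standard library and does not touch the classes of this package. The class name Utf8KeySize and its two helper methods are hypothetical names chosen for this illustration. The sketch counts only the payload bytes of a key (two bytes per UTF-16 code unit versus the UTF-8 encoded length) and ignores object headers and any JVM-internal string optimizations.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of net.sf.eos.trie.
public class Utf8KeySize {

    /** Payload size if every char of the key is stored as a 2-byte UTF-16 code unit. */
    static int utf16Bytes(CharSequence key) {
        return key.length() * 2;
    }

    /** Payload size of the same key after encoding it as UTF-8. */
    static int utf8Bytes(CharSequence key) {
        return key.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its encoding:
        // "Z\u00fcrich" contains a u-umlaut, "\u6771\u4eac" is a two-character CJK word.
        String[] keys = {"Hadoop", "Z\u00fcrich", "\u6771\u4eac"};
        for (String key : keys) {
            System.out.println(key + ": UTF-16 = " + utf16Bytes(key)
                    + " bytes, UTF-8 = " + utf8Bytes(key) + " bytes");
        }
    }
}
```

For the pure ASCII key the UTF-8 form needs 6 bytes instead of 12, and the key with one umlaut needs 7 instead of 12. The CJK key goes the other way (6 bytes in UTF-8 versus 4 in UTF-16), which is why the saving applies to Latin-based languages in particular.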
For Hadoop, use Hadoop's distributed cache mechanism. See net.sf.eos.hadoop for further information.
See also: net.sf.eos.hadoop, net.sf.eos.entity, net.sf.eos.hadoop.mapred.cooccurrence