net.sf.eos.sentence
Class DefaultSentencer
java.lang.Object
net.sf.eos.config.Configured
net.sf.eos.sentence.Sentencer
net.sf.eos.sentence.DefaultSentencer
- All Implemented Interfaces:
- Configurable
public class DefaultSentencer
- extends Sentencer
Simple default implementation.
- Author:
- Sascha Kohlmann
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
DefaultSentencer
public DefaultSentencer()
- Creates a new instance.
toSentenceDocuments
public Map<String,EosDocument> toSentenceDocuments(EosDocument doc,
SentenceTokenizer sentencer,
ResettableTokenizer tokenizer,
TextBuilder builder)
throws EosException
- Description copied from class: Sentencer
- Fragments a document into one document per sentence. The return value is
a map from message digest to sentence document. Each document in the
returned map carries all metadata of the original document and may carry
additional metadata.
- Specified by:
toSentenceDocuments in class Sentencer
- Parameters:
doc - the document to fragment
sentencer - a sentencer instance
tokenizer - a tokenizer instance to tokenize the result of the sentencer
builder - the builder supports the rebuilding of the tokenizer
- Returns:
- a map of message digest -> document relations
- Throws:
EosException
- if an error occurs
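The "message digest -> document" contract described above can be illustrated with a small, self-contained sketch. It does not use the EOS classes and is not the library's implementation: `SentenceDigestSketch`, `Doc`, and `md5Hex` are hypothetical stand-ins, sentence splitting is delegated to the JDK's `java.text.BreakIterator` instead of a `SentenceTokenizer`, and the choice of MD5 over the trimmed sentence text as the map key is an assumption, since this page does not say which digest is used or what it is computed over.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.text.BreakIterator;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class SentenceDigestSketch {

    /** Hypothetical stand-in for EosDocument: text plus a copy of the metadata. */
    public static final class Doc {
        public final String text;
        public final Map<String, String> meta;

        public Doc(String text, Map<String, String> meta) {
            this.text = text;
            this.meta = new LinkedHashMap<>(meta); // defensive copy
        }
    }

    /** Hex-encoded MD5 of the UTF-8 bytes of s (digest choice is an assumption). */
    public static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Splits doc into one Doc per sentence, keyed by the digest of the sentence. */
    public static Map<String, Doc> toSentenceDocuments(Doc doc) {
        Map<String, Doc> result = new LinkedHashMap<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(doc.text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = doc.text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                // Each sentence document inherits the metadata of the original.
                result.put(md5Hex(sentence), new Doc(sentence, doc.meta));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("title", "Example");
        Doc original = new Doc("First sentence. Second one.", meta);
        for (Map.Entry<String, Doc> e : toSentenceDocuments(original).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().text);
        }
    }
}
```

Keying by digest means that two identical sentences in the same input collapse into a single map entry in this sketch; whether DefaultSentencer behaves the same way is not stated here.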
Copyright © 2008. All Rights Reserved.