net.sf.eos.hadoop.mapred
Class EosDocumentSupportMapReduceBase

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by net.sf.eos.hadoop.mapred.EosDocumentSupportMapReduceBase
All Implemented Interfaces:
Closeable, JobConfigurable
Direct Known Subclasses:
DictionaryBasedEntityRecognizerMapper, DictionaryBasedEntityRecognizerReducer, IndexMapper, IndexReducer, SentencerMapper, SentencerReducer

public abstract class EosDocumentSupportMapReduceBase
extends MapReduceBase

Support for handling Map/Reduce jobs with EosDocument.

Author:
Sascha Kohlmann

Constructor Summary
EosDocumentSupportMapReduceBase()
           
 
Method Summary
 void close()
           
 void configure(JobConf conf)
           
protected  Text eosDocumentToText(EosDocument doc)
          Transforms an EosDocument to a Hadoop Text.
protected  Serializer getSerializer()
          Returns a Serializer instance.
protected  EosDocument textToEosDocument(Text eosDoc)
          Transforms a Hadoop Text to an EosDocument.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EosDocumentSupportMapReduceBase

public EosDocumentSupportMapReduceBase()
Method Detail

getSerializer

protected Serializer getSerializer()
                            throws EosException
Returns a Serializer instance. Uses the implementation configured under Serializer.SERIALIZER_IMPL_CONFIG_NAME. If no implementation is configured, the default implementation is used.

Returns:
a Serializer instance
Throws:
EosException - if an error occurs
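
The lookup described above (use the configured implementation, otherwise fall back to a default) can be sketched as follows. This is only an illustration under stated assumptions, not the library's code: SketchSerializer, DefaultSketchSerializer, CustomSketchSerializer and the key "sketch.serializer.impl" are hypothetical stand-ins, a plain Map stands in for the job configuration, and an unchecked exception is used where the real method declares EosException.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the library's Serializer type. */
interface SketchSerializer {
    String name();
}

/** Hypothetical default implementation, used when nothing is configured. */
final class DefaultSketchSerializer implements SketchSerializer {
    public String name() { return "default"; }
}

/** Hypothetical alternative implementation, selected through configuration. */
final class CustomSketchSerializer implements SketchSerializer {
    public String name() { return "custom"; }
}

final class SerializerLookupSketch {

    // Assumed key for this sketch; the real key is the value of
    // Serializer.SERIALIZER_IMPL_CONFIG_NAME, which is not shown here.
    static final String IMPL_KEY = "sketch.serializer.impl";

    /** Returns the configured implementation, or the default if none is set. */
    static SketchSerializer lookup(Map<String, String> conf) {
        String className = conf.get(IMPL_KEY);
        if (className == null) {
            return new DefaultSketchSerializer();
        }
        try {
            // Instantiate the configured class through its no-argument constructor.
            return (SketchSerializer) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // The real method reports failures as EosException (assumed mapping).
            throw new IllegalStateException("Cannot create serializer " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(lookup(conf).name());
        conf.put(IMPL_KEY, CustomSketchSerializer.class.getName());
        System.out.println(lookup(conf).name());
    }
}
```

In a subclass, the equivalent call would simply be getSerializer(), with the caller handling EosException.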

eosDocumentToText

protected Text eosDocumentToText(EosDocument doc)
                          throws IOException,
                                 Exception
Transforms an EosDocument to a Hadoop Text.

Parameters:
doc - the EosDocument to transform
Returns:
a serialized document
Throws:
Exception - if an error occurs
IOException - if an I/O error occurs

textToEosDocument

protected EosDocument textToEosDocument(Text eosDoc)
                                 throws Exception,
                                        IOException
Transforms a Hadoop Text to an EosDocument.

Parameters:
eosDoc - the document as a Hadoop Text
Returns:
a deserialized document
Throws:
Exception - if an error occurs
IOException - if an I/O error occurs
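
eosDocumentToText and textToEosDocument are presumably meant to be used as a pair: a document serialized by one should be recoverable by the other. The sketch below illustrates that round trip with plain Java types only. SketchDocument, its fields, and the length-prefixed encoding are hypothetical stand-ins chosen for this example; the real methods delegate to the configured Serializer and use Hadoop's Text (a UTF-8 byte container) instead of a byte array.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for EosDocument; the real class has its own fields. */
final class SketchDocument {
    final String id;
    final String title;

    SketchDocument(String id, String title) {
        this.id = id;
        this.title = title;
    }
}

final class DocumentRoundTripSketch {

    /** Plays the role of eosDocumentToText: document to UTF-8 bytes. */
    static byte[] documentToBytes(SketchDocument doc) {
        // Prefix the id with its length so the two fields can be split
        // again without ambiguity, even if the id contains a colon.
        String encoded = doc.id.length() + ":" + doc.id + doc.title;
        return encoded.getBytes(StandardCharsets.UTF_8);
    }

    /** Plays the role of textToEosDocument: UTF-8 bytes back to a document. */
    static SketchDocument bytesToDocument(byte[] bytes) {
        String encoded = new String(bytes, StandardCharsets.UTF_8);
        int colon = encoded.indexOf(':');
        int idLength = Integer.parseInt(encoded.substring(0, colon));
        int idEnd = colon + 1 + idLength;
        return new SketchDocument(
                encoded.substring(colon + 1, idEnd),
                encoded.substring(idEnd));
    }

    public static void main(String[] args) {
        SketchDocument in = new SketchDocument("doc:1", "Hadoop and EOS");
        SketchDocument out = bytesToDocument(documentToBytes(in));
        System.out.println(out.id + " / " + out.title);
    }
}
```

A mapper or reducer subclass would typically call textToEosDocument on an incoming value, work on the resulting EosDocument, and call eosDocumentToText before emitting it.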

configure

public void configure(JobConf conf)
Specified by:
configure in interface JobConfigurable
Overrides:
configure in class MapReduceBase

close

public void close()
           throws IOException
Specified by:
close in interface Closeable
Overrides:
close in class MapReduceBase
Throws:
IOException


Copyright © 2008. All Rights Reserved.