org.apache.hadoop.streaming
Class StreamBaseRecordReader

java.lang.Object
  extended by org.apache.hadoop.streaming.StreamBaseRecordReader
All Implemented Interfaces:
org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Direct Known Subclasses:
StreamXmlRecordReader

public abstract class StreamBaseRecordReader
extends Object
implements org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

Shared functionality for hadoopStreaming formats. To define a custom reader, implement a RecordReader that provides the constructor below; select it with the option bin/hadoopStreaming -inputreader ...

See Also:
StreamXmlRecordReader
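
The requirement above, that a custom reader provide one fixed constructor, is the usual shape for classes that a framework instantiates by name. The sketch below illustrates that pattern with plain JDK types. It is an assumption-laden illustration, not the Hadoop code: SketchReader, MyCustomReader and ReaderFactorySketch are hypothetical names, and InputStream and Properties stand in for the five Hadoop parameter types.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.lang.reflect.Constructor;
import java.util.Properties;

// Sketch only: every type here is a hypothetical stand-in, not a Hadoop class.
final class ReaderFactorySketch {

    /** Instantiates a reader by class name through its agreed public constructor. */
    static SketchReader create(String className, InputStream in, Properties job)
            throws ReflectiveOperationException {
        Class<? extends SketchReader> cls =
                Class.forName(className).asSubclass(SketchReader.class);
        // The lookup throws NoSuchMethodException if the signature does not match.
        Constructor<? extends SketchReader> ctor =
                cls.getConstructor(InputStream.class, Properties.class);
        return ctor.newInstance(in, job);
    }

    public static void main(String[] args) throws Exception {
        SketchReader reader = create(
                MyCustomReader.class.getName(),
                new ByteArrayInputStream(new byte[0]),
                new Properties());
        System.out.println(reader.name());
    }
}

// Hypothetical base class that fixes the constructor shape.
abstract class SketchReader {
    protected final InputStream in;
    protected final Properties job;

    protected SketchReader(InputStream in, Properties job) {
        this.in = in;
        this.job = job;
    }

    abstract String name();
}

// Hypothetical custom reader exposing the agreed public constructor.
class MyCustomReader extends SketchReader {
    public MyCustomReader(InputStream in, Properties job) {
        super(in, job);
    }

    @Override
    String name() {
        return "MyCustomReader";
    }
}
```

Under this pattern, a reader whose constructor is missing or not public is rejected at lookup time rather than when the first record is read.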

Field Summary
protected static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
StreamBaseRecordReader(org.apache.hadoop.fs.FSDataInputStream in, org.apache.hadoop.mapred.FileSplit split, org.apache.hadoop.mapred.Reporter reporter, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.fs.FileSystem fs)
           
 
Method Summary
 void close()
          Closes this reader to future operations.
 org.apache.hadoop.io.Text createKey()
           
 org.apache.hadoop.io.Text createValue()
           
 long getPos()
          Returns the current position in the input.
 float getProgress()
           
abstract  boolean next(org.apache.hadoop.io.Text key, org.apache.hadoop.io.Text value)
          Read a record.
abstract  void seekNextRecordBoundary()
          Implementations should seek the input stream (in_) forward to the first byte of the next record.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

protected static final org.apache.commons.logging.Log LOG
Constructor Detail

StreamBaseRecordReader

public StreamBaseRecordReader(org.apache.hadoop.fs.FSDataInputStream in,
                              org.apache.hadoop.mapred.FileSplit split,
                              org.apache.hadoop.mapred.Reporter reporter,
                              org.apache.hadoop.mapred.JobConf job,
                              org.apache.hadoop.fs.FileSystem fs)
                       throws IOException
Throws:
IOException
Method Detail

next

public abstract boolean next(org.apache.hadoop.io.Text key,
                             org.apache.hadoop.io.Text value)
                      throws IOException
Reads a record. Implementations should call numRecStats as the last step.

Specified by:
next in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
IOException
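
As a rough illustration of how a caller drives next, the sketch below uses a hypothetical in-memory reader built only from JDK types. SketchRecordReader mirrors the shape of RecordReader (createKey, createValue, next returning false at end of input), StringBuilder stands in for Text, and the recordsRead counter is an assumed stand-in for the per-record bookkeeping that numRecStats performs. None of these names come from Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch only: JDK stand-ins for the reader contract, not Hadoop types.
final class NextLoopSketch {

    /** Hypothetical interface mirroring the shape of the reader contract. */
    interface SketchRecordReader {
        StringBuilder createKey();
        StringBuilder createValue();
        /** Fills key and value; returns false once the input is exhausted. */
        boolean next(StringBuilder key, StringBuilder value);
    }

    /** Hypothetical reader over an in-memory list of {key, value} pairs. */
    static final class InMemoryReader implements SketchRecordReader {
        private final Iterator<String[]> records;
        int recordsRead = 0; // stand-in for per-record statistics

        InMemoryReader(List<String[]> records) {
            this.records = records.iterator();
        }

        public StringBuilder createKey() { return new StringBuilder(); }

        public StringBuilder createValue() { return new StringBuilder(); }

        public boolean next(StringBuilder key, StringBuilder value) {
            if (!records.hasNext()) {
                return false;
            }
            String[] record = records.next();
            key.setLength(0);
            key.append(record[0]);
            value.setLength(0);
            value.append(record[1]);
            recordsRead++; // bookkeeping happens last, after the record is filled in
            return true;
        }
    }

    /** Reuses one key object and one value object across all calls to next. */
    static List<String> drain(SketchRecordReader reader) {
        StringBuilder key = reader.createKey();
        StringBuilder value = reader.createValue();
        List<String> out = new ArrayList<>();
        while (reader.next(key, value)) {
            out.add(key + "=" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> input = Arrays.asList(
                new String[] {"k1", "v1"},
                new String[] {"k2", "v2"});
        System.out.println(drain(new InMemoryReader(input)));
    }
}
```

The loop allocates the key and value once and lets next overwrite them, which is why each result is copied into a String before the next call.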

getPos

public long getPos()
            throws IOException
Returns the current position in the input.

Specified by:
getPos in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
IOException

close

public void close()
           throws IOException
Closes this reader to future operations.

Specified by:
close in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
IOException

getProgress

public float getProgress()
                  throws IOException
Specified by:
getProgress in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
Throws:
IOException

createKey

public org.apache.hadoop.io.Text createKey()
Specified by:
createKey in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

createValue

public org.apache.hadoop.io.Text createValue()
Specified by:
createValue in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>

seekNextRecordBoundary

public abstract void seekNextRecordBoundary()
                                     throws IOException
Implementations should seek the input stream (in_) forward to the first byte of the next record. The initial byte offset in the stream is arbitrary and may fall in the middle of a record.

Throws:
IOException
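
A minimal sketch of the boundary-seeking idea, assuming records begin with a fixed marker. It works on a byte array rather than on a Hadoop stream, and the marker `<rec>` and the class name BoundarySeekSketch are hypothetical choices for illustration only.

```java
import java.nio.charset.StandardCharsets;

// Sketch only: scans a byte array instead of seeking a Hadoop input stream.
final class BoundarySeekSketch {

    /**
     * Returns the offset of the first occurrence of marker at or after start,
     * or -1 if no further record begins in data.
     */
    static int seekNextRecordBoundary(byte[] data, int start, byte[] marker) {
        for (int i = Math.max(start, 0); i + marker.length <= data.length; i++) {
            int matched = 0;
            while (matched < marker.length && data[i + matched] == marker[matched]) {
                matched++;
            }
            if (matched == marker.length) {
                return i; // first byte of the next record
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "<rec>a</rec><rec>b</rec>".getBytes(StandardCharsets.US_ASCII);
        byte[] marker = "<rec>".getBytes(StandardCharsets.US_ASCII);
        // Offset 3 lies inside the first record, so the scan skips ahead.
        System.out.println(seekNextRecordBoundary(data, 3, marker)); // prints 12
    }
}
```

Starting exactly on a boundary returns that same offset, so in this sketch a reader positioned at the start of a record does not skip it.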


Copyright © 2014 Apache Software Foundation. All Rights Reserved.