Class OrcSplitReader<T,BATCH>

java.lang.Object
org.apache.flink.orc.OrcSplitReader<T,BATCH>
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
OrcColumnarRowSplitReader

public abstract class OrcSplitReader<T,BATCH> extends Object implements Closeable
ORC split reader that reads records from an ORC file. The reader is only responsible for reading the data of a single split.
  • Constructor Details

    • OrcSplitReader

      public OrcSplitReader(OrcShim<BATCH> shim, org.apache.hadoop.conf.Configuration conf, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize, org.apache.flink.core.fs.Path path, long splitStart, long splitLength) throws IOException
      Throws:
      IOException
  • Method Details

    • seekToRow

      public void seekToRow(long rowCount) throws IOException
      Seek to a particular row number.
      Throws:
      IOException
    • getRecordReader

      @VisibleForTesting public org.apache.orc.RecordReader getRecordReader()
    • reachedEnd

      public boolean reachedEnd() throws IOException
      Checks whether the end of the input has been reached.
      Returns:
      true if the end has been reached, false otherwise.
      Throws:
      IOException - if an I/O error occurs.
    • fillRows

      protected abstract int fillRows()
      Fills an ORC batch into an array of Row.
      Returns:
      The number of rows that were filled.
    • nextRecord

      public abstract T nextRecord(T reuse) throws IOException
      Reads the next record from the input.
      Parameters:
      reuse - Object that may be reused.
      Returns:
      Read record.
      Throws:
      IOException - if an I/O error occurs.
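
      The method shapes above (fillRows, nextRecord, reachedEnd) suggest a batch-then-record read loop. The sketch below illustrates that pattern with a hypothetical stand-in class, BatchedReader, and an in-memory subclass, IntArrayReader. Neither is part of Flink, and the fields rowsInBatch and nextRow, as well as the way reachedEnd() triggers fillRows(), are assumptions made for illustration rather than a description of how OrcSplitReader is implemented.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class SplitReaderSketch {

    /** Hypothetical stand-in that mirrors the documented method shapes; not the Flink class. */
    abstract static class BatchedReader<T> implements Closeable {
        protected int rowsInBatch = 0; // assumed: rows available in the current batch
        protected int nextRow = 0;     // assumed: index of the next row to hand out

        /** Loads the next batch and returns the number of rows filled (0 when exhausted). */
        protected abstract int fillRows();

        /** Returns the next record, writing into reuse when possible. */
        public abstract T nextRecord(T reuse) throws IOException;

        /** Refills the batch when it is drained; reports the end when nothing was filled. */
        public boolean reachedEnd() throws IOException {
            if (nextRow < rowsInBatch) {
                return false;
            }
            rowsInBatch = fillRows();
            nextRow = 0;
            return rowsInBatch == 0;
        }

        @Override
        public void close() throws IOException {
            // Nothing to release in this in-memory sketch.
        }
    }

    /** Hypothetical subclass that serves ints from an array in fixed-size batches. */
    static final class IntArrayReader extends BatchedReader<int[]> {
        private final int[] source;
        private final int[] batch;
        private int offset = 0;

        IntArrayReader(int[] source, int batchSize) {
            this.source = source;
            this.batch = new int[batchSize];
        }

        @Override
        protected int fillRows() {
            int n = Math.min(batch.length, source.length - offset);
            System.arraycopy(source, offset, batch, 0, n);
            offset += n;
            return n;
        }

        @Override
        public int[] nextRecord(int[] reuse) {
            int[] out = (reuse != null) ? reuse : new int[1];
            out[0] = batch[nextRow++];
            return out;
        }
    }

    /** Drains a reader with the reachedEnd()/nextRecord(reuse) loop and collects the values. */
    static List<Integer> readAll(int[] data, int batchSize) throws IOException {
        List<Integer> out = new ArrayList<>();
        try (IntArrayReader reader = new IntArrayReader(data, batchSize)) {
            int[] reuse = new int[1];
            while (!reader.reachedEnd()) {
                reuse = reader.nextRecord(reuse);
                out.add(reuse[0]); // copy the value out before the object is reused
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAll(new int[] {1, 2, 3, 4, 5}, 2));
    }
}
```

      Because the same object may be handed back on every call, a caller that needs to keep a record should copy its contents before requesting the next one, as readAll does here.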
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
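
      Since the class implements Closeable, a reader can be managed with try-with-resources so that close() runs even when a read fails. The sketch below shows this with a hypothetical TrackingReader that simulates a read failure. It is not a Flink class, and its simplified reachedEnd() and nextRecord() signatures are assumptions made for illustration.

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseSketch {

    /** Hypothetical reader that records whether close() was called. */
    static final class TrackingReader implements Closeable {
        boolean closed = false;
        private int remaining;

        TrackingReader(int rows) {
            this.remaining = rows;
        }

        boolean reachedEnd() {
            return remaining == 0;
        }

        /** Fails on the last row to simulate an I/O error in the middle of a split. */
        int nextRecord() throws IOException {
            if (remaining == 1) {
                throw new IOException("simulated read failure");
            }
            return remaining--;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns true if the reader was closed even though reading failed. */
    static boolean closedAfterFailure() {
        TrackingReader reader = new TrackingReader(3);
        try (TrackingReader r = reader) {
            while (!r.reachedEnd()) {
                r.nextRecord();
            }
        } catch (IOException e) {
            // Expected in this sketch: the simulated failure propagates after close() has run.
        }
        return reader.closed;
    }

    public static void main(String[] args) {
        System.out.println(closedAfterFailure());
    }
}
```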