Class OrcShimV200

java.lang.Object
org.apache.flink.orc.shim.OrcShimV200
All Implemented Interfaces:
Serializable, OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
Direct Known Subclasses:
OrcShimV210

public class OrcShimV200 extends Object implements OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
ORC shim for Hive version 2.0.0 and later versions.
  • Constructor Summary

    Constructors
    Constructor
    Description
    OrcShimV200()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    static boolean[]
    computeProjectionMask(org.apache.orc.TypeDescription schema, int[] selectedFields)
    Computes the ORC projection mask of the fields to include from the selected fields.
    HiveOrcBatchWrapper
    createBatchWrapper(org.apache.orc.TypeDescription schema, int batchSize)
     
    protected org.apache.orc.Reader
    createReader(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
     
    org.apache.orc.RecordReader
    createRecordReader(org.apache.hadoop.conf.Configuration conf, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, org.apache.flink.core.fs.Path path, long splitStart, long splitLength)
    Creates an ORC RecordReader from the configuration, schema, selected fields, predicates, path, and split range.
    protected org.apache.orc.RecordReader
    createRecordReader(org.apache.orc.Reader reader, org.apache.orc.Reader.Options options)
     
    static org.apache.flink.api.java.tuple.Tuple2<Long,Long>
    getOffsetAndLengthForSplit(long splitStart, long splitLength, List<org.apache.orc.StripeInformation> stripes)
     
    boolean
    nextBatch(org.apache.orc.RecordReader reader, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch rowBatch)
    Read the next row batch.
    protected org.apache.orc.Reader.Options
    readOrcConf(org.apache.orc.Reader.Options options, org.apache.hadoop.conf.Configuration conf)
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • OrcShimV200

      public OrcShimV200()
  • Method Details

    • createReader

      protected org.apache.orc.Reader createReader(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf) throws IOException
      Throws:
      IOException
    • createRecordReader

      protected org.apache.orc.RecordReader createRecordReader(org.apache.orc.Reader reader, org.apache.orc.Reader.Options options) throws IOException
      Throws:
      IOException
    • readOrcConf

      protected org.apache.orc.Reader.Options readOrcConf(org.apache.orc.Reader.Options options, org.apache.hadoop.conf.Configuration conf)
    • createRecordReader

      public org.apache.orc.RecordReader createRecordReader(org.apache.hadoop.conf.Configuration conf, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, org.apache.flink.core.fs.Path path, long splitStart, long splitLength) throws IOException
      Description copied from interface: OrcShim
      Creates an ORC RecordReader from the configuration, schema, selected fields, predicates, path, and split range.
      Specified by:
      createRecordReader in interface OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
      Throws:
      IOException
    • createBatchWrapper

      public HiveOrcBatchWrapper createBatchWrapper(org.apache.orc.TypeDescription schema, int batchSize)
      Specified by:
      createBatchWrapper in interface OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
    • nextBatch

      public boolean nextBatch(org.apache.orc.RecordReader reader, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch rowBatch) throws IOException
      Description copied from interface: OrcShim
      Read the next row batch.
      Specified by:
      nextBatch in interface OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch>
      Throws:
      IOException
    • getOffsetAndLengthForSplit

      public static org.apache.flink.api.java.tuple.Tuple2<Long,Long> getOffsetAndLengthForSplit(long splitStart, long splitLength, List<org.apache.orc.StripeInformation> stripes)
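      This page does not describe how the split is mapped to a read range. The sketch below shows one plausible reading of the signature, offered as an assumption rather than as documented behavior: a stripe belongs to the split if its start offset falls inside [splitStart, splitStart + splitLength), and the result is the offset and length spanning all such stripes. The class SplitRangeSketch, the nested Stripe type, and the long[] return value are hypothetical stand-ins for org.apache.orc.StripeInformation and Tuple2<Long,Long>, so that the example runs without Flink or ORC on the classpath.

```java
import java.util.Arrays;
import java.util.List;

public class SplitRangeSketch {

    /** Hypothetical stand-in for a stripe: byte offset and byte length within the file. */
    static final class Stripe {
        final long offset;
        final long length;

        Stripe(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Returns {offset, length} covering every stripe that starts inside the split,
     * or {0, 0} if no stripe starts inside it. This is an assumed rule, not the
     * documented behavior of getOffsetAndLengthForSplit.
     */
    static long[] offsetAndLength(long splitStart, long splitLength, List<Stripe> stripes) {
        long splitEnd = splitStart + splitLength;
        long readStart = Long.MAX_VALUE;
        long readEnd = Long.MIN_VALUE;
        for (Stripe s : stripes) {
            // A stripe is assigned to the split in which it starts.
            if (splitStart <= s.offset && s.offset < splitEnd) {
                readStart = Math.min(readStart, s.offset);
                readEnd = Math.max(readEnd, s.offset + s.length);
            }
        }
        if (readStart == Long.MAX_VALUE) {
            return new long[] {0L, 0L};
        }
        return new long[] {readStart, readEnd - readStart};
    }

    public static void main(String[] args) {
        List<Stripe> stripes =
                Arrays.asList(new Stripe(3, 100), new Stripe(103, 100), new Stripe(203, 100));
        long[] range = offsetAndLength(100, 150, stripes);
        System.out.println(range[0] + ", " + range[1]);
    }
}
```

      Under this rule every stripe is read by exactly one split, even when split boundaries do not line up with stripe boundaries.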
    • computeProjectionMask

      public static boolean[] computeProjectionMask(org.apache.orc.TypeDescription schema, int[] selectedFields)
      Computes the ORC projection mask of the fields to include from the selected fields.
      Returns:
      The ORC projection mask.
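
      To illustrate what a projection mask looks like, here is a minimal sketch that does not use the ORC classes. It assumes that the mask has one entry per ORC column ID, that ID 0 is the root struct, and that each top-level field occupies a contiguous run of IDs (its own plus those of any nested types). The class ProjectionMaskSketch and the fieldWidths parameter (number of column IDs per top-level field) are hypothetical and stand in for the information a TypeDescription would provide.

```java
import java.util.Arrays;

public class ProjectionMaskSketch {

    /**
     * Builds a mask indexed by column ID. fieldWidths[i] is the number of column IDs
     * that top-level field i occupies (1 for a primitive, more for a nested type).
     * selectedFields holds indices of top-level fields to include.
     */
    static boolean[] projectionMask(int[] fieldWidths, int[] selectedFields) {
        // Column ID 0 is reserved for the root struct, so fields start at ID 1.
        int[] firstId = new int[fieldWidths.length];
        int nextId = 1;
        for (int i = 0; i < fieldWidths.length; i++) {
            firstId[i] = nextId;
            nextId += fieldWidths[i];
        }

        boolean[] mask = new boolean[nextId];
        for (int field : selectedFields) {
            // Selecting a field also selects all of its nested columns.
            for (int id = firstId[field]; id < firstId[field] + fieldWidths[field]; id++) {
                mask[id] = true;
            }
        }
        return mask;
    }

    public static void main(String[] args) {
        // Three top-level fields: a primitive, a nested type spanning 3 IDs, a primitive.
        boolean[] mask = projectionMask(new int[] {1, 3, 1}, new int[] {1});
        System.out.println(Arrays.toString(mask));
    }
}
```

      In this sketch, selecting only the nested field marks column IDs 2 through 4 and leaves the root and the two primitive fields unmarked.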