Class AbstractOrcFileInputFormat<T,BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>

java.lang.Object
org.apache.flink.orc.AbstractOrcFileInputFormat<T,BatchT,SplitT>
Type Parameters:
T - The type of records produced by this reader format.
All Implemented Interfaces:
Serializable, org.apache.flink.api.java.typeutils.ResultTypeQueryable<T>, org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
Direct Known Subclasses:
OrcColumnarRowInputFormat

public abstract class AbstractOrcFileInputFormat<T,BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit> extends Object implements org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
The base class for ORC readers used with the FileSource. It implements reader initialization, vectorized reading, and pooling of column vector objects.

Subclasses implement the conversion to the specific result record(s) that they return by extending AbstractOrcFileInputFormat.OrcReaderBatch.
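The pooling of batch objects mentioned above can be pictured with a small, dependency-free sketch. It is not the Flink implementation: BatchPoolSketch, Batch, Pool, and readAll are hypothetical names, and a long[] stands in for an ORC column vector. It only shows the general idea of filling a pre-allocated batch, consuming it, and handing it back for reuse instead of allocating a new one per read.

```java
import java.util.ArrayDeque;

/** Hypothetical illustration only; none of these names are Flink or ORC APIs. */
final class BatchPoolSketch {

    /** A reusable batch: a fixed-capacity buffer plus the number of valid rows. */
    static final class Batch {
        final long[] values;
        int size;

        Batch(int capacity) {
            this.values = new long[capacity];
        }
    }

    /** A tiny pool that pre-allocates batches and takes them back after use. */
    static final class Pool {
        private final ArrayDeque<Batch> free = new ArrayDeque<>();

        Pool(int poolSize, int batchSize) {
            for (int i = 0; i < poolSize; i++) {
                free.push(new Batch(batchSize));
            }
        }

        /** Hands out a free batch; fails if every batch is still in use. */
        Batch take() {
            Batch batch = free.poll();
            if (batch == null) {
                throw new IllegalStateException("no free batch; recycle one first");
            }
            return batch;
        }

        /** Resets the batch and makes it available again. */
        void recycle(Batch batch) {
            batch.size = 0;
            free.push(batch);
        }

        int available() {
            return free.size();
        }
    }

    /** Reads totalRows synthetic rows (0, 1, 2, ...) batch by batch and sums them. */
    static long readAll(Pool pool, int totalRows) {
        long sum = 0;
        int next = 0;
        while (next < totalRows) {
            Batch batch = pool.take();
            // Fill the pooled batch with up to batch.values.length rows.
            int n = Math.min(batch.values.length, totalRows - next);
            for (int i = 0; i < n; i++) {
                batch.values[i] = next + i;
            }
            batch.size = n;
            next += n;
            // Consume the rows, then return the batch to the pool for reuse.
            for (int i = 0; i < batch.size; i++) {
                sum += batch.values[i];
            }
            pool.recycle(batch);
        }
        return sum;
    }

    public static void main(String[] args) {
        Pool pool = new Pool(1, 4);
        System.out.println(readAll(pool, 10));
    }
}
```

In this sketch a pool of size one is enough because each batch is recycled before the next one is requested; a reader that hands batches to another thread would need a larger pool or a blocking take().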

  • Field Details

    • shim

      protected final OrcShim<BatchT> shim
    • hadoopConfigWrapper

      protected final SerializableHadoopConfigWrapper hadoopConfigWrapper
    • schema

      protected final org.apache.orc.TypeDescription schema
    • selectedFields

      protected final int[] selectedFields
    • conjunctPredicates

      protected final List<OrcFilters.Predicate> conjunctPredicates
    • batchSize

      protected final int batchSize
  • Constructor Details

    • AbstractOrcFileInputFormat

      protected AbstractOrcFileInputFormat(OrcShim<BatchT> shim, org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize)
      Parameters:
      shim - the shim that abstracts over the different ORC dependency versions. If you use the latest version, use OrcShim.defaultShim() directly.
      hadoopConfig - the Hadoop configuration for the ORC reader.
      schema - the full schema of the ORC format.
      selectedFields - the fields of the ORC schema that should be read.
      conjunctPredicates - the filter predicates that can be evaluated.
      batchSize - the batch size of the ORC reader.
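To make the relationship between schema and selectedFields concrete, here is a minimal, dependency-free sketch. It assumes that selectedFields holds zero-based indices into the top-level fields of the full schema, which the parameter type (int[]) suggests but this page does not state. ProjectionSketch and project are hypothetical names, and a String[] of field names stands in for org.apache.orc.TypeDescription.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of the Flink or ORC APIs. */
final class ProjectionSketch {

    /**
     * Returns the names of the selected fields, in the order given by selectedFields.
     * Assumption: each entry is a zero-based index into fieldNames.
     */
    static List<String> project(String[] fieldNames, int[] selectedFields) {
        List<String> projected = new ArrayList<>(selectedFields.length);
        for (int index : selectedFields) {
            if (index < 0 || index >= fieldNames.length) {
                throw new IllegalArgumentException("field index out of range: " + index);
            }
            projected.add(fieldNames[index]);
        }
        return projected;
    }

    public static void main(String[] args) {
        String[] fullSchema = {"id", "name", "price"};
        // Read only "price" and "id", in that order.
        System.out.println(project(fullSchema, new int[] {2, 0}));
    }
}
```

Under that assumption, the full schema still describes every column in the file, while the index array narrows the read to the columns a query actually needs.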
  • Method Details