Package org.apache.flink.formats.parquet
Class ParquetVectorizedInputFormat<T,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
java.lang.Object
org.apache.flink.formats.parquet.ParquetVectorizedInputFormat<T,SplitT>
- All Implemented Interfaces:
Serializable, org.apache.flink.api.java.typeutils.ResultTypeQueryable<T>, org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
- Direct Known Subclasses:
ParquetColumnarRowInputFormat
public abstract class ParquetVectorizedInputFormat<T,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
extends Object
implements org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
Parquet BulkFormat that reads data from the file into a VectorizedColumnBatch in vectorized mode.
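The vectorized-read idea can be sketched in plain Java: a reader fills whole column arrays at once, and consumers iterate rows over those arrays instead of materializing one object per record. This is a hypothetical mini-model for illustration only, not Flink's actual classes:

```java
// Hypothetical mini-model (plain Java, not Flink's actual classes) of the
// vectorized-read idea behind VectorizedColumnBatch: a reader fills whole
// column vectors at once, and consumers iterate rows over those arrays
// instead of creating one object per record.
public class VectorizedSketch {
    static final class IntColumnVector {
        final int[] values;
        IntColumnVector(int capacity) { values = new int[capacity]; }
    }

    static final class ColumnBatch {
        final IntColumnVector intColumn;
        int numRows; // number of valid rows in the current fill
        ColumnBatch(int capacity) { this.intColumn = new IntColumnVector(capacity); }
    }

    // Sums all values by repeatedly filling one reusable batch of the given
    // capacity, mimicking how a vectorized format recycles a batch across reads.
    static long sumVectorized(int[] source, int batchCapacity) {
        ColumnBatch batch = new ColumnBatch(batchCapacity);
        long sum = 0;
        for (int offset = 0; offset < source.length; offset += batchCapacity) {
            int n = Math.min(batchCapacity, source.length - offset);
            System.arraycopy(source, offset, batch.intColumn.values, 0, n);
            batch.numRows = n;
            for (int row = 0; row < batch.numRows; row++) {
                sum += batch.intColumn.values[row];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumVectorized(new int[] {1, 2, 3, 4, 5, 6, 7}, 4)); // 28
    }
}
```

Note how one batch of capacity 4 serves all seven records: the format's batchSize constructor parameter plays an analogous role for the real reader.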
Nested Class Summary
- protected static class ParquetVectorizedInputFormat.ParquetReaderBatch<T>: Reader batch that provides writing and reading capabilities.
Field Summary
- protected final SerializableConfiguration hadoopConfig
- protected final boolean isUtcTimestamp
Constructor Summary
- ParquetVectorizedInputFormat(SerializableConfiguration hadoopConfig, org.apache.flink.table.types.logical.RowType projectedType, ColumnBatchFactory<SplitT> batchFactory, int batchSize, boolean isUtcTimestamp, boolean isCaseSensitive)
Method Summary
- ParquetVectorizedInputFormat<T,SplitT>.ParquetReader createReader(org.apache.flink.configuration.Configuration config, SplitT split)
- protected abstract ParquetVectorizedInputFormat.ParquetReaderBatch<T> createReaderBatch(org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector[] writableVectors, org.apache.flink.table.data.columnar.vector.VectorizedColumnBatch columnarBatch, org.apache.flink.connector.file.src.util.Pool.Recycler<ParquetVectorizedInputFormat.ParquetReaderBatch<T>> recycler)
- boolean isSplittable()
- protected int numBatchesToCirculate(org.apache.flink.configuration.Configuration config)
- ParquetVectorizedInputFormat<T,SplitT>.ParquetReader restoreReader(org.apache.flink.configuration.Configuration config, SplitT split)

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.flink.connector.file.src.reader.BulkFormat:
getProducedType
Field Details

hadoopConfig
protected final SerializableConfiguration hadoopConfig

isUtcTimestamp
protected final boolean isUtcTimestamp
Constructor Details

ParquetVectorizedInputFormat
public ParquetVectorizedInputFormat(SerializableConfiguration hadoopConfig, org.apache.flink.table.types.logical.RowType projectedType, ColumnBatchFactory<SplitT> batchFactory, int batchSize, boolean isUtcTimestamp, boolean isCaseSensitive)
Method Details

createReader
public ParquetVectorizedInputFormat<T,SplitT>.ParquetReader createReader(org.apache.flink.configuration.Configuration config, SplitT split) throws IOException
Specified by: createReader in interface org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
Throws: IOException
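The reader returned here follows BulkFormat's batch-pull contract: the caller repeatedly requests a batch of records and stops when the reader signals end of split. A self-contained plain-Java sketch of that loop, using hypothetical interface names rather than Flink's actual API:

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of the batch-pull contract a bulk reader follows
// (not Flink's actual interfaces): readBatch() hands out one batch of
// records at a time and returns null once the split is exhausted.
public class ReaderLoopSketch {
    interface BatchReader<T> {
        List<T> readBatch(); // null signals end of the split
    }

    // Builds a reader over a fixed sequence of pre-made batches.
    static BatchReader<Integer> readerOver(List<List<Integer>> batches) {
        Iterator<List<Integer>> it = batches.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Drains every batch from the reader and counts the records seen.
    static long countRecords(BatchReader<Integer> reader) {
        long count = 0;
        for (List<Integer> batch = reader.readBatch(); batch != null; batch = reader.readBatch()) {
            count += batch.size();
        }
        return count;
    }

    public static void main(String[] args) {
        BatchReader<Integer> reader = readerOver(List.of(List.of(1, 2, 3), List.of(4, 5)));
        System.out.println(countRecords(reader)); // 5
    }
}
```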
numBatchesToCirculate
protected int numBatchesToCirculate(org.apache.flink.configuration.Configuration config)
restoreReader
public ParquetVectorizedInputFormat<T,SplitT>.ParquetReader restoreReader(org.apache.flink.configuration.Configuration config, SplitT split) throws IOException
Specified by: restoreReader in interface org.apache.flink.connector.file.src.reader.BulkFormat<T,SplitT>
Throws: IOException
isSplittable
public boolean isSplittable()
createReaderBatch
protected abstract ParquetVectorizedInputFormat.ParquetReaderBatch<T> createReaderBatch(org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector[] writableVectors, org.apache.flink.table.data.columnar.vector.VectorizedColumnBatch columnarBatch, org.apache.flink.connector.file.src.util.Pool.Recycler<ParquetVectorizedInputFormat.ParquetReaderBatch<T>> recycler)
Parameters:
writableVectors - vectors to be written
columnarBatch - vectors to be read
recycler - batch recycler
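The recycler parameter realizes a pooling pattern: each reader batch carries a callback that returns it to a shared pool once the consumer is done with it, so a fixed number of batches circulates between reader and consumer (compare numBatchesToCirculate above). A self-contained plain-Java sketch of that pattern, with hypothetical class names rather than Flink's actual Pool API:

```java
import java.util.ArrayDeque;

// Hypothetical sketch of the pooled-batch pattern behind createReaderBatch
// (plain Java, not Flink's Pool class): each batch holds a recycler callback
// that returns it to the pool when the consumer releases it.
public class BatchPoolSketch {
    interface Recycler<T> { void recycle(T batch); }

    static final class ReaderBatch {
        final Recycler<ReaderBatch> recycler;
        ReaderBatch(Recycler<ReaderBatch> recycler) { this.recycler = recycler; }
        // Called by the consumer when iteration over the batch is finished.
        void release() { recycler.recycle(this); }
    }

    static final class Pool {
        private final ArrayDeque<ReaderBatch> idle = new ArrayDeque<>();
        Pool(int size) {
            // Every batch's recycler simply re-enqueues it on this pool.
            for (int i = 0; i < size; i++) idle.add(new ReaderBatch(idle::add));
        }
        ReaderBatch pollBatch() { return idle.poll(); } // null while all batches are in flight
        int idleCount() { return idle.size(); }
    }

    public static void main(String[] args) {
        Pool pool = new Pool(2);
        ReaderBatch a = pool.pollBatch();
        ReaderBatch b = pool.pollBatch();
        System.out.println(pool.pollBatch() == null); // true: pool exhausted
        a.release();
        System.out.println(pool.idleCount()); // 1: batch returned to the pool
        b.release();
    }
}
```

Bounding the pool this way gives natural backpressure: when consumers hold all circulating batches, the reader blocks (or, in this sketch, receives null) instead of allocating more memory.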