Class AbstractColumnReader<VECTOR extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
java.lang.Object
org.apache.flink.formats.parquet.vector.reader.AbstractColumnReader<VECTOR>
- All Implemented Interfaces:
ColumnReader<VECTOR>
- Direct Known Subclasses:
BooleanColumnReader, ByteColumnReader, BytesColumnReader, DoubleColumnReader, FixedLenBytesColumnReader, FloatColumnReader, IntColumnReader, LongColumnReader, ShortColumnReader, TimestampColumnReader
public abstract class AbstractColumnReader<VECTOR extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
extends Object
implements ColumnReader<VECTOR>
Abstract ColumnReader. See ColumnReaderImpl; parts of the code are adapted from Apache Spark and Apache Parquet.
-
Field Summary
Fields (modifier and type, field, description):
protected final org.apache.parquet.column.ColumnDescriptor descriptor
protected final org.apache.parquet.column.Dictionary dictionary: The dictionary, if this column has dictionary encoding.
protected final int maxDefLevel: Maximum definition level for this column.
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder runLenDecoder: Run length decoder for data and dictionary.
-
Constructor Summary
Constructors (constructor, description):
AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader)
-
Method Summary
Methods (modifier and type, method, description):
protected void afterReadPage(): After a page has been read, some initialization may be needed.
protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)
protected abstract void readBatch: Reads a batch from runLenDecoder and dataInputStream.
protected abstract void readBatchFromDictionaryIds(int rowId, int num, VECTOR column, org.apache.flink.table.data.columnar.vector.writable.WritableIntVector dictionaryIds): Decodes dictionary ids to data.
final void readToVector(int readNumber, VECTOR vector): Reads `readNumber` values from this column reader into `vector`.
protected boolean supportLazyDecode(): Whether lazy decoding of dictionary ids is supported.
-
Field Details
-
dictionary
protected final org.apache.parquet.column.Dictionary dictionary
The dictionary, if this column has dictionary encoding.
-
maxDefLevel
protected final int maxDefLevel
Maximum definition level for this column.
-
descriptor
protected final org.apache.parquet.column.ColumnDescriptor descriptor -
runLenDecoder
protected org.apache.flink.formats.parquet.vector.reader.RunLengthDecoder runLenDecoder
Run length decoder for data and dictionary.
-
-
Constructor Details
-
AbstractColumnReader
public AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) throws IOException
Throws:
IOException
-
-
Method Details
-
checkTypeName
protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName) -
readToVector
public final void readToVector(int readNumber, VECTOR vector) throws IOException
Reads `readNumber` values from this column reader into `vector`.
Specified by:
readToVector in interface ColumnReader<VECTOR extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
Parameters:
readNumber - number of values to read.
vector - vector to write to.
Throws:
IOException
-
afterReadPage
protected void afterReadPage()
After a page has been read, some initialization may be needed.
-
supportLazyDecode
protected boolean supportLazyDecode()
Whether lazy decoding of dictionary ids is supported. See ParquetDictionary for details. If this returns false, all the data is decoded first.
-
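The difference between lazy and eager dictionary decoding described for supportLazyDecode can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's implementation: LazyDecodeSketch, SketchIntVector and the read helper are hypothetical names, and plain int arrays stand in for the real vector and dictionary types.

```java
/** Hypothetical, simplified illustration; these are not Flink's actual classes. */
final class LazyDecodeSketch {

    /** Stand-in for a writable int vector that may carry a dictionary. */
    static final class SketchIntVector {
        final int[] values;
        int[] dictionaryIds; // set only when decoding is deferred
        int[] dictionary;    // set only when decoding is deferred

        SketchIntVector(int size) {
            this.values = new int[size];
        }

        /** Decodes on access if a dictionary is attached, else reads the stored value. */
        int getInt(int i) {
            return dictionary != null ? dictionary[dictionaryIds[i]] : values[i];
        }
    }

    /** Fills a vector from dictionary ids, either deferring or performing the decode. */
    static SketchIntVector read(int[] ids, int[] dictionary, boolean supportLazyDecode) {
        SketchIntVector vector = new SketchIntVector(ids.length);
        if (supportLazyDecode) {
            // Lazy: keep the ids and the dictionary; values are resolved in getInt.
            vector.dictionaryIds = ids.clone();
            vector.dictionary = dictionary;
        } else {
            // Eager: decode every id into the value array up front.
            for (int i = 0; i < ids.length; i++) {
                vector.values[i] = dictionary[ids[i]];
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        int[] dictionary = {100, 200, 300};
        int[] ids = {2, 0, 1, 2};
        SketchIntVector lazy = read(ids, dictionary, true);
        SketchIntVector eager = read(ids, dictionary, false);
        for (int i = 0; i < ids.length; i++) {
            System.out.println(lazy.getInt(i) + " " + eager.getInt(i));
        }
    }
}
```

Both modes return the same values to a caller of getInt; the lazy path only postpones the dictionary lookup until a value is actually read.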
readBatch
Reads a batch from runLenDecoder and dataInputStream.
-
readBatchFromDictionaryIds
protected abstract void readBatchFromDictionaryIds(int rowId, int num, VECTOR column, org.apache.flink.table.data.columnar.vector.writable.WritableIntVector dictionaryIds)
Decodes dictionary ids to data, reading from runLenDecoder and dictionaryIdsDecoder.
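As a rough illustration of what decoding dictionary ids to data involves, the sketch below replaces each id in the range [rowId, rowId + num) with its dictionary value and skips null rows. It is an assumption-laden stand-in rather than a real subclass: DictionaryIdsSketch and SketchIntDictionary are hypothetical, and plain arrays take the place of VECTOR and WritableIntVector. The decodeToInt name mirrors the method of the same name on org.apache.parquet.column.Dictionary.

```java
import java.util.Arrays;

/** Hypothetical, simplified illustration; these are not Flink's actual classes. */
final class DictionaryIdsSketch {

    /** Stand-in for a dictionary of int values. */
    static final class SketchIntDictionary {
        private final int[] values;

        SketchIntDictionary(int[] values) {
            this.values = values;
        }

        int decodeToInt(int id) {
            return values[id];
        }
    }

    /**
     * Writes the decoded value for every non-null row in [rowId, rowId + num)
     * into column; rows outside the range and null rows are left untouched.
     */
    static void readBatchFromDictionaryIds(
            int rowId,
            int num,
            int[] column,
            boolean[] isNull,
            int[] dictionaryIds,
            SketchIntDictionary dictionary) {
        for (int i = rowId; i < rowId + num; i++) {
            if (!isNull[i]) {
                column[i] = dictionary.decodeToInt(dictionaryIds[i]);
            }
        }
    }

    public static void main(String[] args) {
        SketchIntDictionary dictionary = new SketchIntDictionary(new int[] {10, 20, 30});
        int[] dictionaryIds = {1, 0, 2, 1};
        boolean[] isNull = {false, true, false, false};
        int[] column = new int[4];
        readBatchFromDictionaryIds(0, 4, column, isNull, dictionaryIds, dictionary);
        System.out.println(Arrays.toString(column));
    }
}
```

Passing rowId and num separately lets a caller decode only the slice of the vector that a given page filled, which is why the loop does not simply walk the whole array.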
-