Class AvroParquetReaders
java.lang.Object
org.apache.flink.formats.parquet.avro.AvroParquetReaders
A convenience builder to create AvroParquetRecordFormat instances for the different kinds of Avro record types.
Method Summary
Modifier and Type: static org.apache.flink.connector.file.src.reader.StreamFormat<org.apache.avro.generic.GenericRecord>
Method: forGenericRecord(org.apache.avro.Schema schema)
Description: Creates a new AvroParquetRecordFormat that reads the parquet file into Avro GenericRecords.

Modifier and Type: static <T> org.apache.flink.connector.file.src.reader.StreamFormat<T>
Method: forReflectRecord(Class<T> typeClass)
Description: Creates a new AvroParquetRecordFormat that reads the parquet file into Avro records via reflection.

Modifier and Type: static <T extends org.apache.avro.specific.SpecificRecordBase> org.apache.flink.connector.file.src.reader.StreamFormat<T>
Method: forSpecificRecord(Class<T> typeClass)
Description: Creates a new AvroParquetRecordFormat that reads the parquet file into Avro SpecificRecords.
Method Details
forSpecificRecord
public static <T extends org.apache.avro.specific.SpecificRecordBase> org.apache.flink.connector.file.src.reader.StreamFormat<T> forSpecificRecord(Class<T> typeClass)
Creates a new AvroParquetRecordFormat that reads the parquet file into Avro SpecificRecords.
To read into Avro GenericRecords, use the forGenericRecord(Schema) method.
forReflectRecord
public static <T> org.apache.flink.connector.file.src.reader.StreamFormat<T> forReflectRecord(Class<T> typeClass)
Creates a new AvroParquetRecordFormat that reads the parquet file into Avro records via reflection.
To read into Avro GenericRecords, use the forGenericRecord(Schema) method.
To read into Avro SpecificRecords, use the forSpecificRecord(Class) method.
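As a rough illustration of the reflection-based variant, the sketch below builds a FileSource from a plain Java class. The Datum class, its fields, and the input directory are hypothetical names introduced only for this example, and the code assumes the Flink file connector and the Parquet/Avro format modules are on the classpath.

```java
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;

public class ReflectRecordSourceSketch {

    // Hypothetical POJO; its fields are assumed to mirror the records in the Parquet files.
    public static class Datum {
        public String name;
        public int age;

        public Datum() {}
    }

    // Builds a bounded file source that reads Parquet files into Datum objects via Avro reflection.
    static FileSource<Datum> buildSource(String directory) {
        return FileSource.forRecordStreamFormat(
                        AvroParquetReaders.forReflectRecord(Datum.class), new Path(directory))
                .build();
    }

    public static void main(String[] args) {
        // The directory is a placeholder; nothing is read until the source runs inside a job.
        FileSource<Datum> source = buildSource("/tmp/parquet-input");
        System.out.println("Created source: " + source);
    }
}
```

The resulting source would typically be passed to StreamExecutionEnvironment.fromSource together with a watermark strategy and a source name.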
forGenericRecord
public static org.apache.flink.connector.file.src.reader.StreamFormat<org.apache.avro.generic.GenericRecord> forGenericRecord(org.apache.avro.Schema schema)
Creates a new AvroParquetRecordFormat that reads the parquet file into Avro GenericRecords.
To read into GenericRecords, this method needs an Avro Schema. Flink must be able to serialize the results in its data flow, which is very inefficient without the schema. Although the schema is stored in the Avro file header, Flink needs it during 'pre-flight' time, when the data flow is set up and wired, which is before there is access to the files.
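Because the schema has to be available before any file is opened, one way to supply it is to parse it from a JSON definition when the job is assembled. The following is a minimal sketch under that assumption: the User schema, its fields, and the input directory are hypothetical, and the Flink file connector, the Parquet/Avro format module, and Avro are assumed to be on the classpath.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;

public class GenericRecordSourceSketch {

    // Hypothetical schema for illustration; it must describe the records stored in the files.
    static final String USER_SCHEMA_JSON =
            "{\"type\":\"record\",\"name\":\"User\",\"namespace\":\"example\","
                    + "\"fields\":["
                    + "{\"name\":\"name\",\"type\":\"string\"},"
                    + "{\"name\":\"age\",\"type\":\"int\"}]}";

    // Parses the schema up front and hands it to the format, so no file access is needed here.
    static FileSource<GenericRecord> buildSource(String directory) {
        Schema schema = new Schema.Parser().parse(USER_SCHEMA_JSON);
        return FileSource.forRecordStreamFormat(
                        AvroParquetReaders.forGenericRecord(schema), new Path(directory))
                .build();
    }

    public static void main(String[] args) {
        // The directory is a placeholder; nothing is read until the source runs inside a job.
        FileSource<GenericRecord> source = buildSource("/tmp/parquet-input");
        System.out.println("Created source: " + source);
    }
}
```

Parsing the schema in the job's own code, rather than reading it from a data file, matches the constraint described above: the schema is known while the data flow is being wired.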