Class AvroParquetReaders

java.lang.Object
org.apache.flink.formats.parquet.avro.AvroParquetReaders

@Experimental public class AvroParquetReaders extends Object
A convenience builder to create AvroParquetRecordFormat instances for the different kinds of Avro record types.
  • Method Summary

    Modifier and Type
    Method
    Description
    static org.apache.flink.connector.file.src.reader.StreamFormat<org.apache.avro.generic.GenericRecord>
    forGenericRecord(org.apache.avro.Schema schema)
    Creates a new AvroParquetRecordFormat that reads the parquet file into Avro GenericRecords.
    static <T> org.apache.flink.connector.file.src.reader.StreamFormat<T>
    forReflectRecord(Class<T> typeClass)
    Creates a new AvroParquetRecordFormat that reads the parquet file into Avro records via reflection.
    static <T extends org.apache.avro.specific.SpecificRecordBase>
    org.apache.flink.connector.file.src.reader.StreamFormat<T>
    forSpecificRecord(Class<T> typeClass)
    Creates a new AvroParquetRecordFormat that reads the parquet file into Avro SpecificRecords.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • forSpecificRecord

      public static <T extends org.apache.avro.specific.SpecificRecordBase> org.apache.flink.connector.file.src.reader.StreamFormat<T> forSpecificRecord(Class<T> typeClass)
      Creates a new AvroParquetRecordFormat that reads the parquet file into Avro SpecificRecords.

      To read into Avro GenericRecords, use the forGenericRecord(Schema) method.
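
      As a minimal sketch of how this method might be used, the helper below wraps the returned StreamFormat in a FileSource. The class name SpecificRecordSourceExample, the method buildSource, and the directory argument are hypothetical names introduced for illustration. The record class passed in is assumed to be generated by the Avro compiler.

```java
import org.apache.avro.specific.SpecificRecordBase;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.StreamFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;

public class SpecificRecordSourceExample {

    /**
     * Hypothetical helper: builds a FileSource that reads Parquet files under
     * the given directory into instances of an Avro-generated record class.
     */
    static <T extends SpecificRecordBase> FileSource<T> buildSource(
            Class<T> recordClass, String directory) {
        // The format takes its schema from the generated class itself.
        StreamFormat<T> format = AvroParquetReaders.forSpecificRecord(recordClass);
        return FileSource.forRecordStreamFormat(format, new Path(directory)).build();
    }
}
```

      Assuming a generated class named User (hypothetical), a call would look like buildSource(User.class, "/tmp/parquet-input").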

    • forReflectRecord

      public static <T> org.apache.flink.connector.file.src.reader.StreamFormat<T> forReflectRecord(Class<T> typeClass)
      Creates a new AvroParquetRecordFormat that reads the parquet file into Avro records via reflection.

      To read into Avro GenericRecords, use the forGenericRecord(Schema) method.

      To read into Avro SpecificRecords, use the forSpecificRecord(Class) method.
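
      The following sketch illustrates one way to use reflection-based reading with a plain Java class. SensorReading, its fields, ReflectRecordSourceExample, and the input directory are hypothetical and chosen only for illustration. The Parquet files are assumed to contain columns that match the fields of the class.

```java
import java.io.Serializable;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.StreamFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;

public class ReflectRecordSourceExample {

    /** Hypothetical POJO; Avro maps its fields to Parquet columns via reflection. */
    public static class SensorReading implements Serializable {
        public String sensorId;
        public long timestamp;
        public double value;

        // A public no-argument constructor keeps the class a valid POJO.
        public SensorReading() {}
    }

    /** Builds a FileSource that reads Parquet files into SensorReading objects. */
    static FileSource<SensorReading> buildSource(String directory) {
        StreamFormat<SensorReading> format =
                AvroParquetReaders.forReflectRecord(SensorReading.class);
        return FileSource.forRecordStreamFormat(format, new Path(directory)).build();
    }

    public static void main(String[] args) {
        // The directory is an assumption; no files are opened until the job runs.
        FileSource<SensorReading> source = buildSource("/tmp/parquet-input");
        System.out.println(source.getProducedType());
    }
}
```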

    • forGenericRecord

      public static org.apache.flink.connector.file.src.reader.StreamFormat<org.apache.avro.generic.GenericRecord> forGenericRecord(org.apache.avro.Schema schema)
      Creates a new AvroParquetRecordFormat that reads the parquet file into Avro GenericRecords.

      To read into GenericRecords, this method needs an Avro Schema. Flink must be able to serialize the records in its data flow, and doing so without the schema is very inefficient. Although the schema is also stored in the file itself, Flink needs it at 'pre-flight' time, when the data flow is set up and wired, which is before the files can be accessed.
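
      A minimal sketch of supplying the schema up front is shown below. The schema JSON, the class name GenericRecordSourceExample, the method buildSource, and the input directory are hypothetical. The point is only that the Schema is parsed and handed to the format before any file is opened.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.StreamFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;

public class GenericRecordSourceExample {

    // Hypothetical schema, used only for illustration.
    static final String USER_SCHEMA_JSON =
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
                    + "{\"name\":\"name\",\"type\":\"string\"},"
                    + "{\"name\":\"favoriteNumber\",\"type\":[\"null\",\"int\"],\"default\":null}]}";

    /** Builds a FileSource that reads Parquet files into Avro GenericRecords. */
    static FileSource<GenericRecord> buildSource(String directory) {
        // The schema is available here, at job construction ('pre-flight') time.
        Schema schema = new Schema.Parser().parse(USER_SCHEMA_JSON);
        StreamFormat<GenericRecord> format = AvroParquetReaders.forGenericRecord(schema);
        return FileSource.forRecordStreamFormat(format, new Path(directory)).build();
    }

    public static void main(String[] args) {
        // The directory is an assumption; no files are opened until the job runs.
        FileSource<GenericRecord> source = buildSource("/tmp/parquet-input");
        System.out.println(source.getProducedType());
    }
}
```

      The resulting source can then be passed to StreamExecutionEnvironment.fromSource together with a watermark strategy and a source name.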