Class OrcBulkWriterFactory<T>

java.lang.Object
org.apache.flink.orc.writer.OrcBulkWriterFactory<T>
Type Parameters:
T - The type of element to write.
All Implemented Interfaces:
Serializable, org.apache.flink.api.common.serialization.BulkWriter.Factory<T>

@PublicEvolving public class OrcBulkWriterFactory<T> extends Object implements org.apache.flink.api.common.serialization.BulkWriter.Factory<T>
A factory that creates an ORC BulkWriter. The factory takes a user-supplied Vectorizer implementation to convert each element into a VectorizedRowBatch.
  • Constructor Summary

    Constructors
    Constructor
    Description
    OrcBulkWriterFactory(Vectorizer<T> vectorizer)
    Creates a new OrcBulkWriterFactory using the provided Vectorizer implementation.
    OrcBulkWriterFactory(Vectorizer<T> vectorizer, Properties writerProperties, org.apache.hadoop.conf.Configuration configuration)
    Creates a new OrcBulkWriterFactory using the provided Vectorizer, ORC writer properties, and Hadoop Configuration.
    OrcBulkWriterFactory(Vectorizer<T> vectorizer, org.apache.hadoop.conf.Configuration configuration)
    Creates a new OrcBulkWriterFactory using the provided Vectorizer and Hadoop Configuration.
  • Method Summary

    Modifier and Type
    Method
    Description
    org.apache.flink.api.common.serialization.BulkWriter<T>
    create(org.apache.flink.core.fs.FSDataOutputStream out)
     
    protected org.apache.orc.OrcFile.WriterOptions
    getWriterOptions()
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • OrcBulkWriterFactory

      public OrcBulkWriterFactory(Vectorizer<T> vectorizer)
      Creates a new OrcBulkWriterFactory using the provided Vectorizer implementation.
      Parameters:
      vectorizer - The vectorizer implementation that converts each input record into a VectorizedRowBatch.
    • OrcBulkWriterFactory

      public OrcBulkWriterFactory(Vectorizer<T> vectorizer, org.apache.hadoop.conf.Configuration configuration)
      Creates a new OrcBulkWriterFactory using the provided Vectorizer and Hadoop Configuration.
      Parameters:
      vectorizer - The vectorizer implementation that converts each input record into a VectorizedRowBatch.
      configuration - The Hadoop Configuration to use for the ORC writer.
    • OrcBulkWriterFactory

      public OrcBulkWriterFactory(Vectorizer<T> vectorizer, Properties writerProperties, org.apache.hadoop.conf.Configuration configuration)
      Creates a new OrcBulkWriterFactory using the provided Vectorizer, ORC writer properties, and Hadoop Configuration.
      Parameters:
      vectorizer - The vectorizer implementation that converts each input record into a VectorizedRowBatch.
      writerProperties - Properties that can be used in ORC WriterOptions.
      configuration - The Hadoop Configuration to use for the ORC writer.
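
      As a minimal sketch of how the writerProperties argument might be prepared, the snippet below builds a java.util.Properties object with two ORC settings. The keys orc.compress and orc.stripe.size are ORC configuration names. The values are only illustrative, and the class name OrcWriterPropertiesSketch is hypothetical. The factory construction is shown as a comment because it needs the Flink ORC and Hadoop dependencies plus a user-supplied Vectorizer, which is assumed here.

```java
import java.util.Properties;

// Hypothetical helper class for illustration; it is not part of Flink.
final class OrcWriterPropertiesSketch {

    // Builds the Properties object that would be passed as writerProperties.
    static Properties buildWriterProperties() {
        Properties writerProperties = new Properties();
        // ORC compression codec; LZ4 is an illustrative choice.
        writerProperties.setProperty("orc.compress", "LZ4");
        // ORC stripe size in bytes (64 MiB here), also illustrative.
        writerProperties.setProperty("orc.stripe.size", String.valueOf(64L * 1024 * 1024));
        return writerProperties;
    }

    public static void main(String[] args) {
        Properties writerProperties = buildWriterProperties();
        // Print the configured settings for inspection.
        writerProperties.forEach((key, value) -> System.out.println(key + "=" + value));

        // With the Flink ORC and Hadoop dependencies on the classpath, and an
        // assumed user-supplied Vectorizer<T> instance named vectorizer, the
        // three-argument constructor documented above would be called like this:
        //
        // OrcBulkWriterFactory<T> factory = new OrcBulkWriterFactory<>(
        //         vectorizer, writerProperties, new org.apache.hadoop.conf.Configuration());
    }
}
```

      Keeping the ORC settings in a Properties object leaves the Hadoop Configuration free for file system and cluster settings. The two-argument constructor can be used when no ORC-specific overrides are needed.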
  • Method Details

    • create

      public org.apache.flink.api.common.serialization.BulkWriter<T> create(org.apache.flink.core.fs.FSDataOutputStream out) throws IOException
      Specified by:
      create in interface org.apache.flink.api.common.serialization.BulkWriter.Factory<T>
      Throws:
      IOException
    • getWriterOptions

      @VisibleForTesting protected org.apache.orc.OrcFile.WriterOptions getWriterOptions()