Package org.apache.flink.orc.vector
Class Vectorizer<T>
java.lang.Object
org.apache.flink.orc.vector.Vectorizer<T>
- Type Parameters:
T - The type of the element
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
RowDataVectorizer
This class provides a set of methods that abstract the lifecycle handling of a
VectorizedRowBatch.
Users must extend this class and override the vectorize() method with the logic that
transforms an element into ColumnVectors and sets them in the VectorizedRowBatch.
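The extend-and-override pattern can be sketched as follows. This is a minimal illustration, not the library's code: so that it compiles without Flink, ORC, or Hive on the classpath, it declares small stand-in classes that model only the members used here. `Person`, `PersonVectorizer`, the schema string, and the constructor that takes a schema string are assumptions made for the example; this page does not show the constructor signature.

```java
import java.nio.charset.StandardCharsets;

// Stand-ins (assumptions) for the Hive/Flink classes, reduced to the members used below.
// With the real libraries on the classpath, delete them and import the real classes.
class ColumnVector {}

class LongColumnVector extends ColumnVector {
    final long[] vector = new long[1024];
}

class BytesColumnVector extends ColumnVector {
    final byte[][] vector = new byte[1024][];

    // Copies the value into the given row.
    void setVal(int row, byte[] value) {
        vector[row] = value.clone();
    }
}

class VectorizedRowBatch {
    final ColumnVector[] cols;
    int size; // number of rows currently held by the batch

    VectorizedRowBatch(ColumnVector... cols) {
        this.cols = cols;
    }
}

abstract class Vectorizer<T> {
    private final String schema;

    // Assumption: the constructor takes the ORC schema as a string.
    Vectorizer(String schema) {
        this.schema = schema;
    }

    abstract void vectorize(T element, VectorizedRowBatch batch) throws java.io.IOException;
}

// Hypothetical element type used only for this sketch.
class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Hypothetical vectorizer: writes one Person into the next free row of the batch.
class PersonVectorizer extends Vectorizer<Person> {
    PersonVectorizer(String schema) {
        super(schema);
    }

    @Override
    void vectorize(Person element, VectorizedRowBatch batch) {
        BytesColumnVector nameCol = (BytesColumnVector) batch.cols[0];
        LongColumnVector ageCol = (LongColumnVector) batch.cols[1];
        int row = batch.size++; // claim the next free row in the batch
        nameCol.setVal(row, element.name.getBytes(StandardCharsets.UTF_8));
        ageCol.vector[row] = element.age;
    }
}

class PersonVectorizerDemo {
    public static void main(String[] args) {
        VectorizedRowBatch batch =
                new VectorizedRowBatch(new BytesColumnVector(), new LongColumnVector());
        new PersonVectorizer("struct<name:string,age:int>")
                .vectorize(new Person("Ada", 36), batch);
        System.out.println("rows in batch: " + batch.size);
    }
}
```

The column order in `batch.cols` is assumed to follow the field order of the schema, so the casts in `vectorize()` must match the schema's column types.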
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description:
- void addUserMetadata(String key, ByteBuffer value): Adds arbitrary user metadata to the outgoing ORC file.
- org.apache.orc.TypeDescription getSchema(): Provides the ORC schema.
- void setWriter(org.apache.orc.Writer writer): Users are not supposed to use this method since this is intended to be used only by the OrcBulkWriter.
- abstract void vectorize(T element, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch): Transforms the provided element to ColumnVectors and sets them in the exposed VectorizedRowBatch.
-
Constructor Details
-
Vectorizer
-
-
Method Details
-
getSchema
public org.apache.orc.TypeDescription getSchema()
Provides the ORC schema.
- Returns:
the ORC schema
-
setWriter
public void setWriter(org.apache.orc.Writer writer)
Users are not supposed to use this method since this is intended to be used only by the OrcBulkWriter.
- Parameters:
writer - the underlying ORC Writer.
-
addUserMetadata
public void addUserMetadata(String key, ByteBuffer value)
Adds arbitrary user metadata to the outgoing ORC file.
Users who want to add new metadata dynamically, based either on the input or on an external system, can do so by calling addUserMetadata(...) inside the overridden vectorize() method.
- Parameters:
key - a key to label the data with.
value - the contents of the metadata.
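Calling addUserMetadata(...) from inside an overridden vectorize() might look like the sketch below. It is an illustration under stated assumptions, not the library's implementation: `Writer`, `VectorizedRowBatch`, and `Vectorizer` are simplified stand-ins declared locally so the block compiles on its own, and the stand-in `addUserMetadata` is assumed to forward to the writer supplied through `setWriter`. `SourceTaggingVectorizer`, `RecordingWriter`, and the key `last.element` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in (assumption) for org.apache.orc.Writer, reduced to the one call used here.
interface Writer {
    void addUserMetadata(String key, ByteBuffer value);
}

// Stand-in (assumption) for VectorizedRowBatch; only the row count is modelled.
class VectorizedRowBatch {
    int size;
}

// Stand-in (assumption) for Vectorizer: addUserMetadata is assumed to forward
// to the writer that was set via setWriter.
abstract class Vectorizer<T> {
    private Writer writer;

    void setWriter(Writer writer) {
        this.writer = writer;
    }

    void addUserMetadata(String key, ByteBuffer value) {
        writer.addUserMetadata(key, value);
    }

    abstract void vectorize(T element, VectorizedRowBatch batch);
}

// Hypothetical vectorizer that records the most recent element as file metadata.
class SourceTaggingVectorizer extends Vectorizer<String> {
    @Override
    void vectorize(String element, VectorizedRowBatch batch) {
        batch.size++; // column population omitted for brevity
        addUserMetadata(
                "last.element",
                ByteBuffer.wrap(element.getBytes(StandardCharsets.UTF_8)));
    }
}

// In-memory writer used only to make the effect of the sketch observable.
class RecordingWriter implements Writer {
    final Map<String, String> metadata = new LinkedHashMap<>();

    @Override
    public void addUserMetadata(String key, ByteBuffer value) {
        metadata.put(key, StandardCharsets.UTF_8.decode(value.duplicate()).toString());
    }
}
```

In this sketch a later call with the same key replaces the earlier value, which is a property of the in-memory `RecordingWriter` only. With the real classes, the writer is supplied by the OrcBulkWriter through setWriter, so user code only calls addUserMetadata(...).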
-
vectorize
public abstract void vectorize(T element, org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch batch) throws IOException
Transforms the provided element to ColumnVectors and sets them in the exposed VectorizedRowBatch.
- Parameters:
element - The input element
batch - The batch to write the ColumnVectors
- Throws:
IOException - if there is an error while transforming the input.
-