All Classes and Interfaces
Class
Description
The base class for File Sources.
AbstractFileSource.AbstractFileSourceBuilder<T,SplitT extends FileSourceSplit,SELF extends AbstractFileSource.AbstractFileSourceBuilder<T,SplitT,SELF>>
The generic base builder.
Operator for file system sink.
A simple BulkFormat.RecordIterator that returns the elements of an array, one after the other.
Coordinator for compaction in batch mode.
CompactOperator for compaction in batch mode.
An operator for writing files in batch mode.
Committer operator for partition in batch mode.
Helper for creating batch file sink.
A bin packing implementation.
This FileEnumerator enumerates all files under the given paths recursively except the hidden directories, and creates a separate split for each file block.
This FileEnumerator enumerates all files under the given paths recursively, and creates a separate split for each file block.
The BulkFormat reads and decodes batches of records at a time.
The actual reader that reads the batches of records.
An iterator over records with their position in the file.
Base interface for configuring a BulkFormat for file system connector.
Base interface for configuring a BulkWriter.Factory for file system connector.
The position of a reader, to be stored in a checkpoint.
A BulkFormat.RecordIterator that returns RowDatas.
The CompactWriter to delegate BucketWriter.
The CompactReader to delegate CompactBulkReader.
Context for CompactReader and CompactWriter.
Implementation of CompactContext.
Coordinator that coordinates file compaction for the FileSink.
This is the single (non-parallel) monitoring task which coordinates input files to compaction units. - Receives in-flight input files inside checkpoint. - Receives all upstream end input messages after the checkpoint completes successfully, starts coordination.
Factory for CompactCoordinator.
Handler that processes the state of CompactCoordinator when compaction is disabled.
Factory for CompactCoordinatorStateHandler.
Utils for compacting files.
Writer for emitting CompactMessages.InputFile and CompactMessages.EndCheckpoint to downstream.
Util class for all compaction messages.
The unit of a single compaction.
The output of BatchCompactOperator.
The input of compact coordinator.
The output of compact coordinator.
A flag to end checkpoint, coordinator can start coordinating one checkpoint.
A flag to end compaction.
A partitioned input file.
Receives compaction units to do compaction.
An operator that performs compaction for the FileSink.
Factory for CompactorOperator.
Handler that processes the state of CompactorOperator when compaction is disabled.
Factory for CompactorOperatorStateHandler.
Request of file compacting for FileSink.
Versioned serializer for CompactorRequest.
Type information of CompactorRequest.
Reader for compaction.
Factory to create CompactReader.
The asynchronous file compaction service.
Writer for compaction.
Factory to create CompactWriter.
An OutputStreamBasedFileCompactor implementation that simply concatenates the compacting files.
Settings describing how to do continuous file discovery and enumeration for the file source's continuous discovery and streaming mode.
A continuously monitoring enumerator.
Partition fetcher for helping continuously fetch partitioned table.
Context for fetching partitions; partition information is stored in the Hive metastore.
A RecordWiseFileCompactor.Reader implementation that reads the file as an FSDataInputStream and decodes the record with the DecoderBasedReader.Decoder.
A DecoderBasedReader.Decoder to decode the file content into the actual records.
Factory to create DecoderBasedReader.Decoder.
Factory for DecoderBasedReader.
A file filter that filters out hidden files based on common naming patterns, i.e., files where the filename starts with '.' or with '_'.
A factory returning writer.
Default PartitionTimeExtractor.
Adapter to turn a DeserializationSchema into a BulkFormat.
FileEnumerator that supports dynamic filtering.
Factory for the DynamicFileEnumerator.
A SplitEnumerator implementation that supports dynamic filtering.
Dynamic partition writer that writes multiple partitions at the same time; it may consume more memory.
Empty implementation of TableMetaStoreFactory.
An implementation of RowData which is backed by two RowData with a well-defined index mapping. One of the rows is fixed, while the other can be swapped for performant changes in hot code paths.
Committer implementation for FileSink.
The FileCompactor is responsible for compacting files into one file.
Strategy for compacting the files written in FileSink before committing.
Builder for FileCompactStrategy.
The FileEnumerator's task is to discover all files to be read and to split them into a set of FileSourceSplit.
Factory for the FileEnumerator, to allow the FileEnumerator to be eagerly initialized and to not be serializable.
The CompactReader to delegate FileInputFormat.
A collection of records for one file split.
A unified sink that emits its input elements to FileSystem files within buckets.
A builder for configuring the sink for bulk-encoding formats, e.g.
Builder for the vanilla FileSink using a bulk format.
Builder for the vanilla FileSink using a row format.
A builder for configuring the sink for row-wise encoding formats.
Wrapper class for both types of committables in FileSink.
Versioned serializer for FileSinkCommittable.
A unified data source that reads files - both in batch and in streaming mode.
The builder for the FileSource, to configure the various behaviors.
A SourceReader that reads records from FileSourceSplit.
A SourceSplit that represents a file, or a region of a file.
A serializer for the FileSourceSplit.
State of the reader, essentially a mutable version of the FileSourceSplit.
The FileSplitAssigner is responsible for deciding what split should be processed next by which node.
Factory for the FileSplitAssigner, to allow the FileSplitAssigner to be eagerly initialized and to not be serializable.
File system file committer implementation.
Options for the filesystem connector.
Statistics types for file system, see FileSystemConnectorOptions.SOURCE_REPORT_STATISTICS.
Trigger types for partition commit, see FileSystemConnectorOptions.SINK_PARTITION_COMMIT_TRIGGER.
A factory to create file systems.
File system OutputFormat for batch job.
Builder to build FileSystemOutputFormat.
File system TableFactory.
File system DynamicTableSink.
Project row to non-partition fields.
Table bucket assigner, wrap PartitionComputer.
Table RollingPolicy, it extends CheckpointRollingPolicy for bulk writers.
File system table source.
A SinkWriter implementation for FileSink.
A factory able to create FileWriterBucket for the FileSink.
States for FileWriterBucket.
A SimpleVersionedSerializer used to serialize the BucketState.
PartitionWriter for grouped dynamic partition inserting.
A simple OutputStreamBasedFileCompactor implementation that directly copies the content of the only input file to the output.
A RecordWiseFileCompactor.Reader implementation that reads the file using the FileInputFormat.
Factory for InputFormatBasedReader.
A simple BulkFormat.RecordIterator that returns the elements of an iterator, augmented with position information.
A BulkFormat that can limit output record number.
A FileSplitAssigner that assigns to each host preferably splits that are local, before assigning splits that are not local.
Partition commit policy to update metastore.
A mutable version of the RecordAndPosition.
This FileEnumerator enumerates all files under the given paths recursively except the hidden directories.
This FileEnumerator enumerates all files under the given paths recursively.
A factory to create an OutputFormat.
Base class for FileCompactor implementations that write the compacting file by an output stream.
The message sent by upstream.
Policy for committing a partition.
Context of policy, including table information and partition information.
A factory to create PartitionCommitPolicy chain.
Partition commit predicate.
Context that PartitionCommitPredicate can use for getting context about a partition.
Committer operator for partitions.
Partition commit trigger.
Compute partition path from record and project non-partition columns for output writer.
Fetcher to fetch the suitable partitions of a filesystem table.
Context for fetching partitions; partition information is stored in the Hive metastore.
A comparable partition value that can compare order by using its comparator.
Interface to extract partition field from split.
Loader to move temporary files to the final output path and meta store.
Reader that reads records from given partitions.
Manage temporary files for writing files.
Partition commit predicate by partition time and watermark, if 'watermark' > 'partition-time' +
'delay', the partition is committable.
Partition commit trigger by partition time and watermark.
Time extractor to extract time from partition values.
Partition writer to write records with partition.
Context for partition writer, provide some information and generation utils.
Default implementation for PartitionWriterListener.
Listener for partition writer.
Factory of PartitionWriter to avoid virtual function calls.
A checkpoint of the current state, containing the currently pending splits that are not yet assigned.
A serializer for the PendingSplitsCheckpoint.
A pool to cache and recycle heavyweight objects, to reduce object allocation.
A Recycler puts objects into the pool that the recycler is associated with.
Partition commit trigger by creation time and processing time service, if 'current processing
time' > 'partition creation time' + 'delay', the partition is committable.
Partition commit trigger by creation time and processing time service.
A record, together with the reader position to be stored in the checkpoint.
Implementation of BulkFormat.RecordIterator that wraps another iterator and performs the mapping of the records.
Record mapper definition.
A FileCompactor implementation that reads input files with a RecordWiseFileCompactor.Reader and writes with a RecordWiseFileCompactor.Writer.
The reader that reads records from the compacting files.
Factory for RecordWiseFileCompactor.Reader.
The writer that writes records into the compacting files.
Utility base class for iterators that accept a recycler.
A file filter that filters out hidden files, see DefaultFileFilter and the files whose path doesn't match the given regex pattern.
PartitionComputer for RowData.
PartitionComputer for Row.
Adapter to turn a SerializationSchema into an Encoder.
The SimpleSplitAssigner hands out splits in a random order, without any consideration for order or locality.
A simple version of the StreamFormat, for formats that are not splittable.
A sink DecoderBasedReader.Decoder that reads data encoded by the SimpleStringEncoder only for compaction.
PartitionWriter for single directory writer.
A simple BulkFormat.RecordIterator that returns a single value.
A collection of common compression formats and de-compressors.
A SplitEnumerator implementation for bounded / batch FileSource input.
A reader format that reads individual records from a stream.
The actual reader that reads the records.
Adapter to turn a StreamFormat into a BulkFormat.
The reader adapter, from StreamFormat.Reader to BulkFormat.Reader.
Writer for emitting PartitionCommitInfo to downstream.
Helper for creating streaming file sink.
Partition commit policy to add success file to directory.
Meta store factory to create TableMetaStoreFactory.TableMetaStore.
Meta store to manage the location paths of this table and its partitions.
Track the upstream tasks to determine whether all the upstream data of a checkpoint has been
received.
A reader format that reads text lines from a file.
The actual reader for the TextLineInputFormat.
Miscellaneous utilities for the file source.
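
Two of the descriptions above state concrete rules: the hidden-file filter ("the filename starts with '.' or with '_'") and the partition-time commit predicate ("if 'watermark' > 'partition-time' + 'delay', the partition is committable"). The following is a minimal plain-Java sketch of those two rules as worded in the descriptions, not the connector's actual implementation. The class name `ConnectorRuleSketches`, both method names, and the choice of epoch milliseconds for all time values are assumptions made for illustration.

```java
// Illustrative sketch only. The class name, method names, and the use of
// epoch milliseconds are hypothetical; none of this is the connector's API.
final class ConnectorRuleSketches {

    private ConnectorRuleSketches() {}

    /** Hidden-file naming rule: the file name starts with '.' or with '_'. */
    static boolean isHiddenFileName(String fileName) {
        return fileName.startsWith(".") || fileName.startsWith("_");
    }

    /**
     * Partition-time commit rule: a partition is committable once
     * watermark > partition-time + delay (strict inequality).
     * Assumes the sum does not overflow a long.
     */
    static boolean isCommittable(long watermarkMillis, long partitionTimeMillis, long delayMillis) {
        return watermarkMillis > partitionTimeMillis + delayMillis;
    }

    public static void main(String[] args) {
        System.out.println(isHiddenFileName("_SUCCESS"));        // true
        System.out.println(isHiddenFileName("part-0.parquet"));  // false

        long oneHour = 3_600_000L;
        System.out.println(isCommittable(oneHour, 0L, oneHour));      // false
        System.out.println(isCommittable(oneHour + 1, 0L, oneHour));  // true
    }
}
```

Because the comparison is strict, a watermark exactly equal to partition-time plus delay does not yet make the partition committable under this reading of the description.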