Class AbstractFileSource.AbstractFileSourceBuilder<T,SplitT extends FileSourceSplit,SELF extends AbstractFileSource.AbstractFileSourceBuilder<T,SplitT,SELF>>

java.lang.Object
org.apache.flink.connector.file.src.AbstractFileSource.AbstractFileSourceBuilder<T,SplitT,SELF>
Direct Known Subclasses:
FileSource.FileSourceBuilder
Enclosing class:
AbstractFileSource<T,SplitT extends FileSourceSplit>

protected abstract static class AbstractFileSource.AbstractFileSourceBuilder<T,SplitT extends FileSourceSplit,SELF extends AbstractFileSource.AbstractFileSourceBuilder<T,SplitT,SELF>> extends Object
The generic base builder. This builder carries a SELF type to make it convenient to extend this for subclasses, using the following pattern.

 public class SubBuilder<T> extends AbstractFileSourceBuilder<T, SubBuilder<T>> {
     ...
 }
 

That way, all return values from builder method defined here are typed to the sub-class type and support fluent chaining.

We don't make the publicly visible builder generic with a SELF type, because it leads to generic signatures that can look complicated and confusing.

  • Field Details

  • Constructor Details

  • Method Details

    • build

      public abstract AbstractFileSource<T,SplitT> build()
      Creates the file source with the settings applied to this builder.
    • monitorContinuously

      public SELF monitorContinuously(Duration discoveryInterval)
      Sets this source to streaming ("continuous monitoring") mode.

      This makes the source a "continuous streaming" source that keeps running, monitoring for new files, and reads these files when they appear and are discovered by the monitoring.

      The interval in which the source checks for new files is the discoveryInterval. Shorter intervals mean that files are discovered more quickly, but also imply more frequent listing or directory traversal of the file system / object store.

    • processStaticFileSet

      public SELF processStaticFileSet()
      Sets this source to bounded (batch) mode.

      In this mode, the source processes the files that are under the given paths when the application is started. Once all files are processed, the source will finish.

      This setting is also the default behavior. This method is mainly here to "switch back" to bounded (batch) mode, or to make it explicit in the source construction.

    • setFileEnumerator

      public SELF setFileEnumerator(FileEnumerator.Provider fileEnumerator)
      Configures the FileEnumerator for the source. The File Enumerator is responsible for selecting from the input path the set of files that should be processed (and which to filter out). Furthermore, the File Enumerator may split the files further into sub-regions, to enable parallelization beyond the number of files.
    • setSplitAssigner

      public SELF setSplitAssigner(FileSplitAssigner.Provider splitAssigner)
      Configures the FileSplitAssigner for the source. The File Split Assigner determines which parallel reader instance gets which FileSourceSplit, and in which order these splits are assigned.