Interface FileEnumerator

All Known Subinterfaces:
DynamicFileEnumerator
All Known Implementing Classes:
BlockSplittingRecursiveAllDirEnumerator, BlockSplittingRecursiveEnumerator, NonSplittingRecursiveAllDirEnumerator, NonSplittingRecursiveEnumerator

@PublicEvolving public interface FileEnumerator
The FileEnumerator's task is to discover all files to be read and to split them into a set of FileSourceSplit instances.

This may include path traversal, file filtering (by name or other patterns), and deciding whether and how to split files into multiple splits.

  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Interface
    Description
    static interface 
    Factory for the FileEnumerator, to allow the FileEnumerator to be eagerly initialized and to not be serializable.
  • Method Summary

    Modifier and Type
    Method
    Description
    Collection<FileSourceSplit>
    enumerateSplits(org.apache.flink.core.fs.Path[] paths, int minDesiredSplits)
    Generates all file splits for the relevant files under the given paths.
  • Method Details

    • enumerateSplits

      Collection<FileSourceSplit> enumerateSplits(org.apache.flink.core.fs.Path[] paths, int minDesiredSplits) throws IOException
      Generates all file splits for the relevant files under the given paths. The minDesiredSplits parameter is an optional hint indicating how many splits would be needed to exploit the available parallelism properly.
      Throws:
      IOException
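
The following is a minimal, self-contained sketch of the three responsibilities described above (path traversal, file filtering, and splitting guided by the minDesiredSplits hint). It does not use Flink: `Path` here is `java.nio.file.Path` rather than `org.apache.flink.core.fs.Path`, and `EnumeratorSketch`, `Split`, and `accept` are hypothetical names introduced only for illustration. The splitting rule (total bytes divided by the hint) is an assumption of this sketch, not necessarily how the known implementing classes decide split boundaries.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class EnumeratorSketch {

    /** Hypothetical stand-in for FileSourceSplit: a byte range of one file. */
    static final class Split {
        final Path file;
        final long offset;
        final long length;

        Split(Path file, long offset, long length) {
            this.file = file;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Illustrative filter (an assumption): skip files named ".*" or "_*". */
    static boolean accept(Path file) {
        String name = file.getFileName().toString();
        return !name.startsWith(".") && !name.startsWith("_");
    }

    static Collection<Split> enumerateSplits(Path[] paths, int minDesiredSplits)
            throws IOException {
        // Step 1: traverse every root recursively and keep accepted regular files.
        List<Path> files = new ArrayList<>();
        for (Path root : paths) {
            try (Stream<Path> walk = Files.walk(root)) {
                files.addAll(walk.filter(Files::isRegularFile)
                        .filter(EnumeratorSketch::accept)
                        .collect(Collectors.toList()));
            }
        }

        // Step 2: turn the hint into a target split size (at least one byte).
        long totalBytes = 0;
        for (Path file : files) {
            totalBytes += Files.size(file);
        }
        long targetSize = Math.max(1, totalBytes / Math.max(1, minDesiredSplits));

        // Step 3: cut each file into ranges of at most targetSize bytes.
        // Empty files produce no split in this sketch.
        List<Split> splits = new ArrayList<>();
        for (Path file : files) {
            long size = Files.size(file);
            for (long offset = 0; offset < size; offset += targetSize) {
                splits.add(new Split(file, offset, Math.min(targetSize, size - offset)));
            }
        }
        return splits;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Split s : enumerateSplits(new Path[] {root}, 4)) {
            System.out.println(s.file + " offset=" + s.offset + " length=" + s.length);
        }
    }
}
```

Because minDesiredSplits is only a hint, the sketch may return more splits than requested (when files do not divide evenly) or fewer (when the total data is smaller than the hint in bytes). A non-splitting variant would skip step 3 and emit one split per file.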