Interface FileEnumerator
- All Known Subinterfaces:
DynamicFileEnumerator
- All Known Implementing Classes:
BlockSplittingRecursiveAllDirEnumerator, BlockSplittingRecursiveEnumerator, NonSplittingRecursiveAllDirEnumerator, NonSplittingRecursiveEnumerator
@PublicEvolving
public interface FileEnumerator
The FileEnumerator's task is to discover all files to be read and to split them into a set of FileSourceSplit.
This may include path traversal, file filtering (by name or other patterns), and deciding whether and how to split files into multiple splits.
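The three responsibilities named above (traversal, filtering, splitting) can be illustrated with a minimal sketch. It does not use Flink's classes: SimpleSplit, SimpleEnumeratorSketch, the name-prefix filter, and the maxSplitBytes parameter are all hypothetical stand-ins built on java.nio.file, chosen only to show the shape of the work.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for a split: one byte range of one file (not Flink's FileSourceSplit). */
final class SimpleSplit {
    final Path file;
    final long offset;
    final long length;

    SimpleSplit(Path file, long offset, long length) {
        this.file = file;
        this.offset = offset;
        this.length = length;
    }
}

final class SimpleEnumeratorSketch {

    /** Assumed filter: skip files whose names start with '.' or '_'. */
    static boolean accept(Path file) {
        String name = file.getFileName().toString();
        return !name.startsWith(".") && !name.startsWith("_");
    }

    /** Walks each root recursively and cuts every accepted file into ranges of at most maxSplitBytes. */
    static List<SimpleSplit> enumerate(List<Path> roots, long maxSplitBytes) throws IOException {
        if (maxSplitBytes <= 0) {
            throw new IllegalArgumentException("maxSplitBytes must be positive");
        }
        List<SimpleSplit> splits = new ArrayList<>();
        for (Path root : roots) {
            List<Path> files;
            try (Stream<Path> walk = Files.walk(root)) {            // path traversal
                files = walk.filter(Files::isRegularFile)
                        .filter(SimpleEnumeratorSketch::accept)     // file filtering
                        .sorted()
                        .collect(Collectors.toList());
            }
            for (Path file : files) {
                long size = Files.size(file);
                long offset = 0;
                while (offset < size) {                             // splitting; empty files yield no split
                    long length = Math.min(maxSplitBytes, size - offset);
                    splits.add(new SimpleSplit(file, offset, length));
                    offset += length;
                }
            }
        }
        return splits;
    }
}
```

In this sketch the filter looks only at file names, so files inside a directory whose name starts with '.' or '_' are still visited; a real enumerator may prune such directories during traversal.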
Nested Class Summary
Nested Classes
Modifier and Type: static interface
Description: Factory for the FileEnumerator, to allow the FileEnumerator to be eagerly initialized and to not be serializable.
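The reason for a separate factory can be shown with a small sketch: only the factory travels through Java serialization, and the enumerator it creates is free to hold non-serializable state. All names here (HeavyEnumeratorSketch, EnumeratorFactorySketch, SuffixEnumeratorFactory, ProviderRoundTrip) are hypothetical stand-ins, not Flink's interfaces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Stand-in enumerator that holds a non-serializable resource, so it cannot be shipped itself. */
final class HeavyEnumeratorSketch {
    private final Object nonSerializableResource = new Object(); // e.g. a client or a cache
    private final String suffix;

    HeavyEnumeratorSketch(String suffix) {
        this.suffix = suffix;
    }

    boolean accepts(String fileName) {
        return fileName.endsWith(suffix);
    }
}

/** Stand-in for a serializable factory that creates the enumerator where it is needed. */
interface EnumeratorFactorySketch extends Serializable {
    HeavyEnumeratorSketch create();
}

final class SuffixEnumeratorFactory implements EnumeratorFactorySketch {
    private static final long serialVersionUID = 1L;
    private final String suffix; // only this small, serializable configuration is shipped

    SuffixEnumeratorFactory(String suffix) {
        this.suffix = suffix;
    }

    @Override
    public HeavyEnumeratorSketch create() {
        return new HeavyEnumeratorSketch(suffix);
    }
}

final class ProviderRoundTrip {
    /** Serializes and deserializes the factory, as a framework might when shipping it. */
    static EnumeratorFactorySketch roundTrip(EnumeratorFactorySketch factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(factory);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (EnumeratorFactorySketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

Calling ProviderRoundTrip.roundTrip(new SuffixEnumeratorFactory(".csv")).create() yields a working enumerator, whereas trying to serialize HeavyEnumeratorSketch directly would fail because it does not implement Serializable.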
Method Summary
Modifier and Type: Collection<FileSourceSplit>
Method: enumerateSplits(org.apache.flink.core.fs.Path[] paths, int minDesiredSplits)
Description: Generates all file splits for the relevant files under the given paths.
Method Details
enumerateSplits
Collection<FileSourceSplit> enumerateSplits(org.apache.flink.core.fs.Path[] paths, int minDesiredSplits) throws IOException
Generates all file splits for the relevant files under the given paths. The minDesiredSplits is an optional hint indicating how many splits would be necessary to exploit parallelism properly.
- Throws:
IOException
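One way an implementation could take the minDesiredSplits hint into account is sketched below. This is an assumption for illustration, not the logic of any of the listed implementing classes: SplitHintSketch, targetSplitSize, ranges, and the minSplitBytes lower bound are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitHintSketch {

    /**
     * Picks a split size so that totalBytes is cut into at least minDesiredSplits pieces,
     * unless that would push the size below minSplitBytes (the hint is optional, so it may lose).
     */
    static long targetSplitSize(long totalBytes, int minDesiredSplits, long minSplitBytes) {
        long floor = Math.max(1, minSplitBytes);
        if (minDesiredSplits <= 1 || totalBytes <= 0) {
            return Math.max(floor, totalBytes); // no useful hint: one split covers everything
        }
        long perSplit = totalBytes / minDesiredSplits; // round down to reach at least the hinted count
        return Math.max(floor, perSplit);
    }

    /** Returns {offset, length} pairs covering [0, fileSize) in steps of splitSize. */
    static List<long[]> ranges(long fileSize, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> out = new ArrayList<>();
        long offset = 0;
        while (offset < fileSize) {
            long length = Math.min(splitSize, fileSize - offset);
            out.add(new long[] {offset, length});
            offset += length;
        }
        return out;
    }
}
```

For 1000 bytes and a hint of 4, targetSplitSize(1000, 4, 1) gives 250 and ranges(1000, 250) produces four splits. With a 512-byte minimum the size becomes 512 and only two splits result, which is consistent with the hint being optional.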