Class NonSplittingRecursiveEnumerator
java.lang.Object
org.apache.flink.connector.file.src.enumerate.NonSplittingRecursiveEnumerator
- All Implemented Interfaces:
FileEnumerator
- Direct Known Subclasses:
BlockSplittingRecursiveEnumerator, NonSplittingRecursiveAllDirEnumerator
@PublicEvolving
public class NonSplittingRecursiveEnumerator
extends Object
implements FileEnumerator
This FileEnumerator enumerates all files under the given paths recursively. Each file becomes one split; this enumerator does not split files into smaller "block" units.
The default instantiation of this enumerator filters out files whose names start with one of the common hidden-file prefixes, '.' and '_'. A custom file filter can be specified through the constructor that takes a predicate.
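The behavior described above can be sketched without any Flink dependency. The following is a minimal illustration, not Flink's implementation: `RecursiveEnumerationSketch`, `WholeFileSplit`, and `isNotHidden` are hypothetical names, `java.nio.file.Path` stands in for `org.apache.flink.core.fs.Path`, and skipping rejected directories entirely is an assumption of the sketch rather than something this page states.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative stand-in only; this is not Flink's implementation. */
final class RecursiveEnumerationSketch {

    /** Hypothetical split type: one per file, always covering the whole file. */
    static final class WholeFileSplit {
        final Path path;
        final long offset;
        final long length;

        WholeFileSplit(Path path, long offset, long length) {
            this.path = path;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Documented default rule: names starting with '.' or '_' are treated as hidden. */
    static boolean isNotHidden(Path path) {
        Path name = path.getFileName();
        if (name == null) {
            return true; // a file system root has no name to inspect
        }
        String s = name.toString();
        return !s.startsWith(".") && !s.startsWith("_");
    }

    /** Walks every given path recursively and emits one split per accepted file. */
    static List<WholeFileSplit> enumerate(Path[] paths, Predicate<Path> filter)
            throws IOException {
        List<WholeFileSplit> target = new ArrayList<>();
        for (Path path : paths) {
            addSplitsForPath(path, filter, target);
        }
        return target;
    }

    private static void addSplitsForPath(
            Path path, Predicate<Path> filter, List<WholeFileSplit> target)
            throws IOException {
        if (!filter.test(path)) {
            return; // assumption of this sketch: rejected directories are not descended into
        }
        if (Files.isDirectory(path)) {
            try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
                for (Path child : children) {
                    addSplitsForPath(child, filter, target);
                }
            }
        } else {
            // No block splitting: the single split starts at offset 0 and spans the file.
            target.add(new WholeFileSplit(path, 0L, Files.size(path)));
        }
    }
}
```

Calling `RecursiveEnumerationSketch.enumerate(new Path[] {root}, RecursiveEnumerationSketch::isNotHidden)` returns exactly one entry per visible file under `root`, however large each file is.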
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.flink.connector.file.src.enumerate.FileEnumerator
FileEnumerator.Provider
-
Field Summary
Fields
protected final Predicate<org.apache.flink.core.fs.Path> fileFilter
The filter predicate to filter out unwanted files.
-
Constructor Summary
Constructors
NonSplittingRecursiveEnumerator()
Creates a NonSplittingRecursiveEnumerator that enumerates all files except hidden files.
NonSplittingRecursiveEnumerator(Predicate<org.apache.flink.core.fs.Path> fileFilter)
Creates a NonSplittingRecursiveEnumerator that uses the given predicate as a filter for file paths.
-
Method Summary
protected void addSplitsForPath(org.apache.flink.core.fs.FileStatus fileStatus, org.apache.flink.core.fs.FileSystem fs, ArrayList<FileSourceSplit> target)
protected void convertToSourceSplits(org.apache.flink.core.fs.FileStatus file, org.apache.flink.core.fs.FileSystem fs, List<FileSourceSplit> target)
Collection<FileSourceSplit> enumerateSplits(org.apache.flink.core.fs.Path[] paths, int minDesiredSplits)
Generates all file splits for the relevant files under the given paths.
protected final String getNextId()
-
Field Details
-
fileFilter
protected final Predicate<org.apache.flink.core.fs.Path> fileFilter
The filter predicate to filter out unwanted files.
-
-
Constructor Details
-
NonSplittingRecursiveEnumerator
public NonSplittingRecursiveEnumerator()
Creates a NonSplittingRecursiveEnumerator that enumerates all files except hidden files. A file is considered hidden if its name starts with '.' or '_'.
-
NonSplittingRecursiveEnumerator
public NonSplittingRecursiveEnumerator(Predicate<org.apache.flink.core.fs.Path> fileFilter)
Creates a NonSplittingRecursiveEnumerator that uses the given predicate as a filter for file paths.
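As a rough sketch of what such a predicate might look like, the snippet below combines the documented hidden-file rule with one extra exclusion. It is illustrative only: `CustomFileFilterSketch`, its fields, and the ".tmp" rule are hypothetical, and `java.nio.file.Path` stands in for `org.apache.flink.core.fs.Path` so that the example runs without Flink on the classpath. With Flink, an equivalent `Predicate<org.apache.flink.core.fs.Path>` would be passed to the constructor above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Predicate;

/** Illustrative only: java.nio.file.Path stands in for Flink's Path type. */
final class CustomFileFilterSketch {

    private static String nameOf(Path path) {
        Path name = path.getFileName();
        return name == null ? "" : name.toString();
    }

    /** Same rule as the documented default: reject names starting with '.' or '_'. */
    static final Predicate<Path> NOT_HIDDEN =
            path -> !nameOf(path).startsWith(".") && !nameOf(path).startsWith("_");

    /** Hypothetical extra rule: also reject in-progress files ending in ".tmp". */
    static final Predicate<Path> NOT_TEMPORARY = path -> !nameOf(path).endsWith(".tmp");

    /** The combined predicate that would be handed to the enumerator. */
    static final Predicate<Path> FILE_FILTER = NOT_HIDDEN.and(NOT_TEMPORARY);

    /** Convenience wrapper for trying the filter against a path string. */
    static boolean accepts(String path) {
        return FILE_FILTER.test(Paths.get(path));
    }
}
```

The sketch deliberately uses exclusion rules. If the enumerator also applies the filter to directories while recursing, which this page does not state, an allow-list such as "only names ending in .csv" would reject the directories themselves; a deny-list sidesteps that question.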
-
-
Method Details
-
enumerateSplits
public Collection<FileSourceSplit> enumerateSplits(org.apache.flink.core.fs.Path[] paths, int minDesiredSplits) throws IOException
Description copied from interface: FileEnumerator
Generates all file splits for the relevant files under the given paths. The minDesiredSplits is an optional hint indicating how many splits would be necessary to exploit parallelism properly.
- Specified by:
enumerateSplits in interface FileEnumerator
- Throws:
IOException
-
addSplitsForPath
protected void addSplitsForPath(org.apache.flink.core.fs.FileStatus fileStatus, org.apache.flink.core.fs.FileSystem fs, ArrayList<FileSourceSplit> target) throws IOException
- Throws:
IOException
-
convertToSourceSplits
protected void convertToSourceSplits(org.apache.flink.core.fs.FileStatus file, org.apache.flink.core.fs.FileSystem fs, List<FileSourceSplit> target) throws IOException
- Throws:
IOException
-
getNextId
protected final String getNextId()
-