Class FileSystemCommitter

java.lang.Object
org.apache.flink.connector.file.table.FileSystemCommitter

@Internal public class FileSystemCommitter extends Object
File system committer implementation. It moves all files from the temporary path to the output path.

It is used to commit data to a FileSystem table in batch mode.

Data consistency:

1. Task failure: a new task is launched and creates a PartitionTempFileManager, which cleans up the previous temporary files. This simple design makes it easy to delete a task's invalid temporary directory, but it also means the directory layout does not support running multiple backup attempts of the same task.
2. Job master commit failure in overwrite mode: this may leave unfinished intermediate results, but if the job is run again the final result will be correct, because the intermediate result is overwritten.
3. Job master commit failure in append mode: this can lead to inconsistent data. However, the commit action is a single point of execution that only moves files and updates metadata, so it is fast and the probability of inconsistency is relatively small.

See also: PartitionTempFileManager, PartitionLoader.
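The move-on-commit idea and the difference between overwrite and append described above can be illustrated with a minimal sketch. This is not the FileSystemCommitter implementation: the class `TempDirCommitSketch`, its `commit` method, and the use of `java.nio.file` on a local directory are all assumptions made for illustration, and the real committer works through PartitionTempFileManager and PartitionLoader.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch: commit staged files by moving them into the output directory. */
final class TempDirCommitSketch {

    /** Lists the regular files directly inside dir. */
    static List<Path> listFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }

    /**
     * Moves every regular file from tempDir into outputDir and returns the number moved.
     * With overwrite = true, existing output files are deleted first, so a rerun after a
     * failed commit replaces any partial result. With overwrite = false (append), existing
     * files are kept, so a failed commit can leave a mix of old and new files behind.
     */
    static int commit(Path tempDir, Path outputDir, boolean overwrite) throws IOException {
        Files.createDirectories(outputDir);
        if (overwrite) {
            for (Path existing : listFiles(outputDir)) {
                Files.delete(existing);
            }
        }
        int moved = 0;
        for (Path staged : listFiles(tempDir)) {
            Path target = outputDir.resolve(staged.getFileName());
            Files.move(staged, target, StandardCopyOption.REPLACE_EXISTING);
            moved++;
        }
        return moved;
    }
}
```

Because the commit step only moves files, the window in which a failure can leave the output half-written is short, which is the trade-off the consistency notes above describe.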

  • Constructor Details

  • Method Details

    • commitPartitions

      public void commitPartitions() throws Exception
      Commits the job's output after the batch job completes successfully.
      Throws:
      Exception
    • commitPartitions

      public void commitPartitions(BiPredicate<Integer,Integer> taskAttemptFilter) throws Exception
      Commits the partitions, using a filter to exclude files written by invalid task attempts. In speculative execution mode, some files may not belong to the finished attempt.
      Parameters:
      taskAttemptFilter - the filter that accepts subtaskIndex and attemptNumber
      Throws:
      Exception - if partition commitment fails
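
A filter of the documented type `BiPredicate<Integer,Integer>` might be built as in the sketch below. The class `AttemptFilterSketch`, the method `finishedAttemptsOnly`, and the idea of tracking finished attempts in a map from subtask index to attempt number are assumptions for illustration; how the caller learns which attempt finished is not specified here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical helper that builds a task attempt filter. */
final class AttemptFilterSketch {

    /**
     * Returns a filter over (subtaskIndex, attemptNumber) that accepts a pair only when
     * attemptNumber is the attempt recorded as finished for that subtask.
     * Subtasks with no recorded attempt are rejected.
     */
    static BiPredicate<Integer, Integer> finishedAttemptsOnly(Map<Integer, Integer> finishedAttempts) {
        // Copy the map so later changes by the caller do not alter the filter.
        Map<Integer, Integer> snapshot = new HashMap<>(finishedAttempts);
        return (subtaskIndex, attemptNumber) -> attemptNumber.equals(snapshot.get(subtaskIndex));
    }
}
```

Given a FileSystemCommitter instance named `committer` (obtained elsewhere), the result could then be passed as `committer.commitPartitions(AttemptFilterSketch.finishedAttemptsOnly(finished))`.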
    • commitPartitionsWithFiles

      public void commitPartitionsWithFiles(Map<String,List<org.apache.flink.core.fs.Path>> partitionsFiles) throws Exception
      Commits the job's output after the batch job completes successfully, using the given partitions and the files written for each of them. The temporary files are moved to each partition's location.
      Throws:
      Exception
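
The `partitionsFiles` argument maps each partition to the files written for it. A minimal sketch of assembling such a map follows. The class `PartitionFilesSketch` and the `partitionOf` function are hypothetical, the element type is left generic so the example runs without Flink on the classpath (in real use it would be `org.apache.flink.core.fs.Path`), and the exact format of the partition key is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper that groups written files by the partition they belong to. */
final class PartitionFilesSketch {

    /**
     * Groups files by the partition key computed by partitionOf.
     * Insertion order of partitions and of files within a partition is preserved.
     */
    static <P> Map<String, List<P>> groupByPartition(List<P> files, Function<P, String> partitionOf) {
        Map<String, List<P>> grouped = new LinkedHashMap<>();
        for (P file : files) {
            grouped.computeIfAbsent(partitionOf.apply(file), key -> new ArrayList<>()).add(file);
        }
        return grouped;
    }
}
```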