Class CheckpointCommitter

java.lang.Object
org.apache.flink.streaming.runtime.operators.CheckpointCommitter
All Implemented Interfaces:
Serializable

public abstract class CheckpointCommitter extends Object implements Serializable
This class is used to save information about which sink operator instance has committed checkpoints to a backend.

The current checkpointing mechanism is ill-suited for sinks relying on backends that do not support roll-backs. When dealing with such a system, while trying to get exactly-once semantics, one may neither commit data while creating the snapshot (since another sink instance may fail, leading to a replay on the same data) nor when receiving a checkpoint-complete notification (since a subsequent failure would leave us with no knowledge as to whether data was committed or not).

A CheckpointCommitter can be used to solve the second problem by saving whether an instance committed all data belonging to a checkpoint. This data must be stored in a backend that is persistent across retries (which rules out Flink's state mechanism) and accessible from all machines, like a database or distributed file.

There is no mandate as to how the resource is shared; there may be one resource for all Flink jobs, or one for each job/operator/-instance separately. This implies that the resource must not be cleaned up by the system itself, and as such should kept as small as possible.

See Also:
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected String
     
    protected static final org.slf4j.Logger
     
    protected String
     
  • Constructor Summary

    Constructors
    Constructor
    Description
     
  • Method Summary

    Modifier and Type
    Method
    Description
    abstract void
    Closes the resource/connection to it.
    abstract void
    commitCheckpoint(int subtaskIdx, long checkpointID)
    Mark the given checkpoint as completed in the resource.
    abstract void
    Creates/opens/connects to the resource that is used to store information.
    abstract boolean
    isCheckpointCommitted(int subtaskIdx, long checkpointID)
    Checked the resource whether the given checkpoint was committed completely.
    abstract void
    Opens/connects to the resource, and possibly creates it beforehand.
    void
    Internally used to set the job ID after instantiation.
    void
    Internally used to set the operator ID after instantiation.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • LOG

      protected static final org.slf4j.Logger LOG
    • jobId

      protected String jobId
    • operatorId

      protected String operatorId
  • Constructor Details

    • CheckpointCommitter

      public CheckpointCommitter()
  • Method Details

    • setJobId

      public void setJobId(String id) throws Exception
      Internally used to set the job ID after instantiation.
      Parameters:
      id -
      Throws:
      Exception
    • setOperatorId

      public void setOperatorId(String id) throws Exception
      Internally used to set the operator ID after instantiation.
      Parameters:
      id -
      Throws:
      Exception
    • open

      public abstract void open() throws Exception
      Opens/connects to the resource, and possibly creates it beforehand.
      Throws:
      Exception
    • close

      public abstract void close() throws Exception
      Closes the resource/connection to it. The resource should generally still exist after this call.
      Throws:
      Exception
    • createResource

      public abstract void createResource() throws Exception
      Creates/opens/connects to the resource that is used to store information. Called once directly after instantiation.
      Throws:
      Exception
    • commitCheckpoint

      public abstract void commitCheckpoint(int subtaskIdx, long checkpointID) throws Exception
      Mark the given checkpoint as completed in the resource.
      Parameters:
      subtaskIdx - the index of the subtask responsible for committing the checkpoint.
      checkpointID - the id of the checkpoint to be committed.
      Throws:
      Exception
    • isCheckpointCommitted

      public abstract boolean isCheckpointCommitted(int subtaskIdx, long checkpointID) throws Exception
      Checked the resource whether the given checkpoint was committed completely.
      Parameters:
      subtaskIdx - the index of the subtask responsible for committing the checkpoint.
      checkpointID - the id of the checkpoint we are interested in.
      Returns:
      true if the checkpoint was committed completely, false otherwise
      Throws:
      Exception