Class CheckpointCommitter
- All Implemented Interfaces:
Serializable
The current checkpointing mechanism is ill-suited for sinks relying on backends that do not support roll-backs. When dealing with such a system, while trying to get exactly-once semantics, one may neither commit data while creating the snapshot (since another sink instance may fail, leading to a replay on the same data) nor when receiving a checkpoint-complete notification (since a subsequent failure would leave us with no knowledge as to whether data was committed or not).
A CheckpointCommitter can be used to solve the second problem by saving whether an instance committed all data belonging to a checkpoint. This data must be stored in a backend that is persistent across retries (which rules out Flink's state mechanism) and accessible from all machines, like a database or distributed file.
There is no mandate as to how the resource is shared; there may be one resource for all Flink jobs, or one for each job/operator/-instance separately. This implies that the resource must not be cleaned up by the system itself, and as such should kept as small as possible.
- See Also:
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionabstract voidclose()Closes the resource/connection to it.abstract voidcommitCheckpoint(int subtaskIdx, long checkpointID) Mark the given checkpoint as completed in the resource.abstract voidCreates/opens/connects to the resource that is used to store information.abstract booleanisCheckpointCommitted(int subtaskIdx, long checkpointID) Checked the resource whether the given checkpoint was committed completely.abstract voidopen()Opens/connects to the resource, and possibly creates it beforehand.voidInternally used to set the job ID after instantiation.voidsetOperatorId(String id) Internally used to set the operator ID after instantiation.
-
Field Details
-
LOG
protected static final org.slf4j.Logger LOG -
jobId
-
operatorId
-
-
Constructor Details
-
CheckpointCommitter
public CheckpointCommitter()
-
-
Method Details
-
setJobId
Internally used to set the job ID after instantiation.- Parameters:
id-- Throws:
Exception
-
setOperatorId
Internally used to set the operator ID after instantiation.- Parameters:
id-- Throws:
Exception
-
open
Opens/connects to the resource, and possibly creates it beforehand.- Throws:
Exception
-
close
Closes the resource/connection to it. The resource should generally still exist after this call.- Throws:
Exception
-
createResource
Creates/opens/connects to the resource that is used to store information. Called once directly after instantiation.- Throws:
Exception
-
commitCheckpoint
Mark the given checkpoint as completed in the resource.- Parameters:
subtaskIdx- the index of the subtask responsible for committing the checkpoint.checkpointID- the id of the checkpoint to be committed.- Throws:
Exception
-
isCheckpointCommitted
Checked the resource whether the given checkpoint was committed completely.- Parameters:
subtaskIdx- the index of the subtask responsible for committing the checkpoint.checkpointID- the id of the checkpoint we are interested in.- Returns:
- true if the checkpoint was committed completely, false otherwise
- Throws:
Exception
-