Class GlobalCommitterOperator<CommT,GlobalCommT>
- All Implemented Interfaces:
Serializable, org.apache.flink.api.common.state.CheckpointListener, Input<CommittableMessage<CommT>>, KeyContext, KeyContextHandler, OneInputStreamOperator<CommittableMessage<CommT>,Void>, StreamOperator<Void>, StreamOperatorStateHandler.CheckpointedStreamOperator, YieldingOperator<Void>
Implements the GlobalCommitter.
This operator usually trails behind a CommitterOperator. In this case, the global
committer will receive committables from the committer operator through processElement(StreamRecord). Once all committables from all subtasks have been received, the
global committer will commit them. This approach also works for any number of intermediate custom
operators between the committer and the global committer in a custom post-commit topology.
That means that the global committer will not wait for notifyCheckpointComplete(long). In many cases, it receives the callback before the actual
committables anyway. So it would effectively globally commit one checkpoint later.
However, we can leverage the following observation: the global committer receives
committables only if the respective checkpoint was completed and the upstream committers
received notifyCheckpointComplete(long). So by waiting for all committables of a given
checkpoint, we implicitly know that the checkpoint was successful and that the global
committer is supposed to globally commit.
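The commit-on-input idea described above can be sketched in plain Java. This is only an illustrative model, not Flink's implementation: the class CommitOnInputSketch, its methods onSummary and onCommittable, and the notion that each upstream subtask first announces how many committables it will send are all assumptions made for the example. Committables are modeled as strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a global committer that commits on input (not Flink code). */
class CommitOnInputSketch {

    /** Bookkeeping for one checkpoint: what was announced and what has arrived. */
    private static final class Pending {
        int expectedCommittables;   // sum of the counts announced by subtasks
        int subtasksReported;       // how many subtasks have announced a count
        final List<String> received = new ArrayList<>();
    }

    private final int numberOfSubtasks;
    private final Map<Long, Pending> pendingByCheckpoint = new HashMap<>();

    /** Everything that has been globally committed so far, in commit order. */
    final List<String> committed = new ArrayList<>();

    CommitOnInputSketch(int numberOfSubtasks) {
        this.numberOfSubtasks = numberOfSubtasks;
    }

    /** Assumed protocol: a subtask announces how many committables it sends. */
    void onSummary(long checkpointId, int numberOfCommittables) {
        Pending p = pendingByCheckpoint.computeIfAbsent(checkpointId, id -> new Pending());
        p.expectedCommittables += numberOfCommittables;
        p.subtasksReported++;
        commitIfComplete(checkpointId);
    }

    /** One committable of the given checkpoint arrives on the input. */
    void onCommittable(long checkpointId, String committable) {
        Pending p = pendingByCheckpoint.computeIfAbsent(checkpointId, id -> new Pending());
        p.received.add(committable);
        commitIfComplete(checkpointId);
    }

    /** Commits as soon as all subtasks reported and all committables arrived. */
    private void commitIfComplete(long checkpointId) {
        Pending p = pendingByCheckpoint.get(checkpointId);
        if (p != null
                && p.subtasksReported == numberOfSubtasks
                && p.received.size() == p.expectedCommittables) {
            committed.addAll(p.received); // no notifyCheckpointComplete needed
            pendingByCheckpoint.remove(checkpointId);
        }
    }

    public static void main(String[] args) {
        CommitOnInputSketch op = new CommitOnInputSketch(2);
        op.onSummary(1L, 1);
        op.onCommittable(1L, "a");
        op.onSummary(1L, 1);
        op.onCommittable(1L, "b");
        System.out.println(op.committed);
    }
}
```

In this model, the arrival of the last committable is what triggers the global commit, which mirrors the observation that receiving all committables of a checkpoint already implies that the checkpoint succeeded.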
Note that committables of checkpoint X are not checkpointed in X because the global committer
is trailing behind the checkpoint. They are replayed from the committer state in case of an
error. The state only includes incomplete checkpoints coming from upstream committers not
receiving notifyCheckpointComplete(long). All committables received are successful.
In rare cases, the GlobalCommitterOperator may not be connected (in)directly to a committer
but instead is connected (in)directly to a writer. In this case, the global committer needs to
perform the 2PC protocol instead of the committer. Thus, we absolutely need to use notifyCheckpointComplete(long) similarly to the CommitterOperator. Hence, commitOnInput is set to false in this case. In particular, the following three prerequisites
must be met:
- No committer is upstream of which we could implicitly infer notifyCheckpointComplete(long) as sketched above.
- The application runs in streaming mode.
- Checkpointing is enabled.
In all other cases (batch or upstream committer or checkpointing is disabled), the global committer commits on input.
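The three prerequisites can be read as a single boolean rule. The following is a minimal sketch of that rule under the assumption that the three conditions are available as flags; the class CommitModeSketch and the method commitOnInput are hypothetical names for illustration and do not show how Flink derives the constructor argument.

```java
/** Hypothetical helper that restates the prerequisites as a boolean rule (not Flink code). */
class CommitModeSketch {

    /**
     * Returns the value that commitOnInput would take according to the description:
     * false only if no committer is upstream, the job runs in streaming mode,
     * and checkpointing is enabled; true in all other cases.
     */
    static boolean commitOnInput(
            boolean committerUpstream, boolean streamingMode, boolean checkpointingEnabled) {
        boolean mustUseNotifyCheckpointComplete =
                !committerUpstream && streamingMode && checkpointingEnabled;
        return !mustUseNotifyCheckpointComplete;
    }

    public static void main(String[] args) {
        // Writer directly upstream, streaming, checkpointing on: 2PC via notifyCheckpointComplete.
        System.out.println(commitOnInput(false, true, true));
        // Committer upstream: commit on input.
        System.out.println(commitOnInput(true, true, true));
    }
}
```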
Field Summary
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
Constructor Summary
Constructors
Constructor | Description
GlobalCommitterOperator(org.apache.flink.util.function.SerializableFunction<org.apache.flink.api.connector.sink2.CommitterInitContext, org.apache.flink.api.connector.sink2.Committer<CommT>> committerFactory, org.apache.flink.util.function.SerializableSupplier<org.apache.flink.core.io.SimpleVersionedSerializer<CommT>> committableSerializerFactory, boolean commitOnInput) |
Method Summary
Modifier and Type | Method | Description
void | initializeState(StateInitializationContext context) | Stream operators with state which can be restored need to override this hook method.
void | notifyCheckpointComplete(long checkpointId) |
void | processElement(StreamRecord<CommittableMessage<CommT>> element) | Processes one element that arrived on this input of the MultipleInputStreamOperator.
protected void | setup(StreamTask<?, ?> containingTask, StreamConfig config, Output<StreamRecord<Void>> output) |
void | snapshotState(StateSnapshotContext context) | Stream operators with state, which want to participate in a snapshot need to override this hook method.
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
beforeInitializeStateHandler, close, finish, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, notifyCheckpointAborted, open, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark, processWatermark1, processWatermark1, processWatermark2, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, snapshotState, useSplittableTimers
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted
Methods inherited from interface org.apache.flink.streaming.api.operators.Input
processLatencyMarker, processRecordAttributes, processWatermark, processWatermark, processWatermarkStatus
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator
setKeyContextElement
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
close, finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, open, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
Constructor Details
-
GlobalCommitterOperator
public GlobalCommitterOperator(org.apache.flink.util.function.SerializableFunction<org.apache.flink.api.connector.sink2.CommitterInitContext, org.apache.flink.api.connector.sink2.Committer<CommT>> committerFactory, org.apache.flink.util.function.SerializableSupplier<org.apache.flink.core.io.SimpleVersionedSerializer<CommT>> committableSerializerFactory, boolean commitOnInput)
-
-
Method Details
-
setup
protected void setup(StreamTask<?, ?> containingTask, StreamConfig config, Output<StreamRecord<Void>> output)
- Overrides:
setup in class AbstractStreamOperator<Void>
-
snapshotState
Description copied from class: AbstractStreamOperator
Stream operators with state, which want to participate in a snapshot need to override this hook method.
- Specified by:
snapshotState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
snapshotState in class AbstractStreamOperator<Void>
- Parameters:
context - context that provides information and means required for taking a snapshot
- Throws:
Exception
-
initializeState
Description copied from class: AbstractStreamOperator
Stream operators with state which can be restored need to override this hook method.
- Specified by:
initializeState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
- Overrides:
initializeState in class AbstractStreamOperator<Void>
- Parameters:
context - context that allows to register different states.
- Throws:
Exception
-
notifyCheckpointComplete
- Specified by:
notifyCheckpointComplete in interface org.apache.flink.api.common.state.CheckpointListener
- Overrides:
notifyCheckpointComplete in class AbstractStreamOperator<Void>
- Throws:
Exception
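For the case where commitOnInput is false, the class description says the operator relies on notifyCheckpointComplete(long), similarly to the CommitterOperator. A rough model of that mode is sketched below. It is not the actual method body: the class CommitOnNotifySketch, the string committables, and the choice to commit every buffered checkpoint up to and including the notified one are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical model of committing on notifyCheckpointComplete (not Flink code). */
class CommitOnNotifySketch {

    /** Committables buffered per checkpoint id until the checkpoint is confirmed. */
    private final NavigableMap<Long, List<String>> pending = new TreeMap<>();

    /** Everything that has been globally committed so far, in commit order. */
    final List<String> committed = new ArrayList<>();

    /** Buffers a committable; nothing is committed on input in this mode. */
    void processElement(long checkpointId, String committable) {
        pending.computeIfAbsent(checkpointId, id -> new ArrayList<>()).add(committable);
    }

    /** Commits all buffered committables of checkpoints up to the completed one. */
    void notifyCheckpointComplete(long checkpointId) {
        // headMap is a live view, so clearing it removes the entries from pending.
        NavigableMap<Long, List<String>> ready = pending.headMap(checkpointId, true);
        for (List<String> committables : ready.values()) {
            committed.addAll(committables);
        }
        ready.clear();
    }

    public static void main(String[] args) {
        CommitOnNotifySketch op = new CommitOnNotifySketch();
        op.processElement(1L, "a");
        op.processElement(2L, "b");
        op.processElement(3L, "c");
        op.notifyCheckpointComplete(2L);
        System.out.println(op.committed);
    }
}
```

Buffering until the callback arrives is what makes this the second phase of a two-phase commit in the sketch: committables of a checkpoint that never completes stay pending and are never committed.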
-
processElement
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator. This method is guaranteed to not be called concurrently with other methods of the operator.
- Specified by:
processElement in interface Input<CommT>
- Throws:
Exception
-