Class StreamTask<OUT,OP extends StreamOperator<OUT>>
- Type Parameters:
OUT -
OP -
- All Implemented Interfaces:
CheckpointableTask, CoordinatedTask, TaskInvokable, AsyncExceptionHandler, ContainingTaskDetails
- Direct Known Subclasses:
AbstractTwoInputStreamTask, MultipleInputStreamTask, OneInputStreamTask, SourceOperatorStreamTask, SourceStreamTask
Base class for all streaming tasks. Each task runs one or more StreamOperators, which form the
Task's operator chain. Operators that are chained together execute synchronously in the same
thread and hence on the same stream partition. A common case for these chains is a sequence of
successive map/flatmap/filter tasks.
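To make "execute synchronously in the same thread" concrete, here is a minimal plain-Java sketch. It is not Flink code: ChainSketch and its map, filter and run methods are hypothetical names, and each operator is modeled as a Consumer that calls the next operator directly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical model of an operator chain; not part of the Flink API. */
final class ChainSketch {

    /** A map operator: transforms each record and hands it to the next operator. */
    static <I, O> Consumer<I> map(Function<I, O> fn, Consumer<O> next) {
        return record -> next.accept(fn.apply(record));
    }

    /** A filter operator: forwards only the records that pass the predicate. */
    static <T> Consumer<T> filter(Predicate<T> keep, Consumer<T> next) {
        return record -> {
            if (keep.test(record)) {
                next.accept(record);
            }
        };
    }

    /** Pushes the inputs through map -> filter -> sink on the calling thread. */
    static List<Integer> run(List<Integer> inputs) {
        List<Integer> sink = new ArrayList<>();
        Consumer<Integer> tail = filter((Integer x) -> x % 4 == 0, sink::add);
        Consumer<Integer> head = map((Integer x) -> x * 2, tail);
        for (Integer record : inputs) {
            // One nested call sequence per record; nothing is handed over to another thread.
            head.accept(record);
        }
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(1, 2, 3, 4)));
    }
}
```

Because every operator invokes its successor as an ordinary method call, a record has passed through the whole chain before the head operator sees the next one.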
The task chain contains one "head" operator and multiple chained operators. The StreamTask is specialized for the type of the head operator: one-input and two-input tasks, as well as for sources, iteration heads and iteration tails.
The Task class deals with the setup of the streams read by the head operator, and the streams produced by the operators at the ends of the operator chain. Note that the chain may fork and thus have multiple ends.
The life cycle of the task is set up as follows:
-- setInitialState -> provides state of all operators in the chain
-- invoke()
|
+----> Create basic utils (config, etc) and load the chain of operators
+----> operators.setup()
+----> task specific init()
+----> initialize-operator-states()
+----> open-operators()
+----> run()
+----> finish-operators()
+----> close-operators()
+----> common cleanup
+----> task specific cleanup()
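The ordering above resembles a template method: a fixed skeleton with task-specific hooks. The following is a minimal sketch under that assumption, not the actual implementation. LifecycleSketch, its trace list and its cleanup() hook are hypothetical, and error handling is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a stream task; only the order of the steps mirrors the list above. */
abstract class LifecycleSketch {
    final List<String> trace = new ArrayList<>();

    /** Task-specific initialization, supplied by the subclass. */
    protected abstract void init();

    /** Task-specific cleanup, supplied by the subclass. */
    protected abstract void cleanup();

    /** Fixed skeleton: subclasses cannot reorder the steps because this method is final. */
    final void invoke() {
        trace.add("load operator chain");
        trace.add("operators.setup()");
        init();
        trace.add("initialize-operator-states()");
        trace.add("open-operators()");
        trace.add("run()");
        trace.add("finish-operators()");
        trace.add("close-operators()");
        trace.add("common cleanup");
        cleanup();
    }

    /** Runs a trivial subclass and returns the recorded step order. */
    static List<String> demo() {
        LifecycleSketch task = new LifecycleSketch() {
            @Override protected void init() { trace.add("task specific init()"); }
            @Override protected void cleanup() { trace.add("task specific cleanup()"); }
        };
        task.invoke();
        return task.trace;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```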
The StreamTask has a lock object called lock. All calls to methods on a
StreamOperator must be synchronized on this lock object to ensure that no methods are called
concurrently.
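As an illustration of the locking rule, the sketch below lets a task thread and a timer thread touch the same non-thread-safe state, both under one lock object. It is plain Java with hypothetical names (LockSketch, processElement, onTimer) and only models the rule; it does not show how StreamTask itself acquires the lock.

```java
/** Hypothetical model: every call into "operator" state is guarded by one shared lock. */
final class LockSketch {
    private final Object lock = new Object();
    private long processed; // stands in for operator state that is not thread-safe

    /** Called by the task thread for each record. */
    void processElement() {
        synchronized (lock) {
            processed++;
        }
    }

    /** Called by a timer thread; takes the same lock, so it never interleaves with processElement. */
    void onTimer() {
        synchronized (lock) {
            processed++;
        }
    }

    /** Runs both threads to completion and returns the final counter value. */
    static long demo() {
        LockSketch task = new LockSketch();
        Thread taskThread = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                task.processElement();
            }
        });
        Thread timerThread = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                task.onTimer();
            }
        });
        taskThread.start();
        timerThread.start();
        try {
            taskThread.join();
            timerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return task.processed; // safe to read: join() establishes happens-before
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without the synchronized blocks the two increments could interleave and updates could be lost.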
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | StreamTask.CanEmitBatchOfRecordsChecker | Check whether records can be emitted in batch.
-
Field Summary
Fields
Modifier and Type | Field | Description
protected final CheckpointStorage | checkpointStorage | Our checkpoint storage.
protected final StreamConfig | configuration | The configuration of this streaming task.
protected StreamInputProcessor | inputProcessor | The input processor.
protected static final org.slf4j.Logger | LOG | The logger used by the StreamTask and its subclasses.
protected final MailboxProcessor | mailboxProcessor |
protected OP | mainOperator | The main operator that consumes the input streams of this task.
protected OperatorChain<OUT,OP> | operatorChain | The chain of operators executed by this task.
protected final RecordWriterDelegate<SerializationDelegate<StreamRecord<OUT>>> | recordWriter |
protected final StateBackend | stateBackend | Our state backend.
protected final TimerService | systemTimerService | In contrast to timerService we should not register any user timers here.
protected final TimerService | timerService | The internal TimerService used to define the current processing time (default = System.currentTimeMillis()) and register timers for tasks to be executed in the future.
static final ThreadGroup | TRIGGER_THREAD_GROUP | The thread group that holds all trigger timer threads.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | StreamTask(Environment env) | Constructor for initialization, possibly with initial state (recovery / savepoint / etc).
protected | StreamTask(Environment env, TimerService timerService) | Constructor for initialization, possibly with initial state (recovery / savepoint / etc).
protected | StreamTask(Environment environment, TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler) |
protected | StreamTask(Environment environment, TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor) | Constructor for initialization, possibly with initial state (recovery / savepoint / etc).
protected | StreamTask(Environment environment, TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor, TaskMailbox mailbox) |
-
Method Summary
Modifier and Type | Method | Description
void | abortCheckpointOnBarrier(long checkpointId, CheckpointException cause) | Aborts a checkpoint as the result of receiving possibly some checkpoint barriers, but at least one CancelCheckpointMarker.
protected void | advanceToEndOfEventTime() | Emits the MAX_WATERMARK so that all registered timers are fired.
protected void | afterInvoke() |
final void | cancel() | This method is called when a task is canceled either as a result of a user abort or an execution failure.
protected void | cancelTask() |
final void | cleanUp(Throwable throwable) | Cleanup any resources used in TaskInvokable.invoke() OR TaskInvokable.restore().
protected void | cleanUpInternal() |
static <OUT> RecordWriterDelegate<SerializationDelegate<StreamRecord<OUT>>> | createRecordWriterDelegate(StreamConfig configuration, Environment environment) |
StreamTaskStateInitializer | createStreamTaskStateInitializer(SubTaskInitializationMetricsBuilder initializationMetrics) |
protected void | declineCheckpoint(long checkpointId) |
void | dispatchOperatorEvent(OperatorID operator, org.apache.flink.util.SerializedValue<OperatorEvent> event) |
protected void | endData(StopMode mode) |
protected void | finalize() | The finalize method shuts down the timer.
protected long | getAsyncCheckpointStartDelayNanos() |
final ExecutorService | getAsyncOperationsThreadPool() |
final org.apache.flink.core.fs.CloseableRegistry | getCancelables() |
protected Optional<CheckpointBarrierHandler> | getCheckpointBarrierHandler() | Acquires the optional CheckpointBarrierHandler associated with this stream task.
protected CompletableFuture<Void> | getCompletionFuture() |
final Environment | getEnvironment() |
final String | getName() | Gets the name of the task, in the form "taskname (2/5)".
void | handleAsyncException(String message, Throwable exception) | Handles an exception thrown by another thread (e.g. a TriggerTask), other than the one executing the main task, by failing the task entirely.
boolean | hasMail() |
protected abstract void | init() |
final void | invoke() | Starts the execution.
final boolean | isCanceled() |
final boolean | isFailing() |
boolean | isMailboxLoopRunning() |
final boolean | isRunning() |
boolean | isUsingNonBlockingInput() |
void | maybeInterruptOnCancel(Thread toInterrupt, String taskName, Long timeout) | Checks whether the task should be interrupted during cancellation and if so, execute the specified Runnable interruptAction.
Future<Void> | notifyCheckpointAbortAsync(long checkpointId, long latestCompletedCheckpointId) | Invoked when a checkpoint has been aborted, i.e., when the checkpoint coordinator has received a decline message from one task and tries to abort the targeted checkpoint by notification.
Future<Void> | notifyCheckpointCompleteAsync(long checkpointId) | Invoked when a checkpoint has been completed, i.e., when the checkpoint coordinator has received the notification from all participating tasks.
Future<Void> | notifyCheckpointSubsumedAsync(long checkpointId) | Invoked when a checkpoint has been subsumed, i.e., when the checkpoint coordinator has confirmed one checkpoint has been finished, and tries to remove the first previous checkpoint.
protected void | notifyEndOfData() |
protected void | processInput(MailboxDefaultAction.Controller controller) | This method implements the default action of the task (e.g. processing one event from the input).
final void | restore() | This method can be called before TaskInvokable.invoke() to restore an invokable object for the last valid state, if it has it.
void | runMailboxLoop() |
boolean | runMailboxStep() |
boolean | runSingleMailboxLoop() |
protected void | setSynchronousSavepoint(long checkpointId) |
protected org.apache.flink.metrics.Counter | setupNumRecordsInCounter(StreamOperator streamOperator) |
String | toString() |
CompletableFuture<Boolean> | triggerCheckpointAsync(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions) | This method is called to trigger a checkpoint, asynchronously by the checkpoint coordinator.
void | triggerCheckpointOnBarrier(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions, CheckpointMetricsBuilder checkpointMetrics) | This method is called when a checkpoint is triggered as a result of receiving checkpoint barriers on all input streams.
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.flink.streaming.runtime.tasks.ContainingTaskDetails
getExecutionConfig, getIndexInSubtaskGroup, getJobConfiguration, getUserCodeClassLoader
-
Field Details
-
TRIGGER_THREAD_GROUP
The thread group that holds all trigger timer threads. -
LOG
protected static final org.slf4j.Logger LOG
The logger used by the StreamTask and its subclasses. -
inputProcessor
The input processor. Initialized in the init() method. -
mainOperator
The main operator that consumes the input streams of this task. -
operatorChain
The chain of operators executed by this task. -
configuration
The configuration of this streaming task. -
stateBackend
Our state backend. We use this to create a keyed state backend. -
checkpointStorage
Our checkpoint storage. We use this to create checkpoint streams. -
timerService
The internal TimerService used to define the current processing time (default = System.currentTimeMillis()) and register timers for tasks to be executed in the future. -
systemTimerService
In contrast to timerService, we should not register any user timers here. It should be used only for system-level timers. -
recordWriter
-
mailboxProcessor
-
-
Constructor Details
-
StreamTask
Constructor for initialization, possibly with initial state (recovery / savepoint / etc).
- Parameters:
env - The task environment for this task.
- Throws:
Exception
-
StreamTask
Constructor for initialization, possibly with initial state (recovery / savepoint / etc).
- Parameters:
env - The task environment for this task.
timerService - Optionally, a specific timer service to use.
- Throws:
Exception
-
StreamTask
protected StreamTask(Environment environment, @Nullable TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler) throws Exception
- Throws:
Exception
-
StreamTask
protected StreamTask(Environment environment, @Nullable TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor) throws Exception
Constructor for initialization, possibly with initial state (recovery / savepoint / etc).
This constructor accepts a special TimerService. By default (and if null is passed for the timer service), a DefaultTimerService will be used.
- Parameters:
environment - The task environment for this task.
timerService - Optionally, a specific timer service to use.
uncaughtExceptionHandler - to handle uncaught exceptions in the async operations thread pool
actionExecutor - a means to wrap all actions performed by this task thread. Currently, only SynchronizedActionExecutor can be used to preserve locking semantics.
- Throws:
Exception
-
StreamTask
protected StreamTask(Environment environment, @Nullable TimerService timerService, Thread.UncaughtExceptionHandler uncaughtExceptionHandler, StreamTaskActionExecutor actionExecutor, TaskMailbox mailbox) throws Exception
- Throws:
Exception
-
-
Method Details
-
init
- Throws:
Exception
-
cancelTask
- Throws:
Exception
-
processInput
This method implements the default action of the task (e.g. processing one event from the input). Implementations should (in general) be non-blocking.
- Parameters:
controller - controller object for collaborative interaction between the action and the stream task.
- Throws:
Exception - on any problems in the action.
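One way to picture a non-blocking default action is a loop that first drains pending mail and then runs the default action once per iteration. The sketch below is plain Java under that assumption and is not the Flink mailbox implementation: MailboxLoopSketch, its Controller interface and its queues are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical single-threaded loop that alternates between mail and a default action. */
final class MailboxLoopSketch {

    /** Hypothetical counterpart of the controller handed to the default action. */
    interface Controller {
        void allActionsCompleted();
    }

    private final Queue<Runnable> mailbox = new ArrayDeque<>();
    private final Queue<String> input = new ArrayDeque<>();
    private final List<String> trace = new ArrayList<>();
    private boolean running = true;

    /** Default action: handles at most one record and never waits for input. */
    void processInput(Controller controller) {
        String record = input.poll(); // non-blocking: returns null when nothing is available
        if (record == null) {
            controller.allActionsCompleted();
            return;
        }
        trace.add("record:" + record);
    }

    /** Drains pending mail first, then runs the default action once, until told to stop. */
    void runLoop() {
        Controller controller = () -> running = false;
        while (running) {
            Runnable mail;
            while ((mail = mailbox.poll()) != null) {
                mail.run();
            }
            processInput(controller);
        }
    }

    /** Two records plus one piece of mail; returns the order in which they were handled. */
    static List<String> demo() {
        MailboxLoopSketch task = new MailboxLoopSketch();
        task.input.add("a");
        task.input.add("b");
        task.mailbox.add(() -> task.trace.add("mail:checkpoint"));
        task.runLoop();
        return task.trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the default action blocked while waiting for input, the mail queued behind it could not run, which is why the sketch polls instead of waiting.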
-
endData
- Throws:
Exception
-
notifyEndOfData
protected void notifyEndOfData() -
setSynchronousSavepoint
protected void setSynchronousSavepoint(long checkpointId) -
advanceToEndOfEventTime
Emits the MAX_WATERMARK so that all registered timers are fired.
This is used by the source task when the job is TERMINATED. In that case, we want all the timers registered throughout the pipeline to fire and the related state (e.g. windows) to be flushed.
For tasks other than the source task, this method does nothing.
- Throws:
Exception
-
createStreamTaskStateInitializer
public StreamTaskStateInitializer createStreamTaskStateInitializer(SubTaskInitializationMetricsBuilder initializationMetrics) -
setupNumRecordsInCounter
-
restore
Description copied from interface: TaskInvokable
This method can be called before TaskInvokable.invoke() to restore an invokable object to its last valid state, if it has one.
If TaskInvokable.invoke() is not called after this method for some reason (e.g. task cancellation), then all resources should be cleaned up by calling TaskInvokable.cleanUp(Throwable) after the method returns.
- Specified by:
restore in interface TaskInvokable
- Throws:
Exception
-
invoke
Description copied from interface: TaskInvokable
Starts the execution.
This method is called by the task manager when the actual execution of the task starts.
All resources should be cleaned up by calling TaskInvokable.cleanUp(Throwable) after the method returns.
- Specified by:
invoke in interface TaskInvokable
- Throws:
Exception
-
runSingleMailboxLoop
- Throws:
Exception
-
runMailboxStep
- Throws:
Exception
-
isMailboxLoopRunning
@VisibleForTesting public boolean isMailboxLoopRunning() -
runMailboxLoop
- Throws:
Exception
-
afterInvoke
- Throws:
Exception
-
cleanUp
Description copied from interface: TaskInvokable
Cleans up any resources used in TaskInvokable.invoke() or TaskInvokable.restore(). This method must be called regardless of whether the aforementioned calls succeeded or failed.
- Specified by:
cleanUp in interface TaskInvokable
- Parameters:
throwable - the failure, if one happened during the execution of TaskInvokable.restore() or TaskInvokable.invoke(); null otherwise.
ATTENTION: CancelTaskException should not be treated as a failure.
- Throws:
Exception
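From the caller's side, this contract amounts to a try/catch/finally around restore() and invoke(). The sketch below shows that shape in plain Java. CleanUpSketch and its nested Invokable interface are hypothetical stand-ins rather than the Flink types, and the special handling of CancelTaskException is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical caller-side view of the restore / invoke / cleanUp contract. */
final class CleanUpSketch {

    /** Hypothetical three-method subset of an invokable; not the Flink interface itself. */
    interface Invokable {
        void restore() throws Exception;
        void invoke() throws Exception;
        void cleanUp(Throwable failure) throws Exception;
    }

    /** Runs restore() and invoke(), then always calls cleanUp with the failure, or null. */
    static void runTask(Invokable task) throws Exception {
        Throwable failure = null;
        try {
            task.restore();
            task.invoke();
        } catch (Throwable t) {
            failure = t; // remember the failure so cleanUp can see it
            throw t;     // and still propagate it to the caller
        } finally {
            task.cleanUp(failure);
        }
    }

    /** Records the call order for a task that optionally fails inside invoke(). */
    static List<String> demo(boolean failInInvoke) {
        List<String> trace = new ArrayList<>();
        Invokable task = new Invokable() {
            @Override public void restore() {
                trace.add("restore");
            }
            @Override public void invoke() {
                trace.add("invoke");
                if (failInInvoke) {
                    throw new IllegalStateException("boom");
                }
            }
            @Override public void cleanUp(Throwable failure) {
                trace.add("cleanUp(" + (failure == null ? "null" : failure.getMessage()) + ")");
            }
        };
        try {
            runTask(task);
        } catch (Exception e) {
            trace.add("caller saw " + e.getMessage());
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```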
-
cleanUpInternal
- Throws:
Exception
-
getCompletionFuture
-
cancel
Description copied from interface: TaskInvokable
This method is called when a task is canceled either as a result of a user abort or an execution failure. It can be overridden to shut down the user code properly.
- Specified by:
cancel in interface TaskInvokable
- Throws:
Exception
-
getMailboxExecutorFactory
-
hasMail
public boolean hasMail() -
getCanEmitBatchOfRecords
-
isRunning
public final boolean isRunning() -
isCanceled
public final boolean isCanceled() -
isFailing
public final boolean isFailing() -
finalize
The finalize method shuts down the timer. This is a fail-safe shutdown, in case the original shutdown method was never called.
This should not be relied upon! It will cause shutdown to happen much later than if manual shutdown is attempted, and cause threads to linger for longer than needed.
-
getName
Gets the name of the task, in the form "taskname (2/5)".
- Returns:
The name of the task.
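A name of this shape can be produced with a single format call. The sketch below is illustrative only: TaskNameSketch and displayName are hypothetical, and it assumes the subtask index is zero-based while the displayed position is one-based.

```java
/** Hypothetical helper that builds a display name such as "taskname (2/5)". */
final class TaskNameSketch {

    /**
     * @param taskName     base name of the task
     * @param subtaskIndex zero-based index of this subtask (an assumption of this sketch)
     * @param parallelism  total number of parallel subtasks
     */
    static String displayName(String taskName, int subtaskIndex, int parallelism) {
        return String.format("%s (%d/%d)", taskName, subtaskIndex + 1, parallelism);
    }

    public static void main(String[] args) {
        System.out.println(displayName("taskname", 1, 5));
    }
}
```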
-
getCheckpointStorage
-
getConfiguration
-
triggerCheckpointAsync
public CompletableFuture<Boolean> triggerCheckpointAsync(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions)
Description copied from interface: CheckpointableTask
This method is called asynchronously by the checkpoint coordinator to trigger a checkpoint.
This method is called for tasks that start the checkpoints by injecting the initial barriers, i.e., the source tasks. In contrast, checkpoints on downstream operators, which are the result of receiving checkpoint barriers, invoke the CheckpointableTask.triggerCheckpointOnBarrier(CheckpointMetaData, CheckpointOptions, CheckpointMetricsBuilder) method.
- Specified by:
triggerCheckpointAsync in interface CheckpointableTask
- Parameters:
checkpointMetaData - Metadata about this checkpoint
checkpointOptions - Options for performing this checkpoint
- Returns:
future with value of false if the checkpoint was not carried out, true otherwise
-
getCheckpointBarrierHandler
Acquires the optional CheckpointBarrierHandler associated with this stream task. The CheckpointBarrierHandler should exist if the task has data inputs and needs to align the barriers. -
triggerCheckpointOnBarrier
public void triggerCheckpointOnBarrier(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions, CheckpointMetricsBuilder checkpointMetrics) throws IOException
Description copied from interface: CheckpointableTask
This method is called when a checkpoint is triggered as a result of receiving checkpoint barriers on all input streams.
- Specified by:
triggerCheckpointOnBarrier in interface CheckpointableTask
- Parameters:
checkpointMetaData - Metadata about this checkpoint
checkpointOptions - Options for performing this checkpoint
checkpointMetrics - Metrics about this checkpoint
- Throws:
IOException - Exceptions thrown as the result of triggering a checkpoint are forwarded.
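The condition "barriers on all input streams" can be pictured as bookkeeping per checkpoint ID. The following plain-Java sketch shows only that counting idea. BarrierAlignmentSketch is a hypothetical class, and it ignores everything else a real barrier handler has to deal with, such as buffering, cancellation markers or unaligned checkpoints.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical bookkeeping: trigger a checkpoint once every input has delivered its barrier. */
final class BarrierAlignmentSketch {
    private final int numberOfInputs;
    private final Map<Long, Set<Integer>> inputsSeenPerCheckpoint = new HashMap<>();
    final List<Long> triggeredCheckpoints = new ArrayList<>();

    BarrierAlignmentSketch(int numberOfInputs) {
        this.numberOfInputs = numberOfInputs;
    }

    /** Records a barrier; the checkpoint fires when the last missing input reports in. */
    void onBarrier(long checkpointId, int inputIndex) {
        Set<Integer> seen =
                inputsSeenPerCheckpoint.computeIfAbsent(checkpointId, id -> new HashSet<>());
        seen.add(inputIndex);
        if (seen.size() == numberOfInputs) {
            inputsSeenPerCheckpoint.remove(checkpointId);
            // Stands in for the call that would actually perform the checkpoint.
            triggeredCheckpoints.add(checkpointId);
        }
    }

    /** Two inputs: nothing is triggered until both have delivered the barrier for checkpoint 7. */
    static List<Long> demo() {
        BarrierAlignmentSketch task = new BarrierAlignmentSketch(2);
        task.onBarrier(7L, 0);
        task.onBarrier(7L, 0); // repeated barrier from the same input changes nothing
        task.onBarrier(7L, 1);
        return task.triggeredCheckpoints;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```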
-
abortCheckpointOnBarrier
public void abortCheckpointOnBarrier(long checkpointId, CheckpointException cause) throws IOException
Description copied from interface: CheckpointableTask
Aborts a checkpoint as the result of receiving possibly some checkpoint barriers, but at least one CancelCheckpointMarker.
This requires implementing tasks to forward a CancelCheckpointMarker to their outputs.
- Specified by:
abortCheckpointOnBarrier in interface CheckpointableTask
- Parameters:
checkpointId - The ID of the checkpoint to be aborted.
cause - The reason why the checkpoint was aborted during alignment
- Throws:
IOException
-
declineCheckpoint
protected void declineCheckpoint(long checkpointId) -
getAsyncOperationsThreadPool
-
notifyCheckpointCompleteAsync
Description copied from interface: CheckpointableTask
Invoked when a checkpoint has been completed, i.e., when the checkpoint coordinator has received the notification from all participating tasks.
- Specified by:
notifyCheckpointCompleteAsync in interface CheckpointableTask
- Parameters:
checkpointId - The ID of the checkpoint that is complete.
- Returns:
future that completes when the notification has been processed by the task.
-
notifyCheckpointAbortAsync
Description copied from interface: CheckpointableTask
Invoked when a checkpoint has been aborted, i.e., when the checkpoint coordinator has received a decline message from one task and tries to abort the targeted checkpoint by notification.
- Specified by:
notifyCheckpointAbortAsync in interface CheckpointableTask
- Parameters:
checkpointId - The ID of the checkpoint that is aborted.
latestCompletedCheckpointId - The ID of the latest completed checkpoint.
- Returns:
future that completes when the notification has been processed by the task.
-
notifyCheckpointSubsumedAsync
Description copied from interface: CheckpointableTask
Invoked when a checkpoint has been subsumed, i.e., when the checkpoint coordinator has confirmed that one checkpoint has finished and tries to remove the first previous checkpoint.
- Specified by:
notifyCheckpointSubsumedAsync in interface CheckpointableTask
- Parameters:
checkpointId - The ID of the checkpoint that is subsumed.
- Returns:
future that completes when the notification has been processed by the task.
-
dispatchOperatorEvent
public void dispatchOperatorEvent(OperatorID operator, org.apache.flink.util.SerializedValue<OperatorEvent> event) throws org.apache.flink.util.FlinkException
- Specified by:
dispatchOperatorEvent in interface CoordinatedTask
- Throws:
org.apache.flink.util.FlinkException
-
getProcessingTimeServiceFactory
-
handleAsyncException
Handles an exception thrown by another thread (e.g. a TriggerTask), other than the one executing the main task, by failing the task entirely.
In more detail, it marks task execution failed for an external reason (a reason other than the task code itself throwing an exception). If the task is already in a terminal state (such as FINISHED, CANCELED, FAILED), or if the task is already canceling, this does nothing. Otherwise it sets the state to FAILED and, if the invokable code is running, starts an asynchronous thread that aborts that code.
This method never blocks.
- Specified by:
handleAsyncException in interface AsyncExceptionHandler
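The state rule described here (do nothing if the task is already terminal or canceling, otherwise move to FAILED, and never block) can be sketched with a single compare-and-set. AsyncFailureSketch, its State enum and failExternally are hypothetical names, and the asynchronous abort of the running code is not modeled.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical non-blocking state transition for a failure reported by another thread. */
final class AsyncFailureSketch {

    enum State { RUNNING, CANCELING, FINISHED, CANCELED, FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
    private volatile Throwable failureCause;

    /**
     * Moves RUNNING -> FAILED exactly once. Any other current state (terminal or canceling)
     * is left untouched. Returns whether this call performed the transition.
     */
    boolean failExternally(Throwable cause) {
        if (state.compareAndSet(State.RUNNING, State.FAILED)) {
            failureCause = cause;
            return true;
        }
        return false;
    }

    State state() {
        return state.get();
    }

    Throwable failureCause() {
        return failureCause;
    }

    public static void main(String[] args) {
        AsyncFailureSketch task = new AsyncFailureSketch();
        System.out.println(task.failExternally(new RuntimeException("timer failed")));
        System.out.println(task.failExternally(new RuntimeException("late report")));
        System.out.println(task.state());
    }
}
```

Because compareAndSet either succeeds immediately or returns false, the reporting thread never waits on the task thread.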
-
toString
-
getCancelables
public final org.apache.flink.core.fs.CloseableRegistry getCancelables() -
createRecordWriterDelegate
@VisibleForTesting public static <OUT> RecordWriterDelegate<SerializationDelegate<StreamRecord<OUT>>> createRecordWriterDelegate(StreamConfig configuration, Environment environment) -
getAsyncCheckpointStartDelayNanos
protected long getAsyncCheckpointStartDelayNanos() -
isUsingNonBlockingInput
public boolean isUsingNonBlockingInput()
- Specified by:
isUsingNonBlockingInput in interface TaskInvokable
- Returns:
true if non-blocking input such as InputGate.pollNext() is used (as opposed to the blocking InputGate.getNext()). To be removed together with the DataSet API.
-
maybeInterruptOnCancel
public void maybeInterruptOnCancel(Thread toInterrupt, @Nullable String taskName, @Nullable Long timeout)
Description copied from interface: TaskInvokable
Checks whether the task should be interrupted during cancellation and, if so, executes the specified Runnable interruptAction.
- Specified by:
maybeInterruptOnCancel in interface TaskInvokable
- Parameters:
taskName - optional taskName to log stack trace
timeout - optional timeout to log stack trace
-
getEnvironment
-