Class StreamTask<OUT,OP extends StreamOperator<OUT>>

java.lang.Object
org.apache.flink.streaming.runtime.tasks.StreamTask<OUT,OP>
Type Parameters:
OUT - the output type of the head operator
OP - the type of the head operator of this task
All Implemented Interfaces:
CheckpointableTask, CoordinatedTask, TaskInvokable, AsyncExceptionHandler, ContainingTaskDetails
Direct Known Subclasses:
AbstractTwoInputStreamTask, MultipleInputStreamTask, OneInputStreamTask, SourceOperatorStreamTask, SourceStreamTask

@Internal public abstract class StreamTask<OUT,OP extends StreamOperator<OUT>> extends Object implements TaskInvokable, CheckpointableTask, CoordinatedTask, AsyncExceptionHandler, ContainingTaskDetails
Base class for all streaming tasks. A task is the unit of local processing that is deployed and executed by the TaskManagers. Each task runs one or more StreamOperators which form the Task's operator chain. Operators that are chained together execute synchronously in the same thread and hence on the same stream partition. A common example of such a chain is a sequence of successive map/flatmap/filter tasks.
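To make the "same thread" property of a chain concrete, here is a minimal sketch. It is a toy model rather than Flink's OperatorChain: the class ChainSketch, the method chain and the map/filter/sink stages are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of operator chaining; none of these names are Flink API.
final class ChainSketch {

    // Builds map -> filter -> sink. Each stage calls the next one directly,
    // so a record passes through the whole chain on the caller's thread.
    static Consumer<Integer> chain(List<String> sink, List<String> threadsSeen) {
        Consumer<Integer> sinkStage = value -> {
            threadsSeen.add(Thread.currentThread().getName());
            sink.add("out:" + value);
        };
        Consumer<Integer> filterStage = value -> {
            if (value % 2 == 0) {
                sinkStage.accept(value);
            }
        };
        // Map stage: triple the value, then hand it to the filter.
        return value -> filterStage.accept(value * 3);
    }

    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();
        List<String> threads = new ArrayList<>();
        Consumer<Integer> head = chain(sink, threads);
        for (int i = 1; i <= 4; i++) {
            head.accept(i);
        }
        System.out.println(sink);
    }
}
```

Because each stage is a plain method call, no queue or thread hand-over sits between the stages, which is the property the description above attributes to chained operators.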

The task chain contains one "head" operator and multiple chained operators. The StreamTask is specialized for the type of the head operator: one-input and two-input tasks, as well as for sources, iteration heads and iteration tails.

The Task class deals with the setup of the streams read by the head operator, and the streams produced by the operators at the ends of the operator chain. Note that the chain may fork and thus have multiple ends.

The life cycle of the task is set up as follows:


 -- setInitialState -> provides state of all operators in the chain

 -- invoke()
       |
       +----> Create basic utils (config, etc) and load the chain of operators
       +----> operators.setup()
       +----> task specific init()
       +----> initialize-operator-states()
       +----> open-operators()
       +----> run()
       +----> finish-operators()
       +----> close-operators()
       +----> common cleanup
       +----> task specific cleanup()
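
The ordering in the diagram can be sketched as a template method. This is a toy model under stated assumptions, not the StreamTask implementation: LifecycleSketch and RecordingTask are hypothetical, the operator steps are only recorded as strings, and placing the cleanup steps in a finally block is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy template-method model of the life cycle diagram; not Flink code.
abstract class LifecycleSketch {
    final List<String> steps = new ArrayList<>();

    protected abstract void init();     // task specific init()
    protected abstract void run();      // main processing
    protected abstract void cleanup();  // task specific cleanup()

    public final void invoke() {
        try {
            steps.add("setup-operators");
            init();
            steps.add("initialize-operator-states");
            steps.add("open-operators");
            run();
            steps.add("finish-operators");
            steps.add("close-operators");
        } finally {
            // Assumption of this sketch: cleanup runs even if a step threw.
            steps.add("common-cleanup");
            cleanup();
        }
    }

    public static void main(String[] args) {
        RecordingTask task = new RecordingTask();
        task.invoke();
        System.out.println(task.steps);
    }
}

// Hypothetical subclass that only records when its hooks are called.
final class RecordingTask extends LifecycleSketch {
    @Override protected void init() { steps.add("init"); }
    @Override protected void run() { steps.add("run"); }
    @Override protected void cleanup() { steps.add("cleanup"); }
}
```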
 

The StreamTask has a lock object called lock. All calls to methods on a StreamOperator must be synchronized on this lock object to ensure that no methods are called concurrently.
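
The locking rule can be illustrated with a small sketch. It assumes two call paths (record processing and timer callbacks) that touch the same operator state; OperatorLockSketch and its methods are hypothetical and not part of Flink.

```java
// Toy illustration of the locking rule; not Flink code.
final class OperatorLockSketch {
    private final Object lock = new Object();
    private int processed; // stand-in for operator state, guarded by lock

    void processElement() {
        synchronized (lock) {
            processed++;
        }
    }

    void onTimer() {
        synchronized (lock) {
            processed++;
        }
    }

    // Runs both call paths concurrently and returns the final count.
    static int demo() {
        OperatorLockSketch task = new OperatorLockSketch();
        Thread records = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                task.processElement();
            }
        });
        Thread timers = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                task.onTimer();
            }
        });
        records.start();
        timers.start();
        try {
            records.join();
            timers.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (task.lock) {
            return task.processed;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Without the synchronized blocks the two threads could interleave their increments and lose updates; with them, every call on the shared state is serialized.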

  • Field Details

    • TRIGGER_THREAD_GROUP

      public static final ThreadGroup TRIGGER_THREAD_GROUP
      The thread group that holds all trigger timer threads.
    • LOG

      protected static final org.slf4j.Logger LOG
      The logger used by the StreamTask and its subclasses.
    • inputProcessor

      @Nullable protected StreamInputProcessor inputProcessor
      The input processor. Initialized in init() method.
    • mainOperator

      protected OP mainOperator
      The main operator that consumes the input streams of this task.
    • operatorChain

      protected OperatorChain<OUT,OP> operatorChain
      The chain of operators executed by this task.
    • configuration

      protected final StreamConfig configuration
      The configuration of this streaming task.
    • stateBackend

      protected final StateBackend stateBackend
      Our state backend. We use this to create a keyed state backend.
    • checkpointStorage

      protected final CheckpointStorage checkpointStorage
      Our checkpoint storage. We use this to create checkpoint streams.
    • timerService

      protected final TimerService timerService
      The internal TimerService used to define the current processing time (default = System.currentTimeMillis()) and register timers for tasks to be executed in the future.
    • systemTimerService

      protected final TimerService systemTimerService
      In contrast to timerService we should not register any user timers here. It should be used only for system level timers.
    • recordWriter

    • mailboxProcessor

      protected final MailboxProcessor mailboxProcessor
  • Constructor Details

  • Method Details

    • init

      protected abstract void init() throws Exception
      Throws:
      Exception
    • cancelTask

      protected void cancelTask() throws Exception
      Throws:
      Exception
    • processInput

      protected void processInput(MailboxDefaultAction.Controller controller) throws Exception
      This method implements the default action of the task (e.g. processing one event from the input). Implementations should (in general) be non-blocking.
      Parameters:
      controller - controller object for collaborative interaction between the action and the stream task.
      Throws:
      Exception - on any problems in the action.
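
      How a non-blocking default action cooperates with a mailbox can be sketched as follows. This is a simplified single-threaded model, not Flink's MailboxProcessor: MailboxLoopSketch, its nested interfaces and the send method are hypothetical stand-ins, and the loop order (drain mail first, then one default action) is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;

// Toy single-threaded mailbox loop; names are illustrative, not Flink API.
final class MailboxLoopSketch {
    interface Controller { void allActionsCompleted(); }
    interface DefaultAction { void runDefaultAction(Controller controller); }

    private final Queue<Runnable> mailbox = new ArrayDeque<>();
    private final DefaultAction defaultAction;
    private boolean running = true;

    MailboxLoopSketch(DefaultAction defaultAction) {
        this.defaultAction = defaultAction;
    }

    void send(Runnable mail) {
        mailbox.add(mail);
    }

    void runLoop() {
        Controller controller = () -> running = false;
        while (running) {
            // Pending mail (e.g. a checkpoint trigger) runs between two
            // invocations of the default action, never in the middle of one.
            Runnable mail;
            while ((mail = mailbox.poll()) != null) {
                mail.run();
            }
            defaultAction.runDefaultAction(controller);
        }
    }

    // Processes three records; a mail is enqueued while handling record 1.
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Iterator<Integer> input = List.of(1, 2, 3).iterator();
        MailboxLoopSketch[] loop = new MailboxLoopSketch[1];
        loop[0] = new MailboxLoopSketch(controller -> {
            if (!input.hasNext()) {
                controller.allActionsCompleted();
                return;
            }
            int record = input.next();
            log.add("record " + record);
            if (record == 1) {
                loop[0].send(() -> log.add("mail"));
            }
        });
        loop[0].runLoop();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      Each call of the default action handles a single record and returns, which is why a blocking implementation would delay every mail queued behind it.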
    • endData

      protected void endData(StopMode mode) throws Exception
      Throws:
      Exception
    • notifyEndOfData

      protected void notifyEndOfData()
    • setSynchronousSavepoint

      protected void setSynchronousSavepoint(long checkpointId)
    • advanceToEndOfEventTime

      protected void advanceToEndOfEventTime() throws Exception
      Emits the MAX_WATERMARK so that all registered timers are fired.

      This is used by the source task when the job is TERMINATED. In that case, we want all the timers registered throughout the pipeline to fire and the related state (e.g. windows) to be flushed.

      For tasks other than the source task, this method does nothing.

      Throws:
      Exception
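
      Why emitting the maximum watermark fires every registered timer can be seen in a small model. The class EventTimeTimersSketch is hypothetical, and it assumes that a timer fires once the watermark reaches its timestamp and that MAX_WATERMARK corresponds to Long.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy event-time timer queue; not Flink's timer service.
final class EventTimeTimersSketch {
    // Assumption: the maximum watermark carries the largest long timestamp.
    static final long MAX_WATERMARK = Long.MAX_VALUE;

    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    final List<Long> fired = new ArrayList<>();

    void registerTimer(long timestamp) {
        timers.add(timestamp);
    }

    // Fires, in timestamp order, every timer at or before the watermark.
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            fired.add(timers.poll());
        }
    }

    public static void main(String[] args) {
        EventTimeTimersSketch service = new EventTimeTimersSketch();
        service.registerTimer(100L);
        service.registerTimer(50L);
        service.registerTimer(200L);
        service.advanceWatermark(60L);
        service.advanceWatermark(MAX_WATERMARK);
        System.out.println(service.fired);
    }
}
```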
    • createStreamTaskStateInitializer

      public StreamTaskStateInitializer createStreamTaskStateInitializer(SubTaskInitializationMetricsBuilder initializationMetrics)
    • setupNumRecordsInCounter

      protected org.apache.flink.metrics.Counter setupNumRecordsInCounter(StreamOperator streamOperator)
    • restore

      public final void restore() throws Exception
      Description copied from interface: TaskInvokable
      This method can be called before TaskInvokable.invoke() to restore an invokable object to its last valid state, if it has one.

      If TaskInvokable.invoke() is not called after this method for some reason (e.g. task cancellation), then all resources should be cleaned up by calling TaskInvokable.cleanUp(Throwable) after the method returns.

      Specified by:
      restore in interface TaskInvokable
      Throws:
      Exception
    • invoke

      public final void invoke() throws Exception
      Description copied from interface: TaskInvokable
      Starts the execution.

      This method is called by the task manager when the actual execution of the task starts.

      All resources should be cleaned up by calling TaskInvokable.cleanUp(Throwable) after the method returns.

      Specified by:
      invoke in interface TaskInvokable
      Throws:
      Exception
    • runSingleMailboxLoop

      @VisibleForTesting public boolean runSingleMailboxLoop() throws Exception
      Throws:
      Exception
    • runMailboxStep

      @VisibleForTesting public boolean runMailboxStep() throws Exception
      Throws:
      Exception
    • isMailboxLoopRunning

      @VisibleForTesting public boolean isMailboxLoopRunning()
    • runMailboxLoop

      public void runMailboxLoop() throws Exception
      Throws:
      Exception
    • afterInvoke

      protected void afterInvoke() throws Exception
      Throws:
      Exception
    • cleanUp

      public final void cleanUp(Throwable throwable) throws Exception
      Description copied from interface: TaskInvokable
      Cleans up any resources used in TaskInvokable.invoke() OR TaskInvokable.restore(). This method must be called regardless of whether the aforementioned calls succeeded or failed.
      Specified by:
      cleanUp in interface TaskInvokable
      Parameters:
      throwable - the failure, iff one happened during the execution of TaskInvokable.restore() or TaskInvokable.invoke(); null otherwise.

      ATTENTION: CancelTaskException should not be treated as a failure.

      Throws:
      Exception
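
      The calling contract described above (cleanUp always runs and receives the failure, or null) might look like this from the caller's side. InvokableSketch and CleanUpContractSketch are hypothetical stand-ins, unchecked exceptions are used to keep the sketch short, and the special handling of CancelTaskException is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the invokable contract.
interface InvokableSketch {
    void restore();
    void invoke();
    void cleanUp(Throwable failure);
}

final class CleanUpContractSketch {

    // Calls restore() and invoke(), then always cleanUp() with the failure
    // that occurred, or with null if both calls succeeded.
    static void execute(InvokableSketch task) {
        Throwable failure = null;
        try {
            task.restore();
            task.invoke();
        } catch (RuntimeException e) {
            failure = e;
            throw e;
        } finally {
            task.cleanUp(failure);
        }
    }

    // Runs one succeeding and one failing task; records what cleanUp saw.
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        for (boolean fail : new boolean[] {false, true}) {
            InvokableSketch task = new InvokableSketch() {
                @Override public void restore() {}
                @Override public void invoke() {
                    if (fail) {
                        throw new IllegalStateException("boom");
                    }
                }
                @Override public void cleanUp(Throwable failure) {
                    seen.add(failure == null ? "null" : failure.getMessage());
                }
            };
            try {
                execute(task);
            } catch (IllegalStateException expected) {
                // The failure was already handed to cleanUp.
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```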
    • cleanUpInternal

      protected void cleanUpInternal() throws Exception
      Throws:
      Exception
    • getCompletionFuture

      protected CompletableFuture<Void> getCompletionFuture()
    • cancel

      public final void cancel() throws Exception
      Description copied from interface: TaskInvokable
      This method is called when a task is canceled either as a result of a user abort or an execution failure. It can be overridden to shut down the user code properly.
      Specified by:
      cancel in interface TaskInvokable
      Throws:
      Exception
    • getMailboxExecutorFactory

      public MailboxExecutorFactory getMailboxExecutorFactory()
    • hasMail

      public boolean hasMail()
    • getCanEmitBatchOfRecords

      public StreamTask.CanEmitBatchOfRecordsChecker getCanEmitBatchOfRecords()
    • isRunning

      public final boolean isRunning()
    • isCanceled

      public final boolean isCanceled()
    • isFailing

      public final boolean isFailing()
    • finalize

      protected void finalize() throws Throwable
      The finalize method shuts down the timer. This is a fail-safe shutdown, in case the original shutdown method was never called.

      This should not be relied upon! It will cause shutdown to happen much later than if manual shutdown is attempted, and cause threads to linger for longer than needed.

      Overrides:
      finalize in class Object
      Throws:
      Throwable
    • getName

      public final String getName()
      Gets the name of the task, in the form "taskname (2/5)".
      Returns:
      The name of the task.
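
      A name of this form could be assembled as in the sketch below. The helper nameWithSubtask is hypothetical, and it assumes that the first number is the zero-based subtask index plus one and the second is the parallelism.

```java
// Hypothetical helper showing how a name like "taskname (2/5)" can be built.
final class TaskNameSketch {

    // Assumption: subtaskIndex is zero-based, so index 1 is shown as 2.
    static String nameWithSubtask(String taskName, int subtaskIndex, int parallelism) {
        return String.format("%s (%d/%d)", taskName, subtaskIndex + 1, parallelism);
    }

    public static void main(String[] args) {
        System.out.println(nameWithSubtask("taskname", 1, 5));
    }
}
```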
    • getCheckpointStorage

      public CheckpointStorageWorkerView getCheckpointStorage()
    • getConfiguration

      public StreamConfig getConfiguration()
    • triggerCheckpointAsync

      public CompletableFuture<Boolean> triggerCheckpointAsync(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions)
      Description copied from interface: CheckpointableTask
      This method is called asynchronously by the checkpoint coordinator to trigger a checkpoint.

      This method is called for tasks that start the checkpoints by injecting the initial barriers, i.e., the source tasks. In contrast, checkpoints on downstream operators, which are the result of receiving checkpoint barriers, invoke the CheckpointableTask.triggerCheckpointOnBarrier(CheckpointMetaData, CheckpointOptions, CheckpointMetricsBuilder) method.

      Specified by:
      triggerCheckpointAsync in interface CheckpointableTask
      Parameters:
      checkpointMetaData - Metadata about this checkpoint
      checkpointOptions - Options for performing this checkpoint
      Returns:
      future with value of false if the checkpoint was not carried out, true otherwise
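
      The shape of the returned future can be sketched with a toy task that queues the trigger request and completes the future when the request is processed. AsyncTriggerSketch, stop and drainMailbox are hypothetical, and the use of a mailbox-style queue is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Toy model of an asynchronous checkpoint trigger; not Flink code.
final class AsyncTriggerSketch {
    private final Queue<Runnable> mailbox = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;
    // Only touched by the thread that drains the mailbox.
    final List<Long> performed = new ArrayList<>();

    // May be called from any thread; the work itself runs later, when the
    // task thread drains the mailbox.
    CompletableFuture<Boolean> triggerCheckpointAsync(long checkpointId) {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        mailbox.add(() -> {
            if (running) {
                performed.add(checkpointId);
                result.complete(true);
            } else {
                // The checkpoint was not carried out.
                result.complete(false);
            }
        });
        return result;
    }

    void stop() {
        running = false;
    }

    void drainMailbox() {
        Runnable mail;
        while ((mail = mailbox.poll()) != null) {
            mail.run();
        }
    }

    public static void main(String[] args) {
        AsyncTriggerSketch task = new AsyncTriggerSketch();
        CompletableFuture<Boolean> first = task.triggerCheckpointAsync(1L);
        task.drainMailbox();
        System.out.println(first.join());
    }
}
```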
    • getCheckpointBarrierHandler

      protected Optional<CheckpointBarrierHandler> getCheckpointBarrierHandler()
      Acquires the optional CheckpointBarrierHandler associated with this stream task. The CheckpointBarrierHandler should exist if the task has data inputs and needs to align the barriers.
    • triggerCheckpointOnBarrier

      public void triggerCheckpointOnBarrier(CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions, CheckpointMetricsBuilder checkpointMetrics) throws IOException
      Description copied from interface: CheckpointableTask
      This method is called when a checkpoint is triggered as a result of receiving checkpoint barriers on all input streams.
      Specified by:
      triggerCheckpointOnBarrier in interface CheckpointableTask
      Parameters:
      checkpointMetaData - Metadata about this checkpoint
      checkpointOptions - Options for performing this checkpoint
      checkpointMetrics - Metrics about this checkpoint
      Throws:
      IOException - Exceptions thrown as the result of triggering a checkpoint are forwarded.
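
      The condition "barriers on all input streams" can be sketched as a simple alignment counter. BarrierAlignmentSketch is hypothetical and deliberately simplified: it tracks a single in-flight checkpoint and ignores buffering, cancellation markers and unaligned checkpoints.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy barrier alignment for one in-flight checkpoint; not Flink code.
final class BarrierAlignmentSketch {
    private final int numberOfInputs;
    private final Set<Integer> channelsWithBarrier = new HashSet<>();
    final List<Long> triggered = new ArrayList<>();

    BarrierAlignmentSketch(int numberOfInputs) {
        this.numberOfInputs = numberOfInputs;
    }

    // Returns true once the barrier has arrived on every input channel,
    // which is the moment the checkpoint would be triggered.
    boolean onBarrier(int channel, long checkpointId) {
        channelsWithBarrier.add(channel);
        if (channelsWithBarrier.size() == numberOfInputs) {
            channelsWithBarrier.clear();
            triggered.add(checkpointId);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BarrierAlignmentSketch aligner = new BarrierAlignmentSketch(2);
        System.out.println(aligner.onBarrier(0, 7L));
        System.out.println(aligner.onBarrier(1, 7L));
    }
}
```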
    • abortCheckpointOnBarrier

      public void abortCheckpointOnBarrier(long checkpointId, CheckpointException cause) throws IOException
      Description copied from interface: CheckpointableTask
      Aborts a checkpoint as the result of receiving possibly some checkpoint barriers, but at least one CancelCheckpointMarker.

      This requires implementing tasks to forward a CancelCheckpointMarker to their outputs.

      Specified by:
      abortCheckpointOnBarrier in interface CheckpointableTask
      Parameters:
      checkpointId - The ID of the checkpoint to be aborted.
      cause - The reason why the checkpoint was aborted during alignment
      Throws:
      IOException
    • declineCheckpoint

      protected void declineCheckpoint(long checkpointId)
    • getAsyncOperationsThreadPool

      public final ExecutorService getAsyncOperationsThreadPool()
    • notifyCheckpointCompleteAsync

      public Future<Void> notifyCheckpointCompleteAsync(long checkpointId)
      Description copied from interface: CheckpointableTask
      Invoked when a checkpoint has been completed, i.e., when the checkpoint coordinator has received the notification from all participating tasks.
      Specified by:
      notifyCheckpointCompleteAsync in interface CheckpointableTask
      Parameters:
      checkpointId - The ID of the checkpoint that is complete.
      Returns:
      future that completes when the notification has been processed by the task.
    • notifyCheckpointAbortAsync

      public Future<Void> notifyCheckpointAbortAsync(long checkpointId, long latestCompletedCheckpointId)
      Description copied from interface: CheckpointableTask
      Invoked when a checkpoint has been aborted, i.e., when the checkpoint coordinator has received a decline message from one task and tries to abort the targeted checkpoint by notification.
      Specified by:
      notifyCheckpointAbortAsync in interface CheckpointableTask
      Parameters:
      checkpointId - The ID of the checkpoint that is aborted.
      latestCompletedCheckpointId - The ID of the latest completed checkpoint.
      Returns:
      future that completes when the notification has been processed by the task.
    • notifyCheckpointSubsumedAsync

      public Future<Void> notifyCheckpointSubsumedAsync(long checkpointId)
      Description copied from interface: CheckpointableTask
      Invoked when a checkpoint has been subsumed, i.e., when the checkpoint coordinator has confirmed that one checkpoint has finished and tries to remove the first previous checkpoint.
      Specified by:
      notifyCheckpointSubsumedAsync in interface CheckpointableTask
      Parameters:
      checkpointId - The ID of the checkpoint that is subsumed.
      Returns:
      future that completes when the notification has been processed by the task.
    • dispatchOperatorEvent

      public void dispatchOperatorEvent(OperatorID operator, org.apache.flink.util.SerializedValue<OperatorEvent> event) throws org.apache.flink.util.FlinkException
      Specified by:
      dispatchOperatorEvent in interface CoordinatedTask
      Throws:
      org.apache.flink.util.FlinkException
    • getProcessingTimeServiceFactory

      public ProcessingTimeServiceFactory getProcessingTimeServiceFactory()
    • handleAsyncException

      public void handleAsyncException(String message, Throwable exception)
      Handles an exception thrown by a thread other than the one executing the main task (e.g. a TriggerTask) by failing the task entirely.

      In more detail, it marks task execution failed for an external reason (a reason other than the task code itself throwing an exception). If the task is already in a terminal state (such as FINISHED, CANCELED, FAILED), or if the task is already canceling, this does nothing. Otherwise it sets the state to FAILED and, if the invokable code is running, starts an asynchronous thread that aborts that code.

      This method never blocks.

      Specified by:
      handleAsyncException in interface AsyncExceptionHandler
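
      The state rule described above (no effect in a terminal or canceling state, otherwise a transition to FAILED) can be sketched with a compare-and-set loop. ExternalFailureSketch and its State enum are hypothetical, and the asynchronous thread that aborts the running code is not modeled.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy state machine for failing a task from another thread; not Flink code.
final class ExternalFailureSketch {
    enum State { RUNNING, CANCELING, FINISHED, CANCELED, FAILED }

    private final AtomicReference<State> state;
    private volatile Throwable failureCause;

    ExternalFailureSketch(State initial) {
        this.state = new AtomicReference<>(initial);
    }

    // Never blocks: returns false if the task is canceling or already in a
    // terminal state, otherwise moves it to FAILED and records the cause.
    boolean failExternally(Throwable cause) {
        while (true) {
            State current = state.get();
            if (current != State.RUNNING) {
                return false;
            }
            if (state.compareAndSet(current, State.FAILED)) {
                failureCause = cause;
                return true;
            }
        }
    }

    State getState() {
        return state.get();
    }

    Throwable getFailureCause() {
        return failureCause;
    }

    public static void main(String[] args) {
        ExternalFailureSketch task = new ExternalFailureSketch(State.RUNNING);
        System.out.println(task.failExternally(new RuntimeException("timer failed")));
        System.out.println(task.getState());
    }
}
```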
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getCancelables

      public final org.apache.flink.core.fs.CloseableRegistry getCancelables()
    • createRecordWriterDelegate

      @VisibleForTesting public static <OUT> RecordWriterDelegate<SerializationDelegate<StreamRecord<OUT>>> createRecordWriterDelegate(StreamConfig configuration, Environment environment)
    • getAsyncCheckpointStartDelayNanos

      protected long getAsyncCheckpointStartDelayNanos()
    • isUsingNonBlockingInput

      public boolean isUsingNonBlockingInput()
      Specified by:
      isUsingNonBlockingInput in interface TaskInvokable
      Returns:
      true if blocking input such as InputGate.getNext() is used (as opposed to InputGate.pollNext()). To be removed together with the DataSet API.
    • maybeInterruptOnCancel

      public void maybeInterruptOnCancel(Thread toInterrupt, @Nullable String taskName, @Nullable Long timeout)
      Description copied from interface: TaskInvokable
      Checks whether the task should be interrupted during cancellation and, if so, executes the specified Runnable interruptAction.
      Specified by:
      maybeInterruptOnCancel in interface TaskInvokable
      Parameters:
      taskName - optional taskName to log stack trace
      timeout - optional timeout to log stack trace
    • getEnvironment

      public final Environment getEnvironment()