Class BatchTask<S extends org.apache.flink.api.common.functions.Function,OT>

java.lang.Object
org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable
org.apache.flink.runtime.operators.BatchTask<S,OT>
All Implemented Interfaces:
CheckpointableTask, CoordinatedTask, TaskInvokable, TaskContext<S,OT>
Direct Known Subclasses:
AbstractIterativeTask

public class BatchTask<S extends org.apache.flink.api.common.functions.Function,OT> extends AbstractInvokable implements TaskContext<S,OT>
The base class for all batch tasks. Encapsulated common behavior and implements the main life-cycle of the user code.
  • Field Details

    • LOG

      protected static final org.slf4j.Logger LOG
    • driver

      protected volatile Driver<S extends org.apache.flink.api.common.functions.Function,OT> driver
      The driver that invokes the user code (the stub implementation). The central driver in this task (further drivers may be chained behind this driver).
    • stub

      protected S extends org.apache.flink.api.common.functions.Function stub
      The instantiated user code of this task's main operator (driver). May be null if the operator has no udf.
    • runtimeUdfContext

      protected DistributedRuntimeUDFContext runtimeUdfContext
      The udf's runtime context.
    • output

      protected org.apache.flink.util.Collector<OT> output
      The collector that forwards the user code's results. May forward to a channel or to chained drivers within this task.
    • eventualOutputs

      protected List<RecordWriter<?>> eventualOutputs
      The output writers for the data that this task forwards to the next task. The latest driver (the central, if no chained drivers exist, otherwise the last chained driver) produces its output to these writers.
    • inputReaders

      protected MutableReader<?>[] inputReaders
      The input readers of this task.
    • broadcastInputReaders

      protected MutableReader<?>[] broadcastInputReaders
      The input readers for the configured broadcast variables for this task.
    • inputIterators

      protected org.apache.flink.util.MutableObjectIterator<?>[] inputIterators
      The inputs reader, wrapped in an iterator. Prior to the local strategies, etc...
    • iterativeInputs

      protected int[] iterativeInputs
      The indices of the iterative inputs. Empty, if the task is not iterative.
    • iterativeBroadcastInputs

      protected int[] iterativeBroadcastInputs
      The indices of the iterative broadcast inputs. Empty, if non of the inputs is iterative.
    • localStrategies

      protected volatile CloseableInputProvider<?>[] localStrategies
      The local strategies that are applied on the inputs.
    • tempBarriers

      protected volatile TempBarrier<?>[] tempBarriers
      The optional temp barriers on the inputs for dead-lock breaking. Are optionally resettable.
    • resettableInputs

      protected volatile SpillingResettableMutableObjectIterator<?>[] resettableInputs
      The resettable inputs in the case where no temp barrier is needed.
    • inputs

      protected org.apache.flink.util.MutableObjectIterator<?>[] inputs
      The inputs to the operator. Return the readers' data after the application of the local strategy and the temp-table barrier.
    • inputSerializers

      protected org.apache.flink.api.common.typeutils.TypeSerializerFactory<?>[] inputSerializers
      The serializers for the input data type.
    • broadcastInputSerializers

      protected org.apache.flink.api.common.typeutils.TypeSerializerFactory<?>[] broadcastInputSerializers
      The serializers for the broadcast input data types.
    • inputComparators

      protected org.apache.flink.api.common.typeutils.TypeComparator<?>[] inputComparators
      The comparators for the central driver.
    • config

      protected TaskConfig config
      The task configuration with the setup parameters.
    • chainedTasks

      protected ArrayList<ChainedDriver<?,?>> chainedTasks
      A list of chained drivers, if there are any.
    • running

      protected volatile boolean running
      The flag that tags the task as still running. Checked periodically to abort processing.
    • accumulatorMap

      protected Map<String,org.apache.flink.api.common.accumulators.Accumulator<?,?>> accumulatorMap
      The accumulator map used in the RuntimeContext.
  • Constructor Details

    • BatchTask

      public BatchTask(Environment environment)
      Create an Invokable task and set its environment.
      Parameters:
      environment - The environment assigned to this invokable.
  • Method Details

    • invoke

      public void invoke() throws Exception
      The main work method.
      Specified by:
      invoke in interface TaskInvokable
      Specified by:
      invoke in class AbstractInvokable
      Throws:
      Exception
    • cancel

      public void cancel() throws Exception
      Description copied from interface: TaskInvokable
      This method is called when a task is canceled either as a result of a user abort or an execution failure. It can be overwritten to respond to shut down the user code properly.
      Specified by:
      cancel in interface TaskInvokable
      Overrides:
      cancel in class AbstractInvokable
      Throws:
      Exception
    • initialize

      protected void initialize() throws Exception
      Throws:
      Exception
    • readAndSetBroadcastInput

      protected <X> void readAndSetBroadcastInput(int inputNum, String bcVarName, DistributedRuntimeUDFContext context, int superstep) throws IOException
      Throws:
      IOException
    • releaseBroadcastVariables

      protected void releaseBroadcastVariables(String bcVarName, int superstep, DistributedRuntimeUDFContext context)
    • run

      protected void run() throws Exception
      Throws:
      Exception
    • closeLocalStrategiesAndCaches

      protected void closeLocalStrategiesAndCaches()
    • getLastOutputCollector

      protected org.apache.flink.util.Collector<OT> getLastOutputCollector()
      Returns:
      the last output collector in the collector chain
    • setLastOutputCollector

      protected void setLastOutputCollector(org.apache.flink.util.Collector<OT> newOutputCollector)
      Sets the last output Collector of the collector chain of this BatchTask.

      In case of chained tasks, the output collector of the last ChainedDriver is set. Otherwise it is the single collector of the BatchTask.

      Parameters:
      newOutputCollector - new output collector to set as last collector
    • getLastTasksConfig

      public TaskConfig getLastTasksConfig()
    • initStub

      protected S initStub(Class<? super S> stubSuperClass) throws Exception
      Throws:
      Exception
    • initInputReaders

      protected void initInputReaders() throws Exception
      Creates the record readers for the number of inputs as defined by getNumTaskInputs(). This method requires that the task configuration, the driver, and the user-code class loader are set.
      Throws:
      Exception
    • initBroadcastInputReaders

      protected void initBroadcastInputReaders() throws Exception
      Creates the record readers for the extra broadcast inputs as configured by TaskConfig.getNumBroadcastInputs(). This method requires that the task configuration, the driver, and the user-code class loader are set.
      Throws:
      Exception
    • initInputsSerializersAndComparators

      protected void initInputsSerializersAndComparators(int numInputs, int numComparators)
      Creates all the serializers and comparators.
    • initBroadcastInputsSerializers

      protected void initBroadcastInputsSerializers(int numBroadcastInputs)
      Creates all the serializers and iterators for the broadcast inputs.
    • initLocalStrategies

      protected void initLocalStrategies(int numInputs) throws Exception
      NOTE: This method must be invoked after the invocation of #initInputReaders() and #initInputSerializersAndComparators(int)!
      Throws:
      Exception
    • resetAllInputs

      protected void resetAllInputs() throws Exception
      Throws:
      Exception
    • excludeFromReset

      protected void excludeFromReset(int inputNum)
    • createInputIterator

      protected org.apache.flink.util.MutableObjectIterator<?> createInputIterator(MutableReader<?> inputReader, org.apache.flink.api.common.typeutils.TypeSerializerFactory<?> serializerFactory)
    • getNumTaskInputs

      protected int getNumTaskInputs()
    • initOutputs

      protected void initOutputs() throws Exception
      Creates a writer for each output. Creates an OutputCollector which forwards its input to all writers. The output collector applies the configured shipping strategies for each writer.
      Throws:
      Exception
    • createRuntimeContext

      public DistributedRuntimeUDFContext createRuntimeContext(org.apache.flink.metrics.groups.OperatorMetricGroup metrics)
    • getTaskConfig

      public TaskConfig getTaskConfig()
      Specified by:
      getTaskConfig in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getTaskManagerInfo

      public TaskManagerRuntimeInfo getTaskManagerInfo()
      Specified by:
      getTaskManagerInfo in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getMemoryManager

      public MemoryManager getMemoryManager()
      Specified by:
      getMemoryManager in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getIOManager

      public IOManager getIOManager()
      Specified by:
      getIOManager in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getStub

      public S getStub()
      Specified by:
      getStub in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getOutputCollector

      public org.apache.flink.util.Collector<OT> getOutputCollector()
      Specified by:
      getOutputCollector in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getContainingTask

      public AbstractInvokable getContainingTask()
      Specified by:
      getContainingTask in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • formatLogString

      public String formatLogString(String message)
      Specified by:
      formatLogString in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getMetricGroup

      public org.apache.flink.metrics.groups.OperatorMetricGroup getMetricGroup()
      Specified by:
      getMetricGroup in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getInput

      public <X> org.apache.flink.util.MutableObjectIterator<X> getInput(int index)
      Specified by:
      getInput in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getInputSerializer

      public <X> org.apache.flink.api.common.typeutils.TypeSerializerFactory<X> getInputSerializer(int index)
      Specified by:
      getInputSerializer in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • getDriverComparator

      public <X> org.apache.flink.api.common.typeutils.TypeComparator<X> getDriverComparator(int index)
      Specified by:
      getDriverComparator in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
    • constructLogString

      public static String constructLogString(String message, String taskName, AbstractInvokable parent)
      Utility function that composes a string for logging purposes. The string includes the given message, the given name of the task and the index in its subtask group as well as the number of instances that exist in its subtask group.
      Parameters:
      message - The main message for the log.
      taskName - The name of the task.
      parent - The task that contains the code producing the message.
      Returns:
      The string for logging.
    • logAndThrowException

      public static void logAndThrowException(Exception ex, AbstractInvokable parent) throws Exception
      Prints an error message and throws the given exception. If the exception is of the type ExceptionInChainedStubException then the chain of contained exceptions is followed until an exception of a different type is found.
      Parameters:
      ex - The exception to be thrown.
      parent - The parent task, whose information is included in the log message.
      Throws:
      Exception - Always thrown.
    • getOutputCollector

      public static <T> org.apache.flink.util.Collector<T> getOutputCollector(AbstractInvokable task, TaskConfig config, ClassLoader cl, List<RecordWriter<?>> eventualOutputs, int outputOffset, int numOutputs) throws Exception
      Creates the Collector for the given task, as described by the given configuration. The output collector contains the writers that forward the data to the different tasks that the given task is connected to. Each writer applies the partitioning as described in the configuration.
      Parameters:
      task - The task that the output collector is created for.
      config - The configuration describing the output shipping strategies.
      cl - The classloader used to load user defined types.
      eventualOutputs - The output writers that this task forwards to the next task for each output.
      outputOffset - The offset to start to get the writers for the outputs
      numOutputs - The number of outputs described in the configuration.
      Returns:
      The OutputCollector that data produced in this task is submitted to.
      Throws:
      Exception
    • initOutputs

      public static <T> org.apache.flink.util.Collector<T> initOutputs(AbstractInvokable containingTask, org.apache.flink.util.UserCodeClassLoader cl, TaskConfig config, List<ChainedDriver<?,?>> chainedTasksTarget, List<RecordWriter<?>> eventualOutputs, org.apache.flink.api.common.ExecutionConfig executionConfig, Map<String,org.apache.flink.api.common.accumulators.Accumulator<?,?>> accumulatorMap) throws Exception
      Creates a writer for each output. Creates an OutputCollector which forwards its input to all writers. The output collector applies the configured shipping strategy.
      Throws:
      Exception
    • openUserCode

      public static void openUserCode(org.apache.flink.api.common.functions.Function stub, org.apache.flink.configuration.Configuration parameters) throws Exception
      Opens the given stub using its RichFunction.open(OpenContext) method. If the open call produces an exception, a new exception with a standard error message is created, using the encountered exception as its cause.
      Parameters:
      stub - The user code instance to be opened.
      parameters - The parameters supplied to the user code.
      Throws:
      Exception - Thrown, if the user code's open method produces an exception.
    • closeUserCode

      public static void closeUserCode(org.apache.flink.api.common.functions.Function stub) throws Exception
      Closes the given stub using its RichFunction.close() method. If the close call produces an exception, a new exception with a standard error message is created, using the encountered exception as its cause.
      Parameters:
      stub - The user code instance to be closed.
      Throws:
      Exception - Thrown, if the user code's close method produces an exception.
    • openChainedTasks

      public static void openChainedTasks(List<ChainedDriver<?,?>> tasks, AbstractInvokable parent) throws Exception
      Opens all chained tasks, in the order as they are stored in the array. The opening process creates a standardized log info message.
      Parameters:
      tasks - The tasks to be opened.
      parent - The parent task, used to obtain parameters to include in the log message.
      Throws:
      Exception - Thrown, if the opening encounters an exception.
    • closeChainedTasks

      public static void closeChainedTasks(List<ChainedDriver<?,?>> tasks, AbstractInvokable parent) throws Exception
      Closes all chained tasks, in the order as they are stored in the array. The closing process creates a standardized log info message.
      Parameters:
      tasks - The tasks to be closed.
      parent - The parent task, used to obtain parameters to include in the log message.
      Throws:
      Exception - Thrown, if the closing encounters an exception.
    • cancelChainedTasks

      public static void cancelChainedTasks(List<ChainedDriver<?,?>> tasks)
      Cancels all tasks via their ChainedDriver.cancelTask() method. Any occurring exception and error is suppressed, such that the canceling method of every task is invoked in all cases.
      Parameters:
      tasks - The tasks to be canceled.
    • instantiateUserCode

      public static <T> T instantiateUserCode(TaskConfig config, ClassLoader cl, Class<? super T> superClass)
      Instantiates a user code class from is definition in the task configuration. The class is instantiated without arguments using the null-ary constructor. Instantiation will fail if this constructor does not exist or is not public.
      Type Parameters:
      T - The generic type of the user code class.
      Parameters:
      config - The task configuration containing the class description.
      cl - The class loader to be used to load the class.
      superClass - The super class that the user code class extends or implements, for type checking.
      Returns:
      An instance of the user code class.
    • clearWriters

      public static void clearWriters(List<RecordWriter<?>> writers)
    • clearReaders

      public static void clearReaders(MutableReader<?>[] readers)