Class BatchTask<S extends org.apache.flink.api.common.functions.Function,OT>
java.lang.Object
org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable
org.apache.flink.runtime.operators.BatchTask<S,OT>
- All Implemented Interfaces:
CheckpointableTask, CoordinatedTask, TaskInvokable, TaskContext<S,OT>
- Direct Known Subclasses:
AbstractIterativeTask
public class BatchTask<S extends org.apache.flink.api.common.functions.Function,OT>
extends AbstractInvokable
implements TaskContext<S,OT>
The base class for all batch tasks. Encapsulates common behavior and implements the main
life-cycle of the user code.
-
Field Summary
Fields
Modifier and Type Field: Description
accumulatorMap: The accumulator map used in the RuntimeContext.
protected MutableReader<?>[] broadcastInputReaders: The input readers for the configured broadcast variables for this task.
protected org.apache.flink.api.common.typeutils.TypeSerializerFactory<?>[] broadcastInputSerializers: The serializers for the broadcast input data types.
protected ArrayList<ChainedDriver<?,?>> chainedTasks: A list of chained drivers, if there are any.
protected TaskConfig config: The task configuration with the setup parameters.
driver: The driver that invokes the user code (the stub implementation).
protected List<RecordWriter<?>> eventualOutputs: The output writers for the data that this task forwards to the next task.
protected org.apache.flink.api.common.typeutils.TypeComparator<?>[] inputComparators: The comparators for the central driver.
protected org.apache.flink.util.MutableObjectIterator<?>[] inputIterators: The inputs reader, wrapped in an iterator.
protected MutableReader<?>[] inputReaders: The input readers of this task.
protected org.apache.flink.util.MutableObjectIterator<?>[] inputs: The inputs to the operator.
protected org.apache.flink.api.common.typeutils.TypeSerializerFactory<?>[] inputSerializers: The serializers for the input data type.
protected int[] iterativeBroadcastInputs: The indices of the iterative broadcast inputs.
protected int[] iterativeInputs: The indices of the iterative inputs.
protected CloseableInputProvider<?>[] localStrategies: The local strategies that are applied on the inputs.
protected static final org.slf4j.Logger LOG
protected org.apache.flink.util.Collector<OT> output: The collector that forwards the user code's results.
protected SpillingResettableMutableObjectIterator<?>[] resettableInputs: The resettable inputs in the case where no temp barrier is needed.
protected boolean running: The flag that tags the task as still running.
protected DistributedRuntimeUDFContext runtimeUdfContext: The udf's runtime context.
protected S stub: The instantiated user code of this task's main operator (driver).
protected TempBarrier<?>[] tempBarriers: The optional temp barriers on the inputs for dead-lock breaking.
-
Constructor Summary
Constructors
Constructor: Description
BatchTask(Environment environment): Create an Invokable task and set its environment.
-
Method Summary
Modifier and Type Method: Description
void cancel(): This method is called when a task is canceled either as a result of a user abort or an execution failure.
static void cancelChainedTasks(List<ChainedDriver<?, ?>> tasks): Cancels all tasks via their ChainedDriver.cancelTask() method.
static void clearReaders(MutableReader<?>[] readers)
static void clearWriters(List<RecordWriter<?>> writers)
static void closeChainedTasks(List<ChainedDriver<?, ?>> tasks, AbstractInvokable parent): Closes all chained tasks, in the order in which they are stored in the list.
protected void closeLocalStrategiesAndCaches()
static void closeUserCode(org.apache.flink.api.common.functions.Function stub): Closes the given stub using its RichFunction.close() method.
static String constructLogString(String message, String taskName, AbstractInvokable parent): Utility function that composes a string for logging purposes.
protected org.apache.flink.util.MutableObjectIterator<?> createInputIterator(MutableReader<?> inputReader, org.apache.flink.api.common.typeutils.TypeSerializerFactory<?> serializerFactory)
DistributedRuntimeUDFContext createRuntimeContext(org.apache.flink.metrics.groups.OperatorMetricGroup metrics)
protected void excludeFromReset(int inputNum)
formatLogString(String message)
<X> org.apache.flink.api.common.typeutils.TypeComparator<X> getDriverComparator(int index)
<X> org.apache.flink.util.MutableObjectIterator<X> getInput(int index)
<X> org.apache.flink.api.common.typeutils.TypeSerializerFactory<X> getInputSerializer(int index)
protected org.apache.flink.util.Collector<OT> getLastOutputCollector
org.apache.flink.metrics.groups.OperatorMetricGroup getMetricGroup()
protected int getNumTaskInputs()
org.apache.flink.util.Collector<OT> getOutputCollector
static <T> org.apache.flink.util.Collector<T> getOutputCollector(AbstractInvokable task, TaskConfig config, ClassLoader cl, List<RecordWriter<?>> eventualOutputs, int outputOffset, int numOutputs): Creates the Collector for the given task, as described by the given configuration.
getStub()
protected void initBroadcastInputReaders: Creates the record readers for the extra broadcast inputs as configured by TaskConfig.getNumBroadcastInputs().
protected void initBroadcastInputsSerializers(int numBroadcastInputs): Creates all the serializers and iterators for the broadcast inputs.
protected void initialize
protected void initInputReaders(): Creates the record readers for the number of inputs as defined by getNumTaskInputs().
protected void initInputsSerializersAndComparators(int numInputs, int numComparators): Creates all the serializers and comparators.
protected void initLocalStrategies(int numInputs): NOTE: This method must be invoked after the invocation of initInputReaders() and initInputsSerializersAndComparators(int, int)!
protected void initOutputs: Creates a writer for each output.
static <T> org.apache.flink.util.Collector<T> initOutputs(AbstractInvokable containingTask, org.apache.flink.util.UserCodeClassLoader cl, TaskConfig config, List<ChainedDriver<?, ?>> chainedTasksTarget, List<RecordWriter<?>> eventualOutputs, org.apache.flink.api.common.ExecutionConfig executionConfig, Map<String, org.apache.flink.api.common.accumulators.Accumulator<?, ?>> accumulatorMap): Creates a writer for each output.
protected S initStub
static <T> T instantiateUserCode(TaskConfig config, ClassLoader cl, Class<? super T> superClass): Instantiates a user code class from its definition in the task configuration.
void invoke(): The main work method.
static void logAndThrowException(Exception ex, AbstractInvokable parent): Prints an error message and throws the given exception.
static void openChainedTasks(List<ChainedDriver<?, ?>> tasks, AbstractInvokable parent): Opens all chained tasks, in the order in which they are stored in the list.
static void openUserCode(org.apache.flink.api.common.functions.Function stub, org.apache.flink.configuration.Configuration parameters): Opens the given stub using its RichFunction.open(OpenContext) method.
protected <X> void readAndSetBroadcastInput(int inputNum, String bcVarName, DistributedRuntimeUDFContext context, int superstep)
protected void releaseBroadcastVariables(String bcVarName, int superstep, DistributedRuntimeUDFContext context)
protected void resetAllInputs
protected void run()
protected void setLastOutputCollector(org.apache.flink.util.Collector<OT> newOutputCollector): Sets the last output Collector of the collector chain of this BatchTask.
Methods inherited from class org.apache.flink.runtime.jobgraph.tasks.AbstractInvokable
abortCheckpointOnBarrier, cleanUp, dispatchOperatorEvent, getCurrentNumberOfSubtasks, getEnvironment, getExecutionConfig, getIndexInSubtaskGroup, getJobConfiguration, getTaskConfiguration, getUserCodeClassLoader, isUsingNonBlockingInput, maybeInterruptOnCancel, notifyCheckpointAbortAsync, notifyCheckpointCompleteAsync, notifyCheckpointSubsumedAsync, restore, triggerCheckpointAsync, triggerCheckpointOnBarrier
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.runtime.operators.TaskContext
getExecutionConfig, getUserCodeClassLoader
-
Field Details
-
LOG
protected static final org.slf4j.Logger LOG -
driver
The driver that invokes the user code (the stub implementation). The central driver in this task (further drivers may be chained behind this driver). -
stub
The instantiated user code of this task's main operator (driver). May be null if the operator has no udf. -
runtimeUdfContext
The udf's runtime context. -
output
The collector that forwards the user code's results. May forward to a channel or to chained drivers within this task. -
eventualOutputs
The output writers for the data that this task forwards to the next task. The last driver (the central driver if no chained drivers exist, otherwise the last chained driver) writes its output to these writers.
-
inputReaders
The input readers of this task. -
broadcastInputReaders
The input readers for the configured broadcast variables for this task. -
inputIterators
protected org.apache.flink.util.MutableObjectIterator<?>[] inputIterators
The input readers, wrapped in iterators. These are the inputs prior to the local strategies, etc.
-
iterativeInputs
protected int[] iterativeInputs
The indices of the iterative inputs. Empty if the task is not iterative.
-
iterativeBroadcastInputs
protected int[] iterativeBroadcastInputs
The indices of the iterative broadcast inputs. Empty if none of the inputs is iterative.
-
localStrategies
The local strategies that are applied on the inputs. -
tempBarriers
The optional temp barriers on the inputs for deadlock breaking. They are optionally resettable.
-
resettableInputs
The resettable inputs in the case where no temp barrier is needed. -
inputs
protected org.apache.flink.util.MutableObjectIterator<?>[] inputs
The inputs to the operator. Returns the readers' data after the application of the local strategy and the temp-table barrier.
-
inputSerializers
protected org.apache.flink.api.common.typeutils.TypeSerializerFactory<?>[] inputSerializers
The serializers for the input data type.
-
broadcastInputSerializers
protected org.apache.flink.api.common.typeutils.TypeSerializerFactory<?>[] broadcastInputSerializers
The serializers for the broadcast input data types.
-
inputComparators
protected org.apache.flink.api.common.typeutils.TypeComparator<?>[] inputComparators
The comparators for the central driver.
-
config
The task configuration with the setup parameters. -
chainedTasks
A list of chained drivers, if there are any. -
running
protected volatile boolean running
The flag that tags the task as still running. Checked periodically to abort processing.
-
accumulatorMap
The accumulator map used in the RuntimeContext.
-
-
Constructor Details
-
BatchTask
Create an Invokable task and set its environment.
- Parameters:
environment - The environment assigned to this invokable.
-
-
Method Details
-
invoke
The main work method.
- Specified by:
invoke in interface TaskInvokable
- Specified by:
invoke in class AbstractInvokable
- Throws:
Exception
-
cancel
Description copied from interface: TaskInvokable
This method is called when a task is canceled either as a result of a user abort or an execution failure. It can be overridden to shut down the user code properly.
- Specified by:
cancel in interface TaskInvokable
- Overrides:
cancel in class AbstractInvokable
- Throws:
Exception
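The interplay between a cancel method and a periodically checked flag, such as the running field listed under Field Details, can be pictured with a minimal sketch. It does not use Flink classes: CancellableLoop, process, and the stopAt parameter are hypothetical names invented for this illustration, and only the use of a volatile boolean mirrors the field declaration on this page.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration of a volatile "running" flag; not a Flink class. */
class CancellableLoop {
    // volatile so that a write from a canceling thread is visible to the worker thread
    private volatile boolean running = true;

    /** Requests a stop; the loop notices it at its next check. */
    void cancel() {
        running = false;
    }

    /** Processes records until the input is exhausted or the flag is cleared. */
    int process(Iterator<String> input, String stopAt) {
        int processed = 0;
        while (running && input.hasNext()) {
            String record = input.next();
            if (record.equals(stopAt)) {
                cancel(); // simulates a cancellation that arrives mid-stream
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        CancellableLoop loop = new CancellableLoop();
        System.out.println(loop.process(List.of("a", "b", "c", "d").iterator(), "b"));
    }
}
```

In this sketch the record that triggers the cancellation is still counted, because the flag is only consulted at the top of the loop.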
-
initialize
- Throws:
Exception
-
readAndSetBroadcastInput
protected <X> void readAndSetBroadcastInput(int inputNum, String bcVarName, DistributedRuntimeUDFContext context, int superstep) throws IOException
- Throws:
IOException
-
releaseBroadcastVariables
protected void releaseBroadcastVariables(String bcVarName, int superstep, DistributedRuntimeUDFContext context) -
run
- Throws:
Exception
-
closeLocalStrategiesAndCaches
protected void closeLocalStrategiesAndCaches() -
getLastOutputCollector
- Returns:
- the last output collector in the collector chain
-
setLastOutputCollector
Sets the last output Collector of the collector chain of this BatchTask.
In case of chained tasks, the output collector of the last ChainedDriver is set. Otherwise it is the single collector of the BatchTask.
- Parameters:
newOutputCollector - new output collector to set as last collector
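To illustrate what replacing the last collector of a chain means, here is a small self-contained sketch of the chained case. MiniCollector, ChainedStage, and setDownstream are hypothetical stand-ins invented for this example; they are not Flink's Collector or ChainedDriver and make no claim about how BatchTask stores its chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for a collector; not Flink's Collector interface. */
interface MiniCollector<T> {
    void collect(T record);
}

/** Hypothetical chained stage: transforms a record and forwards it downstream. */
class ChainedStage<T> implements MiniCollector<T> {
    private final Function<T, T> fn;
    private MiniCollector<T> downstream;

    ChainedStage(Function<T, T> fn, MiniCollector<T> downstream) {
        this.fn = fn;
        this.downstream = downstream;
    }

    /** Replaces the collector at the end of this (one-stage) chain. */
    void setDownstream(MiniCollector<T> downstream) {
        this.downstream = downstream;
    }

    @Override
    public void collect(T record) {
        downstream.collect(fn.apply(record));
    }
}

class CollectorChainSketch {
    /** Sends two records through the chain, swapping the last collector in between. */
    static List<String> demo() {
        List<String> first = new ArrayList<>();
        List<String> second = new ArrayList<>();
        ChainedStage<String> stage = new ChainedStage<>(String::toUpperCase, first::add);
        stage.collect("a");
        stage.setDownstream(r -> second.add(r + "!")); // new last collector
        stage.collect("b");
        List<String> all = new ArrayList<>(first);
        all.addAll(second);
        return all;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Records emitted before the swap reach the old tail, records emitted afterwards reach the new one; the upstream stage is unaware of the change.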
-
getLastTasksConfig
-
initStub
- Throws:
Exception
-
initInputReaders
Creates the record readers for the number of inputs as defined by getNumTaskInputs(). This method requires that the task configuration, the driver, and the user-code class loader are set.
- Throws:
Exception
-
initBroadcastInputReaders
Creates the record readers for the extra broadcast inputs as configured by TaskConfig.getNumBroadcastInputs(). This method requires that the task configuration, the driver, and the user-code class loader are set.
- Throws:
Exception
-
initInputsSerializersAndComparators
protected void initInputsSerializersAndComparators(int numInputs, int numComparators)
Creates all the serializers and comparators.
-
initBroadcastInputsSerializers
protected void initBroadcastInputsSerializers(int numBroadcastInputs)
Creates all the serializers and iterators for the broadcast inputs.
-
initLocalStrategies
NOTE: This method must be invoked after the invocation of initInputReaders() and initInputsSerializersAndComparators(int, int)!
- Throws:
Exception
-
resetAllInputs
- Throws:
Exception
-
excludeFromReset
protected void excludeFromReset(int inputNum) -
createInputIterator
protected org.apache.flink.util.MutableObjectIterator<?> createInputIterator(MutableReader<?> inputReader, org.apache.flink.api.common.typeutils.TypeSerializerFactory<?> serializerFactory) -
getNumTaskInputs
protected int getNumTaskInputs() -
initOutputs
Creates a writer for each output. Creates an OutputCollector which forwards its input to all writers. The output collector applies the configured shipping strategies for each writer.
- Throws:
Exception
-
createRuntimeContext
public DistributedRuntimeUDFContext createRuntimeContext(org.apache.flink.metrics.groups.OperatorMetricGroup metrics) -
getTaskConfig
- Specified by:
getTaskConfig in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getTaskManagerInfo
- Specified by:
getTaskManagerInfo in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getMemoryManager
- Specified by:
getMemoryManager in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getIOManager
- Specified by:
getIOManager in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getStub
- Specified by:
getStub in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getOutputCollector
- Specified by:
getOutputCollector in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getContainingTask
- Specified by:
getContainingTask in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
formatLogString
- Specified by:
formatLogString in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getMetricGroup
public org.apache.flink.metrics.groups.OperatorMetricGroup getMetricGroup()
- Specified by:
getMetricGroup in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getInput
public <X> org.apache.flink.util.MutableObjectIterator<X> getInput(int index)
- Specified by:
getInput in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getInputSerializer
public <X> org.apache.flink.api.common.typeutils.TypeSerializerFactory<X> getInputSerializer(int index)
- Specified by:
getInputSerializer in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
getDriverComparator
public <X> org.apache.flink.api.common.typeutils.TypeComparator<X> getDriverComparator(int index)
- Specified by:
getDriverComparator in interface TaskContext<S extends org.apache.flink.api.common.functions.Function,OT>
-
constructLogString
Utility function that composes a string for logging purposes. The string includes the given message, the given name of the task and the index in its subtask group as well as the number of instances that exist in its subtask group.
- Parameters:
message - The main message for the log.
taskName - The name of the task.
parent - The task that contains the code producing the message.
- Returns:
- The string for logging.
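As a rough picture of how such a string could be put together, the following sketch combines the four pieces named in the description. The exact layout Flink produces is not shown on this page, so the separator, the parentheses, and the 1-based subtask numbering are assumptions; LogStrings and construct are hypothetical names, and the sketch takes the index and instance count directly instead of reading them from an AbstractInvokable.

```java
/** Hypothetical helper; the exact layout of Flink's log string is not shown on this page. */
class LogStrings {
    /**
     * Composes "message: taskName (i/n)", where i is the 1-based subtask index.
     * The separator and the 1-based numbering are assumptions made for this sketch.
     */
    static String construct(String message, String taskName, int subtaskIndex, int parallelism) {
        return message + ": " + taskName + " (" + (subtaskIndex + 1) + '/' + parallelism + ')';
    }

    public static void main(String[] args) {
        System.out.println(construct("Start task code", "Map (Tokenizer)", 0, 4));
    }
}
```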
-
logAndThrowException
Prints an error message and throws the given exception. If the exception is of the type ExceptionInChainedStubException then the chain of contained exceptions is followed until an exception of a different type is found.
- Parameters:
ex - The exception to be thrown.
parent - The parent task, whose information is included in the log message.
- Throws:
Exception - Always thrown.
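The unwrapping step described above can be sketched without Flink on the classpath. WrappedStubException is a hypothetical stand-in for ExceptionInChainedStubException, and its getWrapped accessor is an assumption made for this example rather than a documented method.

```java
/** Hypothetical wrapper type standing in for ExceptionInChainedStubException. */
class WrappedStubException extends RuntimeException {
    private final Exception wrapped;

    WrappedStubException(String taskName, Exception wrapped) {
        super("Exception in chained task '" + taskName + "'", wrapped);
        this.wrapped = wrapped;
    }

    Exception getWrapped() {
        return wrapped;
    }
}

class UnwrapSketch {
    /** Follows the chain of wrappers until an exception of a different type is found. */
    static Exception unwrap(Exception ex) {
        Exception current = ex;
        while (current instanceof WrappedStubException) {
            current = ((WrappedStubException) current).getWrapped();
        }
        return current;
    }

    public static void main(String[] args) {
        Exception root = new IllegalStateException("boom");
        Exception nested =
                new WrappedStubException("outer", new WrappedStubException("inner", root));
        System.out.println(unwrap(nested).getMessage());
    }
}
```

Unwrapping first means the caller sees the exception raised by the user code rather than the wrapper layers added by each chained stage.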
-
getOutputCollector
public static <T> org.apache.flink.util.Collector<T> getOutputCollector(AbstractInvokable task, TaskConfig config, ClassLoader cl, List<RecordWriter<?>> eventualOutputs, int outputOffset, int numOutputs) throws Exception
Creates the Collector for the given task, as described by the given configuration. The output collector contains the writers that forward the data to the different tasks that the given task is connected to. Each writer applies the partitioning as described in the configuration.
- Parameters:
task - The task that the output collector is created for.
config - The configuration describing the output shipping strategies.
cl - The classloader used to load user defined types.
eventualOutputs - The output writers that this task forwards to the next task for each output.
outputOffset - The offset at which to start getting the writers for the outputs.
numOutputs - The number of outputs described in the configuration.
- Returns:
- The OutputCollector that data produced in this task is submitted to.
- Throws:
Exception
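The relationship between one collector and several partitioning writers can be pictured with a toy model. MiniWriter, FanOutCollector, and the channel lists are hypothetical and far simpler than Flink's RecordWriter and OutputCollector; the hash and forward selectors stand in for shipping strategies and are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical writer that routes each record to one of its channels; not Flink's RecordWriter. */
class MiniWriter<T> {
    final List<List<T>> channels = new ArrayList<>();
    private final ToIntFunction<T> channelSelector;

    MiniWriter(int numChannels, ToIntFunction<T> channelSelector) {
        for (int i = 0; i < numChannels; i++) {
            channels.add(new ArrayList<>());
        }
        this.channelSelector = channelSelector;
    }

    /** Applies this writer's partitioning and appends the record to the chosen channel. */
    void emit(T record) {
        channels.get(channelSelector.applyAsInt(record)).add(record);
    }
}

/** Hypothetical collector that hands every record to all of its writers. */
class FanOutCollector<T> {
    private final List<MiniWriter<T>> writers;

    FanOutCollector(List<MiniWriter<T>> writers) {
        this.writers = writers;
    }

    void collect(T record) {
        for (MiniWriter<T> writer : writers) {
            writer.emit(record);
        }
    }
}

class OutputCollectorSketch {
    public static void main(String[] args) {
        // one writer partitions over two channels, the other forwards to a single channel
        MiniWriter<Integer> hashed = new MiniWriter<>(2, r -> Math.floorMod(r, 2));
        MiniWriter<Integer> forward = new MiniWriter<>(1, r -> 0);
        FanOutCollector<Integer> out = new FanOutCollector<>(List.of(hashed, forward));
        for (int i = 1; i <= 4; i++) {
            out.collect(i);
        }
        System.out.println(hashed.channels + " " + forward.channels);
    }
}
```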
-
initOutputs
public static <T> org.apache.flink.util.Collector<T> initOutputs(AbstractInvokable containingTask, org.apache.flink.util.UserCodeClassLoader cl, TaskConfig config, List<ChainedDriver<?, ?>> chainedTasksTarget, List<RecordWriter<?>> eventualOutputs, org.apache.flink.api.common.ExecutionConfig executionConfig, Map<String, org.apache.flink.api.common.accumulators.Accumulator<?, ?>> accumulatorMap) throws Exception
Creates a writer for each output. Creates an OutputCollector which forwards its input to all writers. The output collector applies the configured shipping strategy.
- Throws:
Exception
-
openUserCode
public static void openUserCode(org.apache.flink.api.common.functions.Function stub, org.apache.flink.configuration.Configuration parameters) throws Exception
Opens the given stub using its RichFunction.open(OpenContext) method. If the open call produces an exception, a new exception with a standard error message is created, using the encountered exception as its cause.
- Parameters:
stub - The user code instance to be opened.
parameters - The parameters supplied to the user code.
- Throws:
Exception - Thrown, if the user code's open method produces an exception.
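The wrap-and-rethrow behavior described above can be sketched in a few lines. Openable is a hypothetical stand-in for a rich function, the sketch omits the parameters argument, and the wording of the error message is invented for this example; the page does not show the text Flink actually uses.

```java
/** Hypothetical stand-in for a rich user function; not a Flink interface. */
interface Openable {
    void open() throws Exception;
}

class OpenUserCodeSketch {
    /** Opens the stub; any failure is rethrown with a uniform message and the original as cause. */
    static void open(Object stub) throws Exception {
        if (!(stub instanceof Openable)) {
            return; // plain functions have nothing to open
        }
        try {
            ((Openable) stub).open();
        } catch (Throwable t) {
            // message text is made up for this sketch
            throw new Exception("The user code's open method failed: " + t.getMessage(), t);
        }
    }

    public static void main(String[] args) {
        Openable failing = () -> { throw new IllegalStateException("missing file"); };
        try {
            open(failing);
        } catch (Exception e) {
            System.out.println(e.getMessage() + " / cause: " + e.getCause().getClass().getSimpleName());
        }
    }
}
```

Keeping the original exception as the cause preserves the user code's stack trace while giving every failure the same outer message.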
-
closeUserCode
public static void closeUserCode(org.apache.flink.api.common.functions.Function stub) throws Exception
Closes the given stub using its RichFunction.close() method. If the close call produces an exception, a new exception with a standard error message is created, using the encountered exception as its cause.
- Parameters:
stub - The user code instance to be closed.
- Throws:
Exception - Thrown, if the user code's close method produces an exception.
-
openChainedTasks
public static void openChainedTasks(List<ChainedDriver<?, ?>> tasks, AbstractInvokable parent) throws Exception
Opens all chained tasks, in the order in which they are stored in the list. The opening process creates a standardized log info message.
- Parameters:
tasks - The tasks to be opened.
parent - The parent task, used to obtain parameters to include in the log message.
- Throws:
Exception - Thrown, if the opening encounters an exception.
-
closeChainedTasks
public static void closeChainedTasks(List<ChainedDriver<?, ?>> tasks, AbstractInvokable parent) throws Exception
Closes all chained tasks, in the order in which they are stored in the list. The closing process creates a standardized log info message.
- Parameters:
tasks - The tasks to be closed.
parent - The parent task, used to obtain parameters to include in the log message.
- Throws:
Exception - Thrown, if the closing encounters an exception.
-
cancelChainedTasks
Cancels all tasks via their ChainedDriver.cancelTask() method. Any exception or error that occurs is suppressed, so that the cancel method of every task is invoked in all cases.
- Parameters:
tasks - The tasks to be canceled.
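A minimal sketch of this suppress-and-continue pattern follows. Cancellable is a hypothetical interface standing in for ChainedDriver, and the attempt counter exists only to make the behavior observable in the example.

```java
import java.util.List;

/** Hypothetical stand-in for a chained driver; not a Flink class. */
interface Cancellable {
    void cancelTask();
}

class CancelAllSketch {
    /** Calls cancelTask() on every task; failures are swallowed so later tasks are still reached. */
    static int cancelAll(List<Cancellable> tasks) {
        int attempted = 0;
        for (Cancellable task : tasks) {
            attempted++;
            try {
                task.cancelTask();
            } catch (Throwable t) {
                // deliberately ignored: one failing task must not block the others
            }
        }
        return attempted;
    }

    public static void main(String[] args) {
        int[] cancelled = {0};
        List<Cancellable> tasks = List.<Cancellable>of(
                () -> cancelled[0]++,
                () -> { throw new IllegalStateException("cancel failed"); },
                () -> cancelled[0]++);
        System.out.println(cancelAll(tasks) + " attempted, " + cancelled[0] + " succeeded");
    }
}
```

Catching Throwable rather than Exception matches the description's mention of both exceptions and errors.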
-
instantiateUserCode
public static <T> T instantiateUserCode(TaskConfig config, ClassLoader cl, Class<? super T> superClass)
Instantiates a user code class from its definition in the task configuration. The class is instantiated without arguments using the nullary constructor. Instantiation will fail if this constructor does not exist or is not public.
- Type Parameters:
T - The generic type of the user code class.
- Parameters:
config - The task configuration containing the class description.
cl - The class loader to be used to load the class.
superClass - The super class that the user code class extends or implements, for type checking.
- Returns:
- An instance of the user code class.
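The general technique, loading a class by name, checking it against an expected super type, and calling its public no-argument constructor, can be sketched with plain Java reflection. This is not Flink's implementation: InstantiationSketch and instantiate are hypothetical, and the sketch takes a class name directly, whereas the method above reads the class description from a TaskConfig.

```java
/** Hypothetical helper; mirrors the idea of the method above, not Flink's implementation. */
class InstantiationSketch {
    /** Loads className, checks it against superClass, and calls its public no-arg constructor. */
    static <T> T instantiate(String className, ClassLoader cl, Class<? super T> superClass) {
        try {
            Class<?> raw = Class.forName(className, true, cl);
            if (!superClass.isAssignableFrom(raw)) {
                throw new IllegalArgumentException(
                        className + " is not a subtype of " + superClass.getName());
            }
            // getConstructor() only finds public constructors
            @SuppressWarnings("unchecked")
            T instance = (T) raw.getConstructor().newInstance();
            return instance;
        } catch (ReflectiveOperationException e) {
            // covers a missing class, a missing or non-public constructor, and constructor failures
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        CharSequence s = instantiate("java.lang.StringBuilder",
                InstantiationSketch.class.getClassLoader(), CharSequence.class);
        System.out.println(s.length());
    }
}
```

A class such as java.lang.Integer, which has no public no-argument constructor, makes this sketch fail with an IllegalStateException, which corresponds to the failure case named in the description.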
-
clearWriters
-
clearReaders
-