Class AdaptiveBatchScheduler
java.lang.Object
org.apache.flink.runtime.scheduler.SchedulerBase
org.apache.flink.runtime.scheduler.DefaultScheduler
org.apache.flink.runtime.scheduler.adaptivebatch.AdaptiveBatchScheduler
- All Implemented Interfaces:
AutoCloseable, CheckpointScheduling, JobGraphUpdateListener, GlobalFailureHandler, SchedulerNG, SchedulerOperations, org.apache.flink.util.AutoCloseableAsync
This scheduler decides the parallelism of each JobVertex according to the volume of data it consumes. It uses a
dynamically built ExecutionGraph for this purpose.
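The idea of sizing a vertex by the data it consumes can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the class `DataVolumeParallelismSketch`, the method `decideParallelism`, and the "one subtask per fixed number of bytes, clamped to a range" rule are hypothetical and are not taken from Flink. The actual decision is delegated to the `VertexParallelismAndInputInfosDecider` passed to the constructor, whose logic this page does not describe.

```java
// Illustrative sketch only; this is NOT Flink's VertexParallelismAndInputInfosDecider.
final class DataVolumeParallelismSketch {

    /**
     * Hypothetical rule: one subtask per bytesPerTask of consumed data,
     * clamped to the range [minParallelism, maxParallelism].
     */
    static int decideParallelism(
            long consumedBytes, long bytesPerTask, int minParallelism, int maxParallelism) {
        if (consumedBytes < 0 || bytesPerTask <= 0) {
            throw new IllegalArgumentException("data sizes must be non-negative / positive");
        }
        if (minParallelism < 1 || maxParallelism < minParallelism) {
            throw new IllegalArgumentException("invalid parallelism bounds");
        }
        // Ceiling division without floating point.
        long wanted = (consumedBytes + bytesPerTask - 1) / bytesPerTask;
        return (int) Math.max(minParallelism, Math.min(maxParallelism, wanted));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 100 MiB of input at 16 MiB per subtask needs ceil(6.25) = 7 subtasks.
        System.out.println(decideParallelism(100 * mib, 16 * mib, 1, 128)); // prints 7
    }
}
```

The clamp mirrors the role that a bound such as `defaultMaxParallelism` plays in the constructor: a decided parallelism is only useful if it stays within the limits configured for the job.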
-
Field Summary
Fields inherited from class org.apache.flink.runtime.scheduler.DefaultScheduler
executionDeployer, executionSlotAllocator, failoverStrategy, log, schedulingStrategy, shuffleMaster
Fields inherited from class org.apache.flink.runtime.scheduler.SchedulerBase
executionVertexVersioner, inputsLocationsRetriever, jobInfo, jobManagerJobMetricGroup, operatorCoordinatorHandler, stateLocationRetriever
-
Constructor Summary
Constructors
Constructor  Description
AdaptiveBatchScheduler(org.slf4j.Logger log, AdaptiveExecutionHandler adaptiveExecutionHandler, Executor ioExecutor, org.apache.flink.configuration.Configuration jobMasterConfiguration, Consumer<org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor> startUpAction, org.apache.flink.util.concurrent.ScheduledExecutor delayExecutor, ClassLoader userCodeLoader, CheckpointsCleaner checkpointsCleaner, CheckpointRecoveryFactory checkpointRecoveryFactory, JobManagerJobMetricGroup jobManagerJobMetricGroup, SchedulingStrategyFactory schedulingStrategyFactory, FailoverStrategy.Factory failoverStrategyFactory, RestartBackoffTimeStrategy restartBackoffTimeStrategy, ExecutionOperations executionOperations, ExecutionVertexVersioner executionVertexVersioner, ExecutionSlotAllocatorFactory executionSlotAllocatorFactory, long initializationTimestamp, org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor mainThreadExecutor, JobStatusListener jobStatusListener, Collection<org.apache.flink.core.failure.FailureEnricher> failureEnrichers, ExecutionGraphFactory executionGraphFactory, ShuffleMaster<?> shuffleMaster, Duration rpcTimeout, VertexParallelismAndInputInfosDecider vertexParallelismAndInputInfosDecider, int defaultMaxParallelism, BlocklistOperations blocklistOperations, org.apache.flink.configuration.JobManagerOptions.HybridPartitionDataConsumeConstraint hybridPartitionDataConsumeConstraint, BatchJobRecoveryHandler jobRecoveryHandler, ExecutionPlanSchedulingContext executionPlanSchedulingContext)
-
Method Summary
Modifier and Type  Method  Description
void allocateSlotsAndDeploy(List<ExecutionVertexID> verticesToDeploy) - Allocate slots and deploy the vertex when slots are returned.
static int computeMaxParallelism(int parallelism, int defaultMaxParallelism)
static VertexParallelismStore computeVertexParallelismStoreForDynamicGraph(Iterable<JobVertex> vertices, int defaultMaxParallelism) - Compute the VertexParallelismStore for all given vertices in a dynamic graph, which will set defaults and ensure that the returned store contains valid parallelisms, with the configured default max parallelism.
protected MarkPartitionFinishedStrategy getMarkPartitionFinishedStrategy()
protected void handleTaskFailure(Execution failedExecution, Throwable error)
void initializeVerticesIfPossible()
protected void maybeRestartTasks(FailureHandlingResult failureHandlingResult) - Modifies the vertices which need to be restarted.
void onNewJobVerticesAdded(List<JobVertex> newVertices, int pendingOperatorsCount) - Invoked when new JobVertex instances are added to the JobGraph of a specific job.
protected void onTaskFailed(Execution execution)
protected void onTaskFinished(Execution execution, IOMetrics ioMetrics)
protected void resetForNewExecution(ExecutionVertexID executionVertexId)
protected void resetForNewExecutions(Collection<ExecutionVertexID> vertices)
protected void startSchedulingInternal()
Methods inherited from class org.apache.flink.runtime.scheduler.DefaultScheduler
cancelAllPendingSlotRequestsForVertex, cancelAllPendingSlotRequestsInternal, cancelExecution, createFailureHandlingResultSnapshot, getNumberOfRescales, getNumberOfRestarts, getUserCodeLoader, handleGlobalFailure, notifyCoordinatorsAboutTaskFailure, recordTaskFailure
Methods inherited from class org.apache.flink.runtime.scheduler.SchedulerBase
acknowledgeCheckpoint, archiveFromFailureHandlingResult, archiveGlobalFailure, cancel, computeVertexParallelismStore, computeVertexParallelismStore, computeVertexParallelismStore, computeVertexParallelismStore, declineCheckpoint, deliverCoordinationRequestToCoordinator, deliverOperatorEventToCoordinator, failJob, getDefaultMaxParallelism, getDefaultMaxParallelism, getExceptionHistory, getExecutionGraph, getExecutionJobVertex, getExecutionVertex, getJobGraph, getJobId, getJobTerminationFuture, getMainThreadExecutor, getResultPartitionAvailabilityChecker, getSchedulingTopology, notifyEndOfData, notifyKvStateRegistered, notifyKvStateUnregistered, registerJobMetrics, reportCheckpointMetrics, reportInitializationMetrics, requestCheckpointStats, requestJob, requestJobStatus, requestKvStateLocation, requestNextInputSplit, requestPartitionState, restoreState, setGlobalFailureCause, startCheckpointScheduler, startScheduling, stopCheckpointScheduler, stopWithSavepoint, transitionExecutionGraphState, transitionToRunning, transitionToScheduled, triggerCheckpoint, triggerSavepoint, updateAccumulators, updateTaskExecutionState
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.util.AutoCloseableAsync
close
Methods inherited from interface org.apache.flink.runtime.scheduler.SchedulerNG
requestJobResourceRequirements, updateJobResourceRequirements, updateTaskExecutionState
-
Constructor Details
-
AdaptiveBatchScheduler
public AdaptiveBatchScheduler(org.slf4j.Logger log, AdaptiveExecutionHandler adaptiveExecutionHandler, Executor ioExecutor, org.apache.flink.configuration.Configuration jobMasterConfiguration, Consumer<org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor> startUpAction, org.apache.flink.util.concurrent.ScheduledExecutor delayExecutor, ClassLoader userCodeLoader, CheckpointsCleaner checkpointsCleaner, CheckpointRecoveryFactory checkpointRecoveryFactory, JobManagerJobMetricGroup jobManagerJobMetricGroup, SchedulingStrategyFactory schedulingStrategyFactory, FailoverStrategy.Factory failoverStrategyFactory, RestartBackoffTimeStrategy restartBackoffTimeStrategy, ExecutionOperations executionOperations, ExecutionVertexVersioner executionVertexVersioner, ExecutionSlotAllocatorFactory executionSlotAllocatorFactory, long initializationTimestamp, org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor mainThreadExecutor, JobStatusListener jobStatusListener, Collection<org.apache.flink.core.failure.FailureEnricher> failureEnrichers, ExecutionGraphFactory executionGraphFactory, ShuffleMaster<?> shuffleMaster, Duration rpcTimeout, VertexParallelismAndInputInfosDecider vertexParallelismAndInputInfosDecider, int defaultMaxParallelism, BlocklistOperations blocklistOperations, org.apache.flink.configuration.JobManagerOptions.HybridPartitionDataConsumeConstraint hybridPartitionDataConsumeConstraint, BatchJobRecoveryHandler jobRecoveryHandler, ExecutionPlanSchedulingContext executionPlanSchedulingContext) throws Exception
- Throws:
Exception
-
-
Method Details
-
onNewJobVerticesAdded
public void onNewJobVerticesAdded(List<JobVertex> newVertices, int pendingOperatorsCount) throws Exception
Description copied from interface: JobGraphUpdateListener
Invoked when new JobVertex instances are added to the JobGraph of a specific job. This allows interested components to react to the addition of new vertices to the job topology.
- Specified by:
onNewJobVerticesAdded in interface JobGraphUpdateListener
- Parameters:
newVertices - A list of newly added JobVertex instances.
pendingOperatorsCount - The number of pending operators.
- Throws:
Exception
-
startSchedulingInternal
protected void startSchedulingInternal()
- Overrides:
startSchedulingInternal in class DefaultScheduler
-
maybeRestartTasks
protected void maybeRestartTasks(FailureHandlingResult failureHandlingResult)
Modifies the vertices which need to be restarted. If any task that needs to be restarted belongs to a job vertex with unrecovered operator coordinators, all tasks within that job vertex need to be restarted once.
- Overrides:
maybeRestartTasks in class DefaultScheduler
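The expansion rule described for maybeRestartTasks can be sketched in isolation. The sketch below is an assumption-laden illustration, not the scheduler's code: `RestartExpansionSketch`, `TaskId`, `expandTasksToRestart`, and the string vertex ids are all hypothetical stand-ins, and the "once" bookkeeping (remembering that a vertex has already been fully restarted) is left out.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; all types here are hypothetical, not Flink classes.
final class RestartExpansionSketch {

    /** Hypothetical task identifier: owning job vertex plus subtask index (Java 16+ record). */
    record TaskId(String jobVertexId, int subtaskIndex) {}

    /**
     * Returns the tasks to restart, widened so that every job vertex with
     * unrecovered operator coordinators is restarted as a whole.
     */
    static Set<TaskId> expandTasksToRestart(
            Set<TaskId> tasksToRestart,
            Set<String> verticesWithUnrecoveredCoordinators,
            Map<String, Integer> parallelismByVertex) {
        Set<TaskId> result = new LinkedHashSet<>(tasksToRestart);
        for (TaskId task : tasksToRestart) {
            String vertex = task.jobVertexId();
            if (verticesWithUnrecoveredCoordinators.contains(vertex)) {
                // Add every subtask of the affected vertex, not just the failed one.
                int parallelism = parallelismByVertex.getOrDefault(vertex, 0);
                for (int i = 0; i < parallelism; i++) {
                    result.add(new TaskId(vertex, i));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<TaskId> expanded = expandTasksToRestart(
                Set.of(new TaskId("A", 0), new TaskId("B", 1)),
                Set.of("A"),
                Map.of("A", 3, "B", 2));
        // A/0, A/1, A/2 (whole vertex) plus B/1 (only the requested task).
        System.out.println(expanded.size()); // prints 4
    }
}
```

Vertex "B" has no unrecovered coordinator in this example, so only the task that was originally selected is restarted there.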
-
resetForNewExecutions
protected void resetForNewExecutions(Collection<ExecutionVertexID> vertices)
- Overrides:
resetForNewExecutions in class SchedulerBase
-
closeAsync
- Specified by:
closeAsync in interface org.apache.flink.util.AutoCloseableAsync
- Overrides:
closeAsync in class SchedulerBase
-
onTaskFinished
protected void onTaskFinished(Execution execution, IOMetrics ioMetrics)
- Overrides:
onTaskFinished in class DefaultScheduler
-
onTaskFailed
protected void onTaskFailed(Execution execution)
- Overrides:
onTaskFailed in class DefaultScheduler
-
handleTaskFailure
protected void handleTaskFailure(Execution failedExecution, Throwable error)
- Overrides:
handleTaskFailure in class DefaultScheduler
-
allocateSlotsAndDeploy
public void allocateSlotsAndDeploy(List<ExecutionVertexID> verticesToDeploy)
Description copied from interface: SchedulerOperations
Allocate slots and deploy the vertex when slots are returned. Vertices will be deployed only after all of them have been assigned slots. The given order will be respected, i.e. tasks with smaller indices will be deployed earlier. Only vertices in CREATED state will be accepted; scheduling a vertex that is not in CREATED state results in an error.
- Specified by:
allocateSlotsAndDeploy in interface SchedulerOperations
- Overrides:
allocateSlotsAndDeploy in class DefaultScheduler
- Parameters:
verticesToDeploy - The execution vertices to deploy
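The contract stated for allocateSlotsAndDeploy (CREATED vertices only, deployment after all slots are assigned, given order respected) can be modeled with a toy example. This is a sketch under stated assumptions: `DeployContractSketch`, its `State` enum, its `Vertex` class, and `deployAll` are hypothetical, slot allocation is reduced to a state change, and nothing here reflects how Flink actually allocates slots.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; a toy model of the documented contract, not Flink code.
final class DeployContractSketch {

    enum State { CREATED, SCHEDULED, DEPLOYING }

    /** Hypothetical stand-in for an execution vertex. */
    static final class Vertex {
        final String id;
        State state = State.CREATED;

        Vertex(String id) {
            this.id = id;
        }
    }

    /** Deploys the given vertices and returns their ids in deployment order. */
    static List<String> deployAll(List<Vertex> verticesToDeploy) {
        // Reject the whole batch before touching any state.
        for (Vertex v : verticesToDeploy) {
            if (v.state != State.CREATED) {
                throw new IllegalStateException("Vertex " + v.id + " is in state " + v.state);
            }
        }
        // Phase 1: every vertex is assigned a slot before any deployment starts.
        for (Vertex v : verticesToDeploy) {
            v.state = State.SCHEDULED;
        }
        // Phase 2: deploy in the given order, smaller indices first.
        List<String> deployed = new ArrayList<>();
        for (Vertex v : verticesToDeploy) {
            v.state = State.DEPLOYING;
            deployed.add(v.id);
        }
        return deployed;
    }

    public static void main(String[] args) {
        List<Vertex> vertices = List.of(new Vertex("a"), new Vertex("b"));
        System.out.println(deployAll(vertices)); // prints [a, b]
    }
}
```

Calling `deployAll` a second time on the same vertices throws, because they are no longer in the CREATED state.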
-
resetForNewExecution
protected void resetForNewExecution(ExecutionVertexID executionVertexId)
- Overrides:
resetForNewExecution in class SchedulerBase
-
getMarkPartitionFinishedStrategy
- Overrides:
getMarkPartitionFinishedStrategy in class SchedulerBase
-
computeDynamicSourceParallelism
-
initializeVerticesIfPossible
@VisibleForTesting public void initializeVerticesIfPossible()
-
computeVertexParallelismStoreForDynamicGraph
@VisibleForTesting public static VertexParallelismStore computeVertexParallelismStoreForDynamicGraph(Iterable<JobVertex> vertices, int defaultMaxParallelism)
Compute the VertexParallelismStore for all given vertices in a dynamic graph, which will set defaults and ensure that the returned store contains valid parallelisms, with the configured default max parallelism.
- Parameters:
vertices - the vertices to compute parallelism for
defaultMaxParallelism - the global default max parallelism
- Returns:
the computed parallelism store
-
computeMaxParallelism
public static int computeMaxParallelism(int parallelism, int defaultMaxParallelism)
-