Class ActiveResourceManager<WorkerType extends ResourceIDRetrievable>
java.lang.Object
org.apache.flink.runtime.rpc.RpcEndpoint
org.apache.flink.runtime.rpc.FencedRpcEndpoint<ResourceManagerId>
org.apache.flink.runtime.resourcemanager.ResourceManager<WorkerType>
org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager<WorkerType>
- All Implemented Interfaces:
AutoCloseable, BlocklistListener, ClusterPartitionManager, ResourceEventHandler<WorkerType>, ResourceManagerGateway, org.apache.flink.runtime.rpc.FencedRpcGateway<ResourceManagerId>, org.apache.flink.runtime.rpc.RpcGateway, DelegationTokenManager.Listener, org.apache.flink.util.AutoCloseableAsync
public class ActiveResourceManager<WorkerType extends ResourceIDRetrievable>
extends ResourceManager<WorkerType>
implements ResourceEventHandler<WorkerType>
An active implementation of ResourceManager.
This resource manager actively requests resources from, and releases them to, external resource
management frameworks. Given different ResourceManagerDriver implementations, it can work with
various frameworks.
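The split between a framework-agnostic manager and a pluggable driver can be sketched as follows. This is a minimal illustration, not Flink code: SketchDriver, InMemoryDriver, SketchActiveManager and scaleTo are hypothetical names, and the real ResourceManagerDriver interface has a different, richer contract.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a driver; NOT Flink's ResourceManagerDriver.
interface SketchDriver {
    String requestResource(int cpuCores);   // returns an id for the new worker
    void releaseResource(String workerId);
}

// A toy "external framework" that only records which workers currently exist.
class InMemoryDriver implements SketchDriver {
    final List<String> live = new ArrayList<>();
    private int counter = 0;

    @Override
    public String requestResource(int cpuCores) {
        String id = "worker-" + (counter++);
        live.add(id);
        return id;
    }

    @Override
    public void releaseResource(String workerId) {
        live.remove(workerId);
    }
}

// Framework-agnostic manager: every framework interaction goes through the driver.
class SketchActiveManager {
    private final SketchDriver driver;
    private final List<String> workers = new ArrayList<>();

    SketchActiveManager(SketchDriver driver) {
        this.driver = driver;
    }

    // Actively requests or releases workers until the target count is reached.
    void scaleTo(int target, int cpuCores) {
        while (workers.size() < target) {
            workers.add(driver.requestResource(cpuCores));
        }
        while (workers.size() > target) {
            driver.releaseResource(workers.remove(workers.size() - 1));
        }
    }

    int workerCount() {
        return workers.size();
    }
}
```

Swapping InMemoryDriver for another SketchDriver implementation would leave SketchActiveManager unchanged, which is the point of the driver abstraction described above.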
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.flink.runtime.rpc.RpcEndpoint
org.apache.flink.runtime.rpc.RpcEndpoint.MainThreadExecutor
-
Field Summary
Fields
protected final org.apache.flink.configuration.Configuration flinkConfig
Fields inherited from class org.apache.flink.runtime.resourcemanager.ResourceManager
blocklistHandler, ioExecutor, RESOURCE_MANAGER_NAME, resourceManagerMetricGroup
Fields inherited from class org.apache.flink.runtime.rpc.RpcEndpoint
log, rpcServer
-
Constructor Summary
Constructors
ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver, org.apache.flink.configuration.Configuration flinkConfig, org.apache.flink.runtime.rpc.RpcService rpcService, UUID leaderSessionId, ResourceID resourceId, HeartbeatServices heartbeatServices, DelegationTokenManager delegationTokenManager, SlotManager slotManager, ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory, BlocklistHandler.Factory blocklistHandlerFactory, JobLeaderIdService jobLeaderIdService, ClusterInformation clusterInformation, org.apache.flink.runtime.rpc.FatalErrorHandler fatalErrorHandler, ResourceManagerMetricGroup resourceManagerMetricGroup, ThresholdMeter startWorkerFailureRater, Duration retryInterval, Duration workerRegistrationTimeout, Duration previousWorkerRecoverTimeout, Executor ioExecutor)
-
Method Summary
Methods
void declareResourceNeeded(Collection<ResourceDeclaration> resourceDeclarations)
getReadyToServeFuture
    Get the ready to serve future of the resource manager.
protected ResourceAllocator getResourceAllocator
protected Optional<WorkerType> getWorkerNodeIfAcceptRegistration(ResourceID resourceID)
    Get worker node if the worker resource is accepted.
protected void initialize
    Initializes the framework specific components.
protected void internalDeregisterApplication(ApplicationStatus finalStatus, String optionalDiagnostics)
    The framework specific code to deregister the application.
void onError
    Notifies that an error has occurred that the process cannot proceed.
void onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers)
    Notifies that workers of previous attempt have been recovered from the external resource manager.
protected void onWorkerRegistered(WorkerType worker, WorkerResourceSpec workerResourceSpec)
void onWorkerTerminated(ResourceID resourceId, String diagnostics)
    Notifies that the worker has been terminated.
protected void registerMetrics()
void requestNewWorker(WorkerResourceSpec workerResourceSpec)
    Allocates a resource using the worker resource specification.
protected void terminate
    Terminates the framework specific components.
Methods inherited from class org.apache.flink.runtime.resourcemanager.ResourceManager
closeJobManagerConnection, closeTaskManagerConnection, declareRequiredResources, deregisterApplication, disconnectJobManager, disconnectTaskManager, getClusterPartitionsShuffleDescriptors, getInstanceIdByResourceId, getNumberOfRegisteredTaskManagers, getStartedFuture, getWorkerByInstanceId, heartbeatFromJobManager, heartbeatFromTaskManager, jobLeaderLostLeadership, listDataSets, notifyNewBlockedNodes, notifySlotAvailable, onFatalError, onNewTokensObtained, onStart, onStop, registerJobMaster, registerTaskExecutor, releaseClusterPartitions, removeJob, reportClusterPartitions, requestProfiling, requestResourceOverview, requestTaskExecutorThreadInfoGateway, requestTaskManagerDetailsInfo, requestTaskManagerFileUploadByNameAndType, requestTaskManagerFileUploadByType, requestTaskManagerInfo, requestTaskManagerLogList, requestTaskManagerMetricQueryServiceAddresses, requestTaskManagerProfilingList, requestThreadDump, sendSlotReport, setFailUnfulfillableRequest, stopWorkerIfSupported
Methods inherited from class org.apache.flink.runtime.rpc.FencedRpcEndpoint
getFencingToken
Methods inherited from class org.apache.flink.runtime.rpc.RpcEndpoint
callAsync, closeAsync, getAddress, getEndpointId, getHostname, getMainThreadExecutor, getMainThreadExecutor, getRpcService, getSelfGateway, getTerminationFuture, internalCallOnStart, internalCallOnStop, isRunning, registerResource, runAsync, scheduleRunAsync, scheduleRunAsync, start, stop, unregisterResource, validateRunsInMainThread
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.util.AutoCloseableAsync
close
Methods inherited from interface org.apache.flink.runtime.rpc.FencedRpcGateway
getFencingToken
Methods inherited from interface org.apache.flink.runtime.rpc.RpcGateway
getAddress, getHostname
-
Field Details
-
flinkConfig
protected final org.apache.flink.configuration.Configuration flinkConfig
-
-
Constructor Details
-
ActiveResourceManager
public ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver, org.apache.flink.configuration.Configuration flinkConfig, org.apache.flink.runtime.rpc.RpcService rpcService, UUID leaderSessionId, ResourceID resourceId, HeartbeatServices heartbeatServices, DelegationTokenManager delegationTokenManager, SlotManager slotManager, ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory, BlocklistHandler.Factory blocklistHandlerFactory, JobLeaderIdService jobLeaderIdService, ClusterInformation clusterInformation, org.apache.flink.runtime.rpc.FatalErrorHandler fatalErrorHandler, ResourceManagerMetricGroup resourceManagerMetricGroup, ThresholdMeter startWorkerFailureRater, Duration retryInterval, Duration workerRegistrationTimeout, Duration previousWorkerRecoverTimeout, Executor ioExecutor)
-
-
Method Details
-
initialize
Description copied from class: ResourceManager
Initializes the framework specific components.
- Specified by:
initialize in class ResourceManager<WorkerType extends ResourceIDRetrievable>
- Throws:
ResourceManagerException - which occurs during initialization and causes the resource manager to fail.
-
terminate
Description copied from class: ResourceManager
Terminates the framework specific components.
- Specified by:
terminate in class ResourceManager<WorkerType extends ResourceIDRetrievable>
- Throws:
ResourceManagerException
-
internalDeregisterApplication
protected void internalDeregisterApplication(ApplicationStatus finalStatus, @Nullable String optionalDiagnostics) throws ResourceManagerException
Description copied from class: ResourceManager
The framework specific code to deregister the application. This should report the application's final status and shut down the resource manager cleanly.
This method also needs to make sure all pending containers that are not registered yet are returned.
- Specified by:
internalDeregisterApplication in class ResourceManager<WorkerType extends ResourceIDRetrievable>
- Parameters:
finalStatus - The application status to report.
optionalDiagnostics - A diagnostics message or null.
- Throws:
ResourceManagerException - if the application could not be shut down.
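The requirement that pending, not yet registered containers are returned on deregistration can be illustrated with a small sketch. Everything here is hypothetical (DeregisterSketch, its fields, and the use of plain strings for status and container ids); it only shows the bookkeeping, not how ActiveResourceManager or a real driver implements it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical bookkeeping sketch; NOT the Flink implementation.
class DeregisterSketch {
    final Set<String> pending = new LinkedHashSet<>();     // requested, not yet registered
    final Set<String> registered = new LinkedHashSet<>();  // already registered workers
    final List<String> returned = new ArrayList<>();       // handed back to the framework
    String reportedStatus;                                 // null until deregistration

    void deregister(String finalStatus, String optionalDiagnostics) {
        // Return every container that was requested but never registered.
        returned.addAll(pending);
        pending.clear();
        // Report the final status; diagnostics are optional and may be null.
        reportedStatus = optionalDiagnostics == null
                ? finalStatus
                : finalStatus + ": " + optionalDiagnostics;
    }
}
```

In this sketch registered workers are left untouched; how they are shut down is outside what the description above specifies.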
-
getWorkerNodeIfAcceptRegistration
Description copied from class: ResourceManager
Get worker node if the worker resource is accepted.
- Specified by:
getWorkerNodeIfAcceptRegistration in class ResourceManager<WorkerType extends ResourceIDRetrievable>
- Parameters:
resourceID - The worker resource id
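The Optional<WorkerType> return type means a caller has to handle both the accepted and the rejected case. A minimal sketch, assuming (this page does not state the acceptance criterion) that a registration is accepted only when the resource id belongs to a worker the manager is already tracking; RegistrationSketch, track and the String types are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; the acceptance rule is an assumption, not taken from Flink.
class RegistrationSketch {
    // resource id -> worker node the manager requested earlier
    private final Map<String, String> knownWorkers = new HashMap<>();

    void track(String resourceId, String workerNode) {
        knownWorkers.put(resourceId, workerNode);
    }

    // Empty Optional means "registration not accepted" in this sketch.
    Optional<String> getWorkerNodeIfAcceptRegistration(String resourceId) {
        return Optional.ofNullable(knownWorkers.get(resourceId));
    }
}
```

A caller could then branch with `ifPresent` or `isPresent()` instead of checking for null.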
-
declareResourceNeeded
@VisibleForTesting public void declareResourceNeeded(Collection<ResourceDeclaration> resourceDeclarations)
-
onWorkerRegistered
- Overrides:
onWorkerRegistered in class ResourceManager<WorkerType extends ResourceIDRetrievable>
-
registerMetrics
protected void registerMetrics()
- Overrides:
registerMetrics in class ResourceManager<WorkerType extends ResourceIDRetrievable>
-
onPreviousAttemptWorkersRecovered
Description copied from interface: ResourceEventHandler
Notifies that workers of previous attempt have been recovered from the external resource manager.
- Specified by:
onPreviousAttemptWorkersRecovered in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
- Parameters:
recoveredWorkers - Collection of worker nodes, in the deployment specific type.
-
onWorkerTerminated
Description copied from interface: ResourceEventHandler
Notifies that the worker has been terminated.
- Specified by:
onWorkerTerminated in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
- Parameters:
resourceId - Identifier of the terminated worker.
diagnostics - Diagnostic message about the worker termination.
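One plausible way a handler could react to a termination notification is sketched below. It is an assumption, not something this page confirms, that a terminated worker is dropped from tracking and that a replacement with the same specification is queued; TerminationSketch and all of its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical handler sketch; the replacement policy is an assumption.
class TerminationSketch {
    final Map<String, String> workers = new HashMap<>();    // resource id -> spec label
    final List<String> replacementSpecs = new ArrayList<>(); // specs queued for re-request
    final List<String> diagnosticsLog = new ArrayList<>();

    void onWorkerTerminated(String resourceId, String diagnostics) {
        String spec = workers.remove(resourceId);
        if (spec == null) {
            return; // unknown worker or duplicate notification: nothing to do
        }
        diagnosticsLog.add(resourceId + ": " + diagnostics);
        replacementSpecs.add(spec);
    }
}
```

Making the callback idempotent, as here, keeps a repeated notification for the same worker from queuing a second replacement.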
-
onError
Description copied from interface: ResourceEventHandler
Notifies that an error has occurred that the process cannot proceed.
- Specified by:
onError in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
- Parameters:
exception - Exception that describes the error.
-
requestNewWorker
Allocates a resource using the worker resource specification.
- Parameters:
workerResourceSpec - Specifies the size of the resource to be allocated.
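Allocation against an external framework is typically asynchronous: the request is issued first and the worker arrives later. The sketch below illustrates that shape with a future per request. WorkerSpecSketch, AllocationSketch, requestWorker and grantNext are hypothetical; unlike requestNewWorker, which the summary lists as returning void, this sketch returns the future so the result can be observed.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a worker resource specification.
class WorkerSpecSketch {
    final int cpuCores;
    final int memoryMb;

    WorkerSpecSketch(int cpuCores, int memoryMb) {
        this.cpuCores = cpuCores;
        this.memoryMb = memoryMb;
    }
}

// Hypothetical allocator sketch; NOT the Flink implementation.
class AllocationSketch {
    private static final class Pending {
        final WorkerSpecSketch spec;
        final CompletableFuture<String> future = new CompletableFuture<>();

        Pending(WorkerSpecSketch spec) {
            this.spec = spec;
        }
    }

    private final Queue<Pending> pending = new ArrayDeque<>();
    private int nextId = 0;

    // Records the request; the future completes once the framework grants it.
    CompletableFuture<String> requestWorker(WorkerSpecSketch spec) {
        Pending p = new Pending(spec);
        pending.add(p);
        return p.future;
    }

    // Simulates the external framework granting the oldest outstanding request.
    boolean grantNext() {
        Pending p = pending.poll();
        if (p == null) {
            return false;
        }
        p.future.complete("worker-" + (nextId++) + "@" + p.spec.cpuCores + "c/" + p.spec.memoryMb + "mb");
        return true;
    }

    int pendingCount() {
        return pending.size();
    }
}
```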
-
getReadyToServeFuture
Description copied from class: ResourceManager
Get the ready to serve future of the resource manager.
- Specified by:
getReadyToServeFuture in class ResourceManager<WorkerType extends ResourceIDRetrievable>
- Returns:
- The ready to serve future of the resource manager, which indicates whether it is ready to serve.
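A "ready to serve" future can be modeled as a future that completes once some start-up condition holds. The sketch below assumes, for illustration only, that readiness means all workers recovered from a previous attempt have registered again or a recovery timeout has fired (the constructor does take a previousWorkerRecoverTimeout, but this page does not describe how it is used). ReadySketch and its methods are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical readiness sketch; the readiness condition is an assumption.
class ReadySketch {
    private final CompletableFuture<Void> readyToServe = new CompletableFuture<>();
    private int awaiting; // recovered workers that have not re-registered yet

    ReadySketch(int recoveredWorkersToAwait) {
        this.awaiting = recoveredWorkersToAwait;
        if (awaiting == 0) {
            readyToServe.complete(null); // nothing to wait for
        }
    }

    void onRecoveredWorkerRegistered() {
        if (awaiting > 0 && --awaiting == 0) {
            readyToServe.complete(null);
        }
    }

    // Gives up waiting; completing an already completed future is a no-op.
    void onRecoverTimeout() {
        readyToServe.complete(null);
    }

    CompletableFuture<Void> getReadyToServeFuture() {
        return readyToServe;
    }
}
```

Callers can chain work with `thenRun` on the returned future rather than polling `isDone()`.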
-
getResourceAllocator
- Specified by:
getResourceAllocator in class ResourceManager<WorkerType extends ResourceIDRetrievable>
-