Class AbstractHaServices
- All Implemented Interfaces:
AutoCloseable, GloballyCleanableResource, ClientHighAvailabilityServices, HighAvailabilityServices
- Direct Known Subclasses:
ZooKeeperLeaderElectionHaServices
Subclasses supply the leader name for each component through getLeaderPathForResourceManager(), getLeaderPathForDispatcher(), getLeaderPathForJobManager(org.apache.flink.api.common.JobID), and getLeaderPathForRestServer(). The returned leader name is the ConfigMap name in Kubernetes and the child path in ZooKeeper.
close() and cleanupAllData() should be implemented to destroy the resources.
The abstract class is also responsible for determining which component services should be
reused. For example, jobResultStore is created once and can be reused many times.
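As a rough illustration of this contract, the following self-contained sketch mimics the pattern with stand-in types. It does not use the real Flink classes: `SketchHaServices`, `PrefixHaServices`, `LeaderNameDemo`, the `String` job identifier, and the naming scheme are all hypothetical. Only the four `getLeaderPathFor...` method names and `getJobResultStore` are taken from this page.

```java
// Stand-in for the abstract base class; NOT the real Flink type.
abstract class SketchHaServices {
    // Placeholder for a shared component such as jobResultStore, created once.
    private final Object jobResultStore = new Object();

    // Every caller receives the same instance.
    public final Object getJobResultStore() {
        return jobResultStore;
    }

    // Backend-specific leader names; a String replaces JobID in this sketch.
    protected abstract String getLeaderPathForResourceManager();
    protected abstract String getLeaderPathForDispatcher();
    protected abstract String getLeaderPathForJobManager(String jobId);
    protected abstract String getLeaderPathForRestServer();
}

// Hypothetical subclass with an invented naming scheme.
final class PrefixHaServices extends SketchHaServices {
    private final String prefix;

    PrefixHaServices(String prefix) {
        this.prefix = prefix;
    }

    @Override protected String getLeaderPathForResourceManager() { return prefix + "resourcemanager"; }
    @Override protected String getLeaderPathForDispatcher() { return prefix + "dispatcher"; }
    @Override protected String getLeaderPathForJobManager(String jobId) { return prefix + "job-" + jobId; }
    @Override protected String getLeaderPathForRestServer() { return prefix + "restserver"; }
}

final class LeaderNameDemo {
    public static void main(String[] args) {
        PrefixHaServices services = new PrefixHaServices("/leader/");
        System.out.println(services.getLeaderPathForJobManager("42"));
        // The shared component is handed out, not re-created.
        System.out.println(services.getJobResultStore() == services.getJobResultStore());
    }
}
```

The base class keeps ownership of the shared component, while the subclass only decides how leader names are spelled for its backend.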
-
Field Summary
Fields
Modifier and Type | Field | Description
protected final org.apache.flink.configuration.Configuration | configuration | The runtime configuration.
protected final Executor | ioExecutor | The executor to run external IO operations on.
protected final org.slf4j.Logger | logger |
Fields inherited from interface org.apache.flink.runtime.highavailability.HighAvailabilityServices
DEFAULT_JOB_ID, DEFAULT_LEADER_ID
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | AbstractHaServices(org.apache.flink.configuration.Configuration config, LeaderElectionDriverFactory driverFactory, Executor ioExecutor, BlobStoreService blobStoreService, JobResultStore jobResultStore) |
-
Method Summary
Modifier and Type | Method | Description
void | cleanupAllData() | Deletes all data stored by high availability services in external stores.
void | close() | Closes the high availability services, releasing all resources.
 | createBlobStore() | Creates the BLOB store in which BLOBs are stored in a highly-available fashion.
protected abstract CheckpointRecoveryFactory | createCheckpointRecoveryFactory() | Create the checkpoint recovery factory for the job manager.
protected abstract ExecutionPlanStore | createExecutionPlanStore() | Create the submitted execution plan store for the job manager.
protected abstract LeaderRetrievalService | createLeaderRetrievalService(String leaderName) | Create leader retrieval service with specified leaderName.
 | getCheckpointRecoveryFactory() | Gets the checkpoint recovery factory for the job manager.
 | getClusterRestEndpointLeaderElection() | Gets the LeaderElection for the cluster's rest endpoint.
 | getClusterRestEndpointLeaderRetriever() | Get the leader retriever for the cluster's rest endpoint.
 | getDispatcherLeaderElection() | Gets the LeaderElection for the cluster's dispatcher.
 | getDispatcherLeaderRetriever() | Gets the leader retriever for the dispatcher.
 | getExecutionPlanStore() | Gets the submitted execution plan store for the job manager.
 | getJobManagerLeaderElection(org.apache.flink.api.common.JobID jobID) | Gets the LeaderElection for the job with the given JobID.
 | getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID) | Gets the leader retriever for the JobMaster that is responsible for the given job.
LeaderRetrievalService | getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID, String defaultJobManagerAddress) | Gets the leader retriever for the JobMaster that is responsible for the given job.
 | getJobResultStore() | Gets the store that holds information about the state of finished jobs.
protected abstract String | getLeaderPathForDispatcher() | Get the leader path for Dispatcher.
protected abstract String | getLeaderPathForJobManager(org.apache.flink.api.common.JobID jobID) | Get the leader path for specific JobManager.
protected abstract String | getLeaderPathForResourceManager() | Get the leader path for ResourceManager.
protected abstract String | getLeaderPathForRestServer() | Get the leader path for RestServer.
 | getResourceManagerLeaderElection() | Gets the LeaderElection for the cluster's resource manager.
 | getResourceManagerLeaderRetriever() | Gets the leader retriever for the cluster's resource manager.
CompletableFuture<Void> | globalCleanupAsync(org.apache.flink.api.common.JobID jobID, Executor executor) | globalCleanupAsync is expected to be called from the main thread.
protected abstract void | internalCleanup() | Clean up the meta data in the distributed system (e.g. ZooKeeper, Kubernetes ConfigMap).
protected abstract void | internalCleanupJobData(org.apache.flink.api.common.JobID jobID) | Clean up the meta data in the distributed system (e.g. ZooKeeper, Kubernetes ConfigMap) for the specified job.
protected abstract void | internalClose() | Closes the components which are used for external operations (e.g. ZooKeeper client, Kubernetes client).
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.flink.runtime.highavailability.HighAvailabilityServices
closeWithOptionalClean, getWebMonitorLeaderElection, getWebMonitorLeaderRetriever
-
Field Details
-
logger
protected final org.slf4j.Logger logger
-
ioExecutor
protected final Executor ioExecutor
The executor to run external IO operations on.
-
configuration
protected final org.apache.flink.configuration.Configuration configuration
The runtime configuration.
-
-
Constructor Details
-
AbstractHaServices
protected AbstractHaServices(org.apache.flink.configuration.Configuration config, LeaderElectionDriverFactory driverFactory, Executor ioExecutor, BlobStoreService blobStoreService, JobResultStore jobResultStore)
-
-
Method Details
-
getResourceManagerLeaderRetriever
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the cluster's resource manager.
- Specified by:
getResourceManagerLeaderRetriever in interface HighAvailabilityServices
-
getDispatcherLeaderRetriever
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the dispatcher. This leader retrieval service is not always accessible.
- Specified by:
getDispatcherLeaderRetriever in interface HighAvailabilityServices
-
getJobManagerLeaderRetriever
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the JobMaster that is responsible for the given job.
- Specified by:
getJobManagerLeaderRetriever in interface HighAvailabilityServices
- Parameters:
jobID - The identifier of the job.
- Returns:
Leader retrieval service to retrieve the job manager for the given job
-
getJobManagerLeaderRetriever
public LeaderRetrievalService getJobManagerLeaderRetriever(org.apache.flink.api.common.JobID jobID, String defaultJobManagerAddress)
Description copied from interface: HighAvailabilityServices
Gets the leader retriever for the JobMaster that is responsible for the given job.
- Specified by:
getJobManagerLeaderRetriever in interface HighAvailabilityServices
- Parameters:
jobID - The identifier of the job.
defaultJobManagerAddress - JobManager address which will be returned by a static leader retrieval service.
- Returns:
Leader retrieval service to retrieve the job manager for the given job
-
getClusterRestEndpointLeaderRetriever
Description copied from interface: ClientHighAvailabilityServices
Get the leader retriever for the cluster's rest endpoint.
- Specified by:
getClusterRestEndpointLeaderRetriever in interface ClientHighAvailabilityServices
- Specified by:
getClusterRestEndpointLeaderRetriever in interface HighAvailabilityServices
- Returns:
the leader retriever for cluster's rest endpoint.
-
getResourceManagerLeaderElection
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the cluster's resource manager.
- Specified by:
getResourceManagerLeaderElection in interface HighAvailabilityServices
-
getDispatcherLeaderElection
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the cluster's dispatcher.
- Specified by:
getDispatcherLeaderElection in interface HighAvailabilityServices
-
getJobManagerLeaderElection
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the job with the given JobID.
- Specified by:
getJobManagerLeaderElection in interface HighAvailabilityServices
-
getClusterRestEndpointLeaderElection
Description copied from interface: HighAvailabilityServices
Gets the LeaderElection for the cluster's rest endpoint.
- Specified by:
getClusterRestEndpointLeaderElection in interface HighAvailabilityServices
-
getCheckpointRecoveryFactory
Description copied from interface: HighAvailabilityServices
Gets the checkpoint recovery factory for the job manager.
- Specified by:
getCheckpointRecoveryFactory in interface HighAvailabilityServices
- Returns:
Checkpoint recovery factory
- Throws:
Exception
-
getExecutionPlanStore
Description copied from interface: HighAvailabilityServices
Gets the submitted execution plan store for the job manager.
- Specified by:
getExecutionPlanStore in interface HighAvailabilityServices
- Returns:
Submitted execution plan store
- Throws:
Exception - if the submitted execution plan store could not be created
-
getJobResultStore
Description copied from interface: HighAvailabilityServices
Gets the store that holds information about the state of finished jobs.
- Specified by:
getJobResultStore in interface HighAvailabilityServices
- Returns:
Store of finished job results
- Throws:
Exception - if job result store could not be created
-
createBlobStore
Description copied from interface: HighAvailabilityServices
Creates the BLOB store in which BLOBs are stored in a highly-available fashion.
- Specified by:
createBlobStore in interface HighAvailabilityServices
- Returns:
Blob store
-
close
Description copied from interface: HighAvailabilityServices
Closes the high availability services, releasing all resources.
This method does not delete or clean up any data stored in external stores (file systems, ZooKeeper, etc). Another instance of the high availability services will be able to recover the job.
If an exception occurs during closing services, this method will attempt to continue closing other services and report exceptions only after all services have been attempted to be closed.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface HighAvailabilityServices
- Throws:
Exception - Thrown, if an exception occurred while closing these services.
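The "keep closing, report afterwards" behaviour described for close() can be sketched with plain JDK types. This is only one possible way to collect failures and is not taken from the Flink source; `CloseAllSketch` and `closeAll` are hypothetical names. The sketch remembers the first exception and attaches later ones via `Throwable.addSuppressed`.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseAllSketch {
    // Closes every resource in order. A failure does not stop the loop:
    // the first exception is kept and later ones are attached as suppressed.
    static void closeAll(List<? extends AutoCloseable> resources) throws Exception {
        Exception first = null;
        for (AutoCloseable resource : resources) {
            try {
                resource.close();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        // Report only after every resource has been attempted.
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        List<AutoCloseable> resources = new ArrayList<>();
        resources.add(() -> { throw new IllegalStateException("first service failed"); });
        resources.add(() -> System.out.println("second service closed"));
        try {
            closeAll(resources);
        } catch (Exception e) {
            System.out.println("reported after all attempts: " + e.getMessage());
        }
    }
}
```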
-
cleanupAllData
Description copied from interface: HighAvailabilityServices
Deletes all data stored by high availability services in external stores.
After this method was called, any job or session that was managed by these high availability services will be unrecoverable.
If an exception occurs during cleanup, this method will attempt to continue the cleanup and report exceptions only after all cleanup steps have been attempted.
- Specified by:
cleanupAllData in interface HighAvailabilityServices
- Throws:
Exception - if an error occurred while cleaning up data stored by them.
-
globalCleanupAsync
public CompletableFuture<Void> globalCleanupAsync(org.apache.flink.api.common.JobID jobID, Executor executor)
Description copied from interface: GloballyCleanableResource
globalCleanupAsync is expected to be called from the main thread. Heavy IO tasks should be outsourced into the passed cleanupExecutor. Thread-safety must be ensured.
- Specified by:
globalCleanupAsync in interface GloballyCleanableResource
- Specified by:
globalCleanupAsync in interface HighAvailabilityServices
- Parameters:
jobID - The JobID of the job for which the local data should be cleaned up.
executor - The fallback executor for IO-heavy operations.
- Returns:
The cleanup result future.
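A minimal sketch of the calling convention, assuming the IO-heavy work is simply handed to the supplied executor with `CompletableFuture.runAsync`. The class `AsyncCleanupSketch`, its in-memory set, and the `String` job identifier are hypothetical stand-ins; the real method takes a `JobID` and operates on external storage.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

final class AsyncCleanupSketch {
    // Hypothetical stand-in for per-job entries held in an external store.
    private final Set<String> jobEntries = ConcurrentHashMap.newKeySet();

    void register(String jobId) {
        jobEntries.add(jobId);
    }

    boolean contains(String jobId) {
        return jobEntries.contains(jobId);
    }

    // The caller gets a future back; the removal itself runs on the
    // executor that was passed in, not inline in this method.
    CompletableFuture<Void> globalCleanupAsync(String jobId, Executor executor) {
        return CompletableFuture.runAsync(() -> jobEntries.remove(jobId), executor);
    }

    public static void main(String[] args) {
        AsyncCleanupSketch services = new AsyncCleanupSketch();
        services.register("job-1");
        // Runnable::run executes the task on the calling thread, which keeps the demo deterministic.
        services.globalCleanupAsync("job-1", Runnable::run).join();
        System.out.println("still present: " + services.contains("job-1"));
    }
}
```

A concurrent set is used so that cleanup calls for different jobs can overlap without extra locking.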
-
createLeaderRetrievalService
Create leader retrieval service with specified leaderName.
- Parameters:
leaderName - ConfigMap name in Kubernetes or child node path in ZooKeeper.
- Returns:
LeaderRetrievalService using ZooKeeper or Kubernetes.
-
createCheckpointRecoveryFactory
Create the checkpoint recovery factory for the job manager.
- Returns:
Checkpoint recovery factory
- Throws:
Exception
-
createExecutionPlanStore
Create the submitted execution plan store for the job manager.
- Returns:
Submitted execution plan store
- Throws:
Exception - if the submitted execution plan store could not be created
-
internalClose
Closes the components which are used for external operations (e.g. ZooKeeper client, Kubernetes client).
- Throws:
Exception - if the close operation failed
-
internalCleanup
Clean up the meta data in the distributed system (e.g. ZooKeeper, Kubernetes ConfigMap).
If an exception occurs during internal cleanup, we will continue the cleanup in cleanupAllData() and report exceptions only after all cleanup steps have been attempted.
- Throws:
Exception - if the cleanup operation on external storage fails.
-
internalCleanupJobData
protected abstract void internalCleanupJobData(org.apache.flink.api.common.JobID jobID) throws Exception
Clean up the meta data in the distributed system (e.g. ZooKeeper, Kubernetes ConfigMap) for the specified job. Method implementations need to be thread-safe.
- Parameters:
jobID - The identifier of the job to clean up.
- Throws:
Exception - if the cleanup operation on external storage fails.
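To illustrate the thread-safety requirement, here is a small sketch in which several threads may request cleanup of the same job at once. It replaces the external store with an in-memory `ConcurrentHashMap`; `JobDataCleanupSketch`, the `String` job identifier, and the deletion counter are hypothetical and exist only for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

final class JobDataCleanupSketch {
    // Hypothetical in-memory stand-in for job metadata kept in ZooKeeper or a ConfigMap.
    private final ConcurrentMap<String, String> jobMetadata = new ConcurrentHashMap<>();
    // Counts how many calls actually deleted an entry.
    private final AtomicInteger deletions = new AtomicInteger();

    void put(String jobId, String metadata) {
        jobMetadata.put(jobId, metadata);
    }

    // Safe to call concurrently: remove() is atomic, so at most one caller
    // observes the old value and counts the deletion.
    void internalCleanupJobData(String jobId) {
        if (jobMetadata.remove(jobId) != null) {
            deletions.incrementAndGet();
        }
    }

    int deletions() {
        return deletions.get();
    }

    public static void main(String[] args) throws InterruptedException {
        JobDataCleanupSketch store = new JobDataCleanupSketch();
        store.put("job-1", "metadata");
        Thread a = new Thread(() -> store.internalCleanupJobData("job-1"));
        Thread b = new Thread(() -> store.internalCleanupJobData("job-1"));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println("deletions: " + store.deletions());
    }
}
```

Repeated or overlapping calls for the same job are harmless here, which is the property a real implementation would also need to preserve against its backend.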
-
getLeaderPathForResourceManager
Get the leader path for ResourceManager.
- Returns:
The ResourceManager leader name: the ConfigMap name in Kubernetes or the child node path in ZooKeeper.
-
getLeaderPathForDispatcher
Get the leader path for Dispatcher.
- Returns:
The Dispatcher leader name: the ConfigMap name in Kubernetes or the child node path in ZooKeeper.
-
getLeaderPathForJobManager
Get the leader path for a specific JobManager.
- Parameters:
jobID - The identifier of the job.
- Returns:
The JobManager leader name for the specified job ID: the ConfigMap name in Kubernetes or the child node path in ZooKeeper.
-
getLeaderPathForRestServer
Get the leader path for RestServer.
- Returns:
The RestServer leader name: the ConfigMap name in Kubernetes or the child node path in ZooKeeper.
-