Interface JobMasterGateway

All Superinterfaces:
BlocklistListener, CheckpointCoordinatorGateway, org.apache.flink.runtime.rpc.FencedRpcGateway<JobMasterId>, JobMasterOperatorEventGateway, KvStateLocationOracle, KvStateRegistryGateway, org.apache.flink.runtime.rpc.RpcGateway
All Known Implementing Classes:
JobMaster

public interface JobMasterGateway extends CheckpointCoordinatorGateway, org.apache.flink.runtime.rpc.FencedRpcGateway<JobMasterId>, KvStateLocationOracle, KvStateRegistryGateway, JobMasterOperatorEventGateway, BlocklistListener
JobMaster RPC gateway interface.
  • Method Details

    • cancel

      CompletableFuture<Acknowledge> cancel(Duration timeout)
      Cancels the currently executed job.
      Parameters:
      timeout - of this operation
      Returns:
      Future acknowledge of the operation
    • updateTaskExecutionState

      CompletableFuture<Acknowledge> updateTaskExecutionState(TaskExecutionState taskExecutionState)
      Updates the task execution state for a given task.
      Parameters:
      taskExecutionState - New task execution state for a given task
      Returns:
      Future flag of the task execution state update result
    • requestNextInputSplit

      CompletableFuture<SerializedInputSplit> requestNextInputSplit(JobVertexID vertexID, ExecutionAttemptID executionAttempt)
      Requests the next input split for the ExecutionJobVertex. The next input split is sent back to the sender as a SerializedInputSplit message.
      Parameters:
      vertexID - The job vertex id
      executionAttempt - The execution attempt id
      Returns:
      The future of the input split. If there is no further input split, the future is completed with an empty object.
    • requestPartitionState

      CompletableFuture<ExecutionState> requestPartitionState(IntermediateDataSetID intermediateResultId, ResultPartitionID partitionId)
      Requests the current state of the partition. The state of a partition is currently bound to the state of the producing execution.
      Parameters:
      intermediateResultId - The ID of the intermediate result (data set) that the requested partition belongs to.
      partitionId - The partition ID of the partition to request the state of.
      Returns:
      The future of the partition state
    • disconnectTaskManager

      CompletableFuture<Acknowledge> disconnectTaskManager(ResourceID resourceID, Exception cause)
      Disconnects the given TaskExecutor from the JobMaster.
      Parameters:
      resourceID - identifying the TaskManager to disconnect
      cause - for the disconnection of the TaskManager
      Returns:
      Future acknowledge once the JobMaster has been disconnected from the TaskManager
    • disconnectResourceManager

      void disconnectResourceManager(ResourceManagerId resourceManagerId, Exception cause)
      Disconnects the resource manager from the job manager because of the given cause.
      Parameters:
      resourceManagerId - identifying the resource manager leader id
      cause - of the disconnect
    • offerSlots

      CompletableFuture<Collection<SlotOffer>> offerSlots(ResourceID taskManagerId, Collection<SlotOffer> slots, Duration timeout)
      Offers the given slots to the job manager. The response contains the set of accepted slots.
      Parameters:
      taskManagerId - identifying the task manager
      slots - to offer to the job manager
      timeout - for the rpc call
      Returns:
      Future set of accepted slots.
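      Because the response contains only the accepted slots, a caller can derive the rejected offers as a set difference. The sketch below illustrates that idea in isolation. Plain strings are hypothetical stand-ins for SlotOffer instances, and the helper rejectedSlots is an assumption made for illustration, not part of Flink.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SlotOfferSketch {

    // Hypothetical helper: returns the offers that are absent from the accepted set.
    // Strings stand in for SlotOffer instances, purely for illustration.
    static Set<String> rejectedSlots(Collection<String> offered, Collection<String> accepted) {
        Set<String> rejected = new LinkedHashSet<>(offered);
        rejected.removeAll(accepted);
        return rejected;
    }

    public static void main(String[] args) {
        List<String> offered = List.of("slot-1", "slot-2", "slot-3");
        List<String> accepted = List.of("slot-1", "slot-3");
        // Prints the offers that were not accepted.
        System.out.println(rejectedSlots(offered, accepted));
    }
}
```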
    • failSlot

      void failSlot(ResourceID taskManagerId, AllocationID allocationId, Exception cause)
      Fails the slot with the given allocation id and cause.
      Parameters:
      taskManagerId - identifying the task manager
      allocationId - identifying the slot to fail
      cause - of the failure
    • registerTaskManager

      CompletableFuture<RegistrationResponse> registerTaskManager(org.apache.flink.api.common.JobID jobId, TaskManagerRegistrationInformation taskManagerRegistrationInformation, Duration timeout)
      Registers the task manager at the job manager.
      Parameters:
      jobId - jobId specifying the job for which the JobMaster should be responsible
      taskManagerRegistrationInformation - the information for registering a task manager at the job manager
      timeout - for the rpc call
      Returns:
      Future registration response indicating whether the registration was successful or not
    • heartbeatFromTaskManager

      CompletableFuture<Void> heartbeatFromTaskManager(ResourceID resourceID, TaskExecutorToJobManagerHeartbeatPayload payload)
      Sends a heartbeat from the task manager to the job manager.
      Parameters:
      resourceID - unique id of the task manager
      payload - report payload
      Returns:
      future which is completed exceptionally if the operation fails
    • heartbeatFromResourceManager

      CompletableFuture<Void> heartbeatFromResourceManager(ResourceID resourceID)
      Sends a heartbeat request from the resource manager.
      Parameters:
      resourceID - unique id of the resource manager
      Returns:
      future which is completed exceptionally if the operation fails
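      The heartbeat methods signal failure through exceptional completion of the returned future rather than through a return value. A minimal sketch of how a caller might observe this with the standard CompletableFuture API follows. The futures are created locally instead of coming from a real gateway, and describeOutcome is a hypothetical helper.

```java
import java.util.concurrent.CompletableFuture;

final class HeartbeatOutcomeSketch {

    // Hypothetical helper: maps a heartbeat future to a description of its outcome.
    // handle() runs for both normal and exceptional completion.
    static CompletableFuture<String> describeOutcome(CompletableFuture<Void> heartbeat) {
        return heartbeat.handle((ignored, failure) ->
                failure == null ? "delivered" : "failed: " + failure.getMessage());
    }

    public static void main(String[] args) {
        // Stand-ins for the futures a heartbeat call would return.
        CompletableFuture<Void> ok = CompletableFuture.completedFuture(null);
        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("unknown task manager"));

        System.out.println(describeOutcome(ok).join());
        System.out.println(describeOutcome(failed).join());
    }
}
```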
    • requestJobStatus

      CompletableFuture<org.apache.flink.api.common.JobStatus> requestJobStatus(Duration timeout)
      Requests the current job status.
      Parameters:
      timeout - for the rpc call
      Returns:
      Future containing the current job status
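      Many methods on this interface take a timeout "for the rpc call". How the RPC layer enforces it is not described here. The sketch below only illustrates the general idea of bounding a future's completion time, using CompletableFuture.orTimeout (Java 9 or later). A locally created future stands in for the RPC response, and withTimeout and timedOut are hypothetical helpers.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class RpcTimeoutSketch {

    // Hypothetical helper: completes the future exceptionally with a TimeoutException
    // if no result has arrived within the given duration. orTimeout returns the same
    // future instance it was called on.
    static <T> CompletableFuture<T> withTimeout(CompletableFuture<T> response, Duration timeout) {
        return response.orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS);
    }

    // Returns true if waiting on the future ended because of a timeout.
    static boolean timedOut(CompletableFuture<?> future) {
        try {
            future.join();
            return false;
        } catch (CompletionException e) {
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        // A future that is never completed, standing in for an unanswered request.
        CompletableFuture<String> neverAnswered = new CompletableFuture<>();
        System.out.println(timedOut(withTimeout(neverAnswered, Duration.ofMillis(50))));
    }
}
```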
    • requestJob

      CompletableFuture<ExecutionGraphInfo> requestJob(Duration timeout)
      Requests the ExecutionGraphInfo of the executed job.
      Parameters:
      timeout - for the rpc call
      Returns:
      Future which is completed with the ExecutionGraphInfo of the executed job
    • requestCheckpointStats

      CompletableFuture<CheckpointStatsSnapshot> requestCheckpointStats(Duration timeout)
      Requests the CheckpointStatsSnapshot of the job.
      Parameters:
      timeout - for the rpc call
      Returns:
      Future which is completed with the CheckpointStatsSnapshot of the job
    • triggerSavepoint

      CompletableFuture<String> triggerSavepoint(@Nullable String targetDirectory, boolean cancelJob, org.apache.flink.core.execution.SavepointFormatType formatType, Duration timeout)
      Triggers taking a savepoint of the executed job.
      Parameters:
      targetDirectory - to which to write the savepoint data or null if the default savepoint directory should be used
      cancelJob - flag indicating whether the job should be canceled after the savepoint has been taken
      formatType - binary format for the savepoint
      timeout - for the rpc call
      Returns:
      Future which is completed with the savepoint path once completed
    • triggerCheckpoint

      CompletableFuture<CompletedCheckpoint> triggerCheckpoint(org.apache.flink.core.execution.CheckpointType checkpointType, Duration timeout)
      Triggers taking a checkpoint of the executed job.
      Parameters:
      checkpointType - to determine how checkpoint should be taken
      timeout - for the rpc call
      Returns:
      Future which is completed with the CompletedCheckpoint once completed
    • triggerCheckpoint

      default CompletableFuture<String> triggerCheckpoint(Duration timeout)
      Triggers taking a checkpoint of the executed job.
      Parameters:
      timeout - for the rpc call
      Returns:
      Future which is completed with the checkpoint path once completed
    • stopWithSavepoint

      CompletableFuture<String> stopWithSavepoint(@Nullable String targetDirectory, org.apache.flink.core.execution.SavepointFormatType formatType, boolean terminate, Duration timeout)
      Stops the job with a savepoint.
      Parameters:
      targetDirectory - to which to write the savepoint data or null if the default savepoint directory should be used
      formatType - binary format for the savepoint
      terminate - flag indicating if the job should terminate or just suspend
      timeout - for the rpc call
      Returns:
      Future which is completed with the savepoint path once completed
    • notifyNotEnoughResourcesAvailable

      void notifyNotEnoughResourcesAvailable(Collection<ResourceRequirement> acquiredResources)
      Notifies that not enough resources are available to fulfill the resource requirements of a job.
      Parameters:
      acquiredResources - the resources that have been acquired for the job
    • updateGlobalAggregate

      CompletableFuture<Object> updateGlobalAggregate(String aggregateName, Object aggregand, byte[] serializedAggregationFunction)
      Updates the aggregate and returns the new value.
      Parameters:
      aggregateName - The name of the aggregate to update
      aggregand - The value to add to the aggregate
      serializedAggregationFunction - The function to apply to the current aggregate and aggregand to obtain the new aggregate value. This should be of type AggregateFunction.
      Returns:
      The updated aggregate
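      The aggregation function travels as a byte array, so the receiving side has to restore it before applying it to the current aggregate and the aggregand. The following sketch shows such a round trip with plain Java serialization. It is an illustration under stated assumptions: SimpleAggregate is a hypothetical, simplified stand-in and not Flink's AggregateFunction, and the serialization utilities Flink uses internally are not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class GlobalAggregateSketch {

    // Hypothetical, simplified stand-in for an aggregation function.
    interface SimpleAggregate extends Serializable {
        Object apply(Object currentAggregate, Object aggregand);
    }

    // Example function: sums integers, treating a missing aggregate as zero.
    static final class IntSum implements SimpleAggregate {
        private static final long serialVersionUID = 1L;

        @Override
        public Object apply(Object currentAggregate, Object aggregand) {
            int current = currentAggregate == null ? 0 : (Integer) currentAggregate;
            return current + (Integer) aggregand;
        }
    }

    // Sender side: turn the function into bytes.
    static byte[] serialize(SimpleAggregate function) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(function);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Receiver side: restore the function, then fold the aggregand into the aggregate.
    static Object update(Object currentAggregate, Object aggregand, byte[] serializedFunction) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(serializedFunction))) {
            SimpleAggregate function = (SimpleAggregate) in.readObject();
            return function.apply(currentAggregate, aggregand);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] function = serialize(new IntSum());
        Object afterFirst = update(null, 5, function);
        Object afterSecond = update(afterFirst, 7, function);
        System.out.println(afterSecond);
    }
}
```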
    • deliverCoordinationRequestToCoordinator

      CompletableFuture<CoordinationResponse> deliverCoordinationRequestToCoordinator(OperatorID operatorId, org.apache.flink.util.SerializedValue<CoordinationRequest> serializedRequest, Duration timeout)
      Delivers a coordination request to the specified coordinator and returns the response.
      Parameters:
      operatorId - identifying the coordinator to receive the request
      serializedRequest - serialized request to deliver
      timeout - for the rpc call
      Returns:
      A future containing the response. The future will fail with a FlinkException if the task is not running, if no operator/coordinator exists for the given ID, or if the coordinator cannot handle client events.
    • stopTrackingAndReleasePartitions

      CompletableFuture<?> stopTrackingAndReleasePartitions(Collection<ResultPartitionID> partitionIds)
      Notifies the JobMasterPartitionTracker to stop tracking the target result partitions and release the locally occupied resources on TaskExecutors if any.
    • getPartitionWithMetrics

      default CompletableFuture<Collection<PartitionWithMetrics>> getPartitionWithMetrics(Duration timeout, Set<ResultPartitionID> expectedPartitions)
      Gets the partitions identified by expectedPartitions together with their metrics. The metrics include the sizes of the sub-partitions in a result partition.
      Parameters:
      timeout - The timeout used for retrieving the specified partitions.
      expectedPartitions - The set of identifiers for the result partitions whose metrics are to be fetched.
      Returns:
      A future that will contain the partitions, with their metrics, that could be retrieved from the expected partitions within the specified timeout.
    • startFetchAndRetainPartitionWithMetricsOnTaskManager

      default void startFetchAndRetainPartitionWithMetricsOnTaskManager()
      Notifies the JobMaster to fetch and retain partitions on task managers. This applies both to already registered TaskManagers and to TaskManagers that register in the future.
    • requestJobResourceRequirements

      CompletableFuture<JobResourceRequirements> requestJobResourceRequirements()
      Requests the current resource requirements of the job.
      Returns:
      Future that contains the current resource requirements.
    • updateJobResourceRequirements

      CompletableFuture<Acknowledge> updateJobResourceRequirements(JobResourceRequirements jobResourceRequirements)
      Updates the resource requirements of the job.
      Parameters:
      jobResourceRequirements - new resource requirements
      Returns:
      Future which is completed successfully when requirements are updated
    • notifyEndOfData

      void notifyEndOfData(ExecutionAttemptID executionAttempt)
      Notifies that the task has reached the end of data.
      Parameters:
      executionAttempt - The execution attempt id.