Class RestartPipelinedRegionFailoverStrategy

java.lang.Object
org.apache.flink.runtime.executiongraph.failover.RestartPipelinedRegionFailoverStrategy
All Implemented Interfaces:
FailoverStrategy

public class RestartPipelinedRegionFailoverStrategy extends Object implements FailoverStrategy
A failover strategy that proposes restarting the involved regions when a vertex fails. This strategy defines a region as a set of tasks that communicate via pipelined data exchange.
  • Constructor Details

    • RestartPipelinedRegionFailoverStrategy

      @VisibleForTesting public RestartPipelinedRegionFailoverStrategy(SchedulingTopology topology)
      Creates a new failover strategy that restarts pipelined regions and operates on the given topology. Result partitions are always considered available unless a data consumption error occurs.
      Parameters:
      topology - the topology containing information about all the vertices and result partitions
    • RestartPipelinedRegionFailoverStrategy

      public RestartPipelinedRegionFailoverStrategy(SchedulingTopology topology, ResultPartitionAvailabilityChecker resultPartitionAvailabilityChecker)
      Creates a new failover strategy that restarts pipelined regions and operates on the given topology.
      Parameters:
      topology - the topology containing information about all the vertices and result partitions
      resultPartitionAvailabilityChecker - used to query whether a result partition is available
  • Method Details

    • getTasksNeedingRestart

      public Set<ExecutionVertexID> getTasksNeedingRestart(ExecutionVertexID executionVertexId, Throwable cause)
      Returns the set of IDs of the vertices that should be restarted. In this strategy, all task vertices in 'involved' regions are proposed to be restarted. The 'involved' regions are determined by the following rules:
        1. The region containing the failed task is always involved.
        2. If an input result partition of an involved region is not available, i.e. missing or corrupted, the region containing the partition's producer task is involved.
        3. If a region is involved, all of its consumer regions are involved.
      Specified by:
      getTasksNeedingRestart in interface FailoverStrategy
      Parameters:
      executionVertexId - ID of the failed task
      cause - cause of the failure
      Returns:
      set of IDs of vertices to restart
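      The three rules above amount to a transitive closure over regions. The following is a minimal, standalone sketch of that closure, not Flink's implementation. It models regions and partitions as plain strings, and the class name InvolvedRegionsSketch, the method involvedRegions, and the map-based topology are all hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical, simplified model of the "involved regions" rules; not Flink code. */
final class InvolvedRegionsSketch {

    /**
     * @param failedRegion region containing the failed task
     * @param inputsByRegion region -> (input partition ID -> producer region); assumed shape
     * @param consumersByRegion region -> regions consuming its output; assumed shape
     * @param isPartitionAvailable stand-in for a partition availability check
     */
    static Set<String> involvedRegions(
            String failedRegion,
            Map<String, Map<String, String>> inputsByRegion,
            Map<String, List<String>> consumersByRegion,
            Predicate<String> isPartitionAvailable) {

        Set<String> involved = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>();

        // Rule 1: the region containing the failed task is always involved.
        involved.add(failedRegion);
        toVisit.add(failedRegion);

        while (!toVisit.isEmpty()) {
            String region = toVisit.poll();

            // Rule 2: an unavailable input partition pulls in its producer region.
            Map<String, String> inputs = inputsByRegion.getOrDefault(region, Map.of());
            for (Map.Entry<String, String> input : inputs.entrySet()) {
                String partitionId = input.getKey();
                String producerRegion = input.getValue();
                if (!isPartitionAvailable.test(partitionId) && involved.add(producerRegion)) {
                    toVisit.add(producerRegion);
                }
            }

            // Rule 3: every consumer region of an involved region is involved.
            for (String consumer : consumersByRegion.getOrDefault(region, List.of())) {
                if (involved.add(consumer)) {
                    toVisit.add(consumer);
                }
            }
        }
        return involved;
    }
}
```

      In this sketch, for a chain of regions A, B, C connected by partitions p1 (A to B) and p2 (B to C), a failure in B involves B and C when p1 is available, and additionally involves A when p1 is reported unavailable. The real strategy returns the execution vertex IDs of all tasks in the involved regions rather than region names.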
    • getFailoverRegion

      @VisibleForTesting public SchedulingPipelinedRegion getFailoverRegion(ExecutionVertexID vertexID)
      Returns the failover region that contains the given execution vertex.
      Returns:
      the failover region that contains the given execution vertex