Class JobGraph

java.lang.Object
org.apache.flink.runtime.jobgraph.JobGraph
All Implemented Interfaces:
Serializable, ExecutionPlan

public class JobGraph extends Object implements ExecutionPlan
The JobGraph represents a Flink dataflow program at the low level that the JobManager accepts. All programs from higher-level APIs are transformed into JobGraphs.

The JobGraph is a graph of vertices and intermediate results that are connected together to form a DAG. Note that iterations (feedback edges) are currently not encoded inside the JobGraph but inside certain special vertices that establish the feedback channel amongst themselves.

The JobGraph defines the job-wide configuration settings, while each vertex and each intermediate result defines the characteristics of the concrete operation and intermediate data.

  • Constructor Details

    • JobGraph

      public JobGraph(String jobName)
      Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID. The ExecutionConfig will be serialized and can't be modified afterwards.
      Parameters:
      jobName - The name of the job.
    • JobGraph

      public JobGraph(@Nullable org.apache.flink.api.common.JobID jobId, String jobName)
      Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig). The ExecutionConfig will be serialized and can't be modified afterwards.
      Parameters:
      jobId - The id of the job. A random ID is generated, if null is passed.
      jobName - The name of the job.
    • JobGraph

      public JobGraph(@Nullable org.apache.flink.api.common.JobID jobId, String jobName, JobVertex... vertices)
      Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices. The ExecutionConfig will be serialized and can't be modified afterwards.
      Parameters:
      jobId - The id of the job. A random ID is generated, if null is passed.
      jobName - The name of the job.
      vertices - The vertices to add to the graph.
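The null handling of the jobId parameter can be pictured with a small stand-alone sketch. It is not Flink code: java.util.UUID stands in for JobID, and the class SketchJob is hypothetical and exists only for this illustration.

```java
import java.util.UUID;

/** Hypothetical stand-in for a job graph; UUID plays the role of JobID. */
final class SketchJob {
    private final UUID jobId;
    private final String jobName;

    /** Mirrors the name-only constructor: the ID is always random. */
    SketchJob(String jobName) {
        this(null, jobName);
    }

    /** Mirrors the (jobId, jobName) constructor: null means "generate a random ID". */
    SketchJob(UUID jobId, String jobName) {
        this.jobId = (jobId == null) ? UUID.randomUUID() : jobId;
        this.jobName = jobName;
    }

    UUID getJobId() {
        return jobId;
    }

    String getName() {
        return jobName;
    }
}
```

Passing an explicit ID is useful when the same job must be addressable under a known ID; passing null leaves the choice to the constructor.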
  • Method Details

    • getJobID

      public org.apache.flink.api.common.JobID getJobID()
      Returns the ID of the job.
      Specified by:
      getJobID in interface ExecutionPlan
      Returns:
      the ID of the job
    • setJobID

      public void setJobID(org.apache.flink.api.common.JobID jobID)
      Sets the ID of the job.
    • getName

      public String getName()
      Returns the name assigned to the job graph.
      Specified by:
      getName in interface ExecutionPlan
      Returns:
      the name assigned to the job graph
    • isPartialResourceConfigured

      public boolean isPartialResourceConfigured()
      Description copied from interface: ExecutionPlan
      Checks if partial resource configuration is specified.
      Specified by:
      isPartialResourceConfigured in interface ExecutionPlan
      Returns:
      true if partial resource configuration is set; false otherwise
    • isEmpty

      public boolean isEmpty()
      Description copied from interface: ExecutionPlan
      Checks if the execution plan is empty.
      Specified by:
      isEmpty in interface ExecutionPlan
      Returns:
      true if the plan is empty; false otherwise
    • setJobConfiguration

      public void setJobConfiguration(org.apache.flink.configuration.Configuration jobConfiguration)
    • getJobConfiguration

      public org.apache.flink.configuration.Configuration getJobConfiguration()
      Returns the configuration object for this job. Job-wide parameters should be set into that configuration object.
      Specified by:
      getJobConfiguration in interface ExecutionPlan
      Returns:
      The configuration object for this job.
    • getSerializedExecutionConfig

      public org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig()
      Returns the ExecutionConfig.
      Specified by:
      getSerializedExecutionConfig in interface ExecutionPlan
      Returns:
      ExecutionConfig
    • setJobType

      public void setJobType(JobType type)
    • getJobType

      public JobType getJobType()
      Description copied from interface: ExecutionPlan
      Gets the type of the job.
      Specified by:
      getJobType in interface ExecutionPlan
      Returns:
      the job type
    • setDynamic

      public void setDynamic(boolean dynamic)
    • isDynamic

      public boolean isDynamic()
      Description copied from interface: ExecutionPlan
      Checks if the execution plan is dynamic.
      Specified by:
      isDynamic in interface ExecutionPlan
      Returns:
      true if the execution plan is dynamic; false otherwise
    • enableApproximateLocalRecovery

      public void enableApproximateLocalRecovery(boolean enabled)
    • isApproximateLocalRecoveryEnabled

      public boolean isApproximateLocalRecoveryEnabled()
    • setSavepointRestoreSettings

      public void setSavepointRestoreSettings(SavepointRestoreSettings settings)
      Sets the savepoint restore settings.
      Specified by:
      setSavepointRestoreSettings in interface ExecutionPlan
      Parameters:
      settings - The savepoint restore settings.
    • getSavepointRestoreSettings

      public SavepointRestoreSettings getSavepointRestoreSettings()
      Returns the configured savepoint restore settings.
      Specified by:
      getSavepointRestoreSettings in interface ExecutionPlan
      Returns:
      The configured savepoint restore settings.
    • setExecutionConfig

      public void setExecutionConfig(org.apache.flink.api.common.ExecutionConfig executionConfig) throws IOException
      Sets the execution config. This method eagerly serializes the ExecutionConfig for future RPC transport. Further modification of the referenced ExecutionConfig object will not affect this serialized copy.
      Parameters:
      executionConfig - The ExecutionConfig to be serialized.
      Throws:
      IOException - Thrown if the serialization of the ExecutionConfig fails
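The effect of eager serialization can be shown with plain Java serialization. This is a minimal sketch and not Flink's implementation: SketchConfig stands in for ExecutionConfig, a byte array stands in for SerializedValue, and both class names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical mutable config; stands in for ExecutionConfig. */
final class SketchConfig implements Serializable {
    private static final long serialVersionUID = 1L;
    int parallelism;

    SketchConfig(int parallelism) {
        this.parallelism = parallelism;
    }
}

/** Hypothetical holder that snapshots a config the moment it is set. */
final class EagerSnapshot {
    private byte[] serialized;

    /** Serializes immediately; the holder keeps bytes, not the object reference. */
    void setConfig(SketchConfig config) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(config);
        }
        serialized = bytes.toByteArray();
    }

    /** Rebuilds a fresh object from the stored bytes. */
    SketchConfig readConfig() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            return (SketchConfig) in.readObject();
        }
    }

    /** Demo: snapshot a config, mutate the original, return what the snapshot holds. */
    static int snapshotThenMutate() {
        try {
            SketchConfig config = new SketchConfig(4);
            EagerSnapshot holder = new EagerSnapshot();
            holder.setConfig(config);
            config.parallelism = 16; // changed after the snapshot was taken
            return holder.readConfig().parallelism;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the sketch, the value read back is the one that was present when setConfig was called, so a caller who changes the config afterwards has to set it again for the change to take effect.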
    • setSerializedExecutionConfig

      public void setSerializedExecutionConfig(org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> serializedExecutionConfig)
    • addVertex

      public void addVertex(JobVertex vertex)
      Adds a new task vertex to the job graph if it is not already included.
      Parameters:
      vertex - the new task vertex to be added
    • getVertices

      public Iterable<JobVertex> getVertices()
      Returns an Iterable to iterate all vertices registered with the job graph.
      Returns:
      an Iterable to iterate all vertices registered with the job graph
    • getVerticesAsArray

      public JobVertex[] getVerticesAsArray()
      Returns an array of all job vertices that are registered with the job graph. The order in which the vertices appear in the array is not defined.
      Returns:
      an array of all job vertices that are registered with the job graph
    • getNumberOfVertices

      public int getNumberOfVertices()
      Returns the number of all vertices.
      Returns:
      The number of all vertices.
    • getSlotSharingGroups

      public Set<SlotSharingGroup> getSlotSharingGroups()
    • getCoLocationGroups

      public Set<CoLocationGroup> getCoLocationGroups()
      Returns all CoLocationGroup instances associated with this JobGraph.
      Returns:
      The associated CoLocationGroup instances.
    • setSnapshotSettings

      public void setSnapshotSettings(JobCheckpointingSettings settings)
      Sets the settings for asynchronous snapshots. A value of null means that snapshotting is not enabled.
      Parameters:
      settings - The snapshot settings
    • getCheckpointingSettings

      public JobCheckpointingSettings getCheckpointingSettings()
      Gets the settings for asynchronous snapshots. This method returns null when checkpointing is not enabled.
      Specified by:
      getCheckpointingSettings in interface ExecutionPlan
      Returns:
      The snapshot settings
    • findVertexByID

      public JobVertex findVertexByID(JobVertexID id)
      Searches for a vertex with a matching ID and returns it.
      Parameters:
      id - the ID of the vertex to search for
      Returns:
      the vertex with the matching ID or null if no vertex with such ID could be found
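Because a missing vertex is reported as null rather than as an exception, callers need an explicit check. The following sketch shows one way to make that case visible on the caller side. It is an illustration only: VertexLookupSketch is a hypothetical class, and String stands in for both JobVertexID and JobVertex.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical registry; String stands in for both JobVertexID and JobVertex. */
final class VertexLookupSketch {
    private final Map<String, String> verticesById = new LinkedHashMap<>();

    void register(String id, String vertex) {
        verticesById.put(id, vertex);
    }

    /** Same contract as described above: the match, or null when the ID is unknown. */
    String findVertexByID(String id) {
        return verticesById.get(id);
    }

    /** Caller-side wrapper that turns the null result into an empty Optional. */
    Optional<String> tryFindVertex(String id) {
        return Optional.ofNullable(findVertexByID(id));
    }
}
```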
    • setClasspaths

      public void setClasspaths(List<URL> paths)
      Sets the classpaths required to run the job on a task manager.
      Parameters:
      paths - paths of the directories/JAR files required to run the job on a task manager
    • getClasspaths

      public List<URL> getClasspaths()
      Description copied from interface: ExecutionPlan
      Gets the classpath required for the job.
      Specified by:
      getClasspaths in interface ExecutionPlan
      Returns:
      a list of classpath URLs
    • getMaximumParallelism

      public int getMaximumParallelism()
      Gets the maximum parallelism of all operations in this job graph.
      Specified by:
      getMaximumParallelism in interface ExecutionPlan
      Returns:
      The maximum parallelism of this job graph
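Read literally, the description amounts to taking the largest parallelism found among the vertices. A minimal sketch under that reading follows; MaxParallelismSketch is a hypothetical class, the vertices are reduced to their parallelism values, and the result of -1 for an empty graph is an assumption of the sketch rather than documented behavior.

```java
import java.util.List;

/** Hypothetical helper; each list element is the parallelism of one vertex. */
final class MaxParallelismSketch {
    /** Returns the largest per-vertex parallelism, or -1 if there are no vertices (sketch assumption). */
    static int maximumParallelism(List<Integer> vertexParallelisms) {
        int max = -1;
        for (int parallelism : vertexParallelisms) {
            max = Math.max(max, parallelism);
        }
        return max;
    }
}
```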
    • getVerticesSortedTopologicallyFromSources

      public List<JobVertex> getVerticesSortedTopologicallyFromSources() throws org.apache.flink.api.common.InvalidProgramException
      Throws:
      org.apache.flink.api.common.InvalidProgramException
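The method name describes a topological order that starts at the sources: every vertex appears after all vertices that feed into it. The sketch below shows one standard way to compute such an order (Kahn's algorithm). It is not Flink's implementation; TopoSortSketch is a hypothetical class, vertices are plain strings, and a cyclic input is reported with IllegalStateException in place of the InvalidProgramException declared in the signature above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; vertices are identified by plain strings. */
final class TopoSortSketch {
    /**
     * Kahn's algorithm. The map sends each vertex to the vertices that consume
     * its output; a sink maps to an empty list.
     */
    static List<String> sortFromSources(Map<String, List<String>> edges) {
        // Count, for every vertex, how many inputs have not been emitted yet.
        Map<String, Integer> pendingInputs = new LinkedHashMap<>();
        for (String vertex : edges.keySet()) {
            pendingInputs.put(vertex, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String target : targets) {
                pendingInputs.merge(target, 1, Integer::sum);
            }
        }

        // Sources are the vertices without any input.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pendingInputs.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<String> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.remove();
            sorted.add(vertex);
            List<String> targets = edges.getOrDefault(vertex, Collections.<String>emptyList());
            for (String target : targets) {
                // A target becomes ready once its last pending input was emitted.
                if (pendingInputs.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        // Vertices on a cycle never reach zero pending inputs.
        if (sorted.size() != pendingInputs.size()) {
            throw new IllegalStateException("graph is cyclic");
        }
        return sorted;
    }
}
```

An order of this kind lets a consumer process vertices one by one while knowing that all upstream vertices have already been handled.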
    • addJar

      public void addJar(org.apache.flink.core.fs.Path jar)
      Adds the path of a JAR file required to run the job on a task manager.
      Parameters:
      jar - path of the JAR file required to run the job on a task manager
    • addJars

      public void addJars(List<URL> jarFilesToAttach)
      Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
      Parameters:
      jarFilesToAttach - a list of the URLs of the jar files to attach to the JobGraph.
      Throws:
      RuntimeException - if a jar URL is not valid.
    • getUserJars

      public List<org.apache.flink.core.fs.Path> getUserJars()
      Gets the list of assigned user jar paths.
      Specified by:
      getUserJars in interface ExecutionPlan
      Returns:
      The list of assigned user jar paths
    • addUserArtifact

      public void addUserArtifact(String name, org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file)
      Adds the path of a custom file required to run the job on a task manager.
      Parameters:
      name - a name under which this artifact will be accessible through DistributedCache
      file - the distributed cache entry describing a custom file required to run the job on a task manager
    • getUserArtifacts

      public Map<String,org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts()
      Gets the user artifacts assigned to this job, keyed by the name under which each artifact was added.
      Specified by:
      getUserArtifacts in interface ExecutionPlan
      Returns:
      The map of assigned user artifacts, keyed by artifact name
    • addUserJarBlobKey

      public void addUserJarBlobKey(PermanentBlobKey key)
      Adds the BLOB referenced by the key to the JobGraph's dependencies.
      Specified by:
      addUserJarBlobKey in interface ExecutionPlan
      Parameters:
      key - BLOB key of the JAR file required to run the job on a task manager
    • hasUsercodeJarFiles

      public boolean hasUsercodeJarFiles()
      Checks whether the JobGraph has user code JAR files attached.
      Returns:
      True, if the JobGraph has user code JAR files attached, false otherwise.
    • getUserJarBlobKeys

      public List<PermanentBlobKey> getUserJarBlobKeys()
      Returns a list of BLOB keys referring to the JAR files required to run this job.
      Specified by:
      getUserJarBlobKeys in interface ExecutionPlan
      Returns:
      list of BLOB keys referring to the JAR files required to run this job
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • setUserArtifactBlobKey

      public void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey) throws IOException
      Description copied from interface: ExecutionPlan
      Sets a user artifact blob key for a specified user artifact.
      Specified by:
      setUserArtifactBlobKey in interface ExecutionPlan
      Parameters:
      entryName - the name of the user artifact
      blobKey - the blob key corresponding to the user artifact
      Throws:
      IOException - if an error occurs during the operation
    • setUserArtifactRemotePath

      public void setUserArtifactRemotePath(String entryName, String remotePath)
    • writeUserArtifactEntriesToConfiguration

      public void writeUserArtifactEntriesToConfiguration()
      Description copied from interface: ExecutionPlan
      Writes user artifact entries to the job configuration.
      Specified by:
      writeUserArtifactEntriesToConfiguration in interface ExecutionPlan
    • setJobStatusHooks

      public void setJobStatusHooks(List<org.apache.flink.core.execution.JobStatusHook> hooks)
    • getJobStatusHooks

      public List<org.apache.flink.core.execution.JobStatusHook> getJobStatusHooks()
    • setInitialClientHeartbeatTimeout

      public void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
    • getInitialClientHeartbeatTimeout

      public long getInitialClientHeartbeatTimeout()
      Description copied from interface: ExecutionPlan
      Gets the initial client heartbeat timeout.
      Specified by:
      getInitialClientHeartbeatTimeout in interface ExecutionPlan
      Returns:
      the timeout duration in milliseconds