Class JobGraph
java.lang.Object
org.apache.flink.runtime.jobgraph.JobGraph
- All Implemented Interfaces:
Serializable, ExecutionPlan
The JobGraph represents a Flink dataflow program, at the low level that the JobManager accepts.
All programs from higher level APIs are transformed into JobGraphs.
The JobGraph is a graph of vertices and intermediate results that are connected together to form a DAG. Note that iterations (feedback edges) are currently not encoded inside the JobGraph but inside certain special vertices that establish the feedback channel amongst themselves.
The JobGraph defines the job-wide configuration settings, while each vertex and intermediate result define the characteristics of the concrete operation and intermediate data.
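The structure described above (vertices connected through intermediate results) can be pictured with a small plain-Java model. This is only a sketch under stated assumptions: `DagSketch`, `Vertex`, `Result`, `connect` and `downstreamNames` are hypothetical names chosen for illustration and are not Flink classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Flink code: a vertex produces intermediate results,
// and each intermediate result is consumed by downstream vertices.
class DagSketch {

    static class Vertex {
        final String name;
        final List<Result> producedResults = new ArrayList<>();
        final List<Result> inputs = new ArrayList<>();

        Vertex(String name) {
            this.name = name;
        }
    }

    static class Result {
        final Vertex producer;
        final List<Vertex> consumers = new ArrayList<>();

        Result(Vertex producer) {
            this.producer = producer;
        }
    }

    // Connects two vertices through a new intermediate result.
    static Result connect(Vertex upstream, Vertex downstream) {
        Result result = new Result(upstream);
        upstream.producedResults.add(result);
        result.consumers.add(downstream);
        downstream.inputs.add(result);
        return result;
    }

    // Names of the vertices that consume any result of the given vertex.
    static List<String> downstreamNames(Vertex vertex) {
        List<String> names = new ArrayList<>();
        for (Result result : vertex.producedResults) {
            for (Vertex consumer : result.consumers) {
                names.add(consumer.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Vertex source = new Vertex("source");
        Vertex map = new Vertex("map");
        Vertex sink = new Vertex("sink");
        connect(source, map);
        connect(map, sink);
        System.out.println(downstreamNames(source)); // prints [map]
    }
}
```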
-
Constructor Summary
Constructors
- Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID.
- Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig).
- Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices.
-
Method Summary
Modifier and Type, Method: Description
void addJar(org.apache.flink.core.fs.Path jar): Adds the path of a JAR file required to run the job on a task manager.
void addJars: Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
void addUserArtifact(String name, org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file): Adds the path of a custom file required to run the job on a task manager.
void addUserJarBlobKey: Adds the BLOB referenced by the key to the JobGraph's dependencies.
void addVertex: Adds a new task vertex to the job graph if it is not already included.
void enableApproximateLocalRecovery(boolean enabled)
findVertexByID: Searches for a vertex with a matching ID and returns it.
getCheckpointingSettings: Gets the settings for asynchronous snapshots.
getClasspaths: Gets the classpath required for the job.
getCoLocationGroups: Returns all CoLocationGroup instances associated with this JobGraph.
long getInitialClientHeartbeatTimeout(): Gets the initial client heartbeat timeout.
org.apache.flink.configuration.Configuration getJobConfiguration(): Returns the configuration object for this job.
org.apache.flink.api.common.JobID getJobID(): Returns the ID of the job.
List<org.apache.flink.core.execution.JobStatusHook> getJobStatusHooks
getJobType: Gets the type of the job.
int getMaximumParallelism(): Gets the maximum parallelism of all operations in this job graph.
getName(): Returns the name assigned to the job graph.
int getNumberOfVertices(): Returns the number of all vertices.
getSavepointRestoreSettings: Returns the configured savepoint restore settings.
org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig(): Returns the ExecutionConfig.
getSlotSharingGroups
Map<String,org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts(): Gets the user artifacts assigned to this job, keyed by artifact name.
getUserJarBlobKeys: Returns a set of BLOB keys referring to the JAR files required to run this job.
List<org.apache.flink.core.fs.Path> getUserJars: Gets the list of assigned user jar paths.
getVertices: Returns an Iterable to iterate all vertices registered with the job graph.
getVerticesAsArray: Returns an array of all job vertices that are registered with the job graph.
List<JobVertex> getVerticesSortedTopologicallyFromSources()
boolean hasUsercodeJarFiles(): Checks whether the JobGraph has user code JAR files attached.
boolean isApproximateLocalRecoveryEnabled()
boolean isDynamic(): Checks if the execution plan is dynamic.
boolean isEmpty(): Checks if the execution plan is empty.
boolean isPartialResourceConfigured(): Checks if partial resource configuration is specified.
void setClasspaths(List<URL> paths): Sets the classpaths required to run the job on a task manager.
void setDynamic(boolean dynamic)
void setExecutionConfig(org.apache.flink.api.common.ExecutionConfig executionConfig): Sets the execution config.
void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
void setJobConfiguration(org.apache.flink.configuration.Configuration jobConfiguration)
void setJobID(org.apache.flink.api.common.JobID jobID): Sets the ID of the job.
void setJobStatusHooks(List<org.apache.flink.core.execution.JobStatusHook> hooks)
void setJobType(JobType type)
void setSavepointRestoreSettings: Sets the savepoint restore settings.
void setSerializedExecutionConfig(org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> serializedExecutionConfig)
void setSnapshotSettings(JobCheckpointingSettings settings): Sets the settings for asynchronous snapshots.
void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey): Sets a user artifact blob key for a specified user artifact.
void setUserArtifactRemotePath(String entryName, String remotePath)
toString()
void writeUserArtifactEntriesToConfiguration(): Writes user artifact entries to the job configuration.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.apache.flink.streaming.api.graph.ExecutionPlan
isCheckpointingEnabled
-
Constructor Details
-
JobGraph
Constructs a new job graph with the given name, the given ExecutionConfig, and a random job ID. The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
jobName - The name of the job.
-
JobGraph
Constructs a new job graph with the given job ID (or a random ID, if null is passed), the given name and the given execution configuration (see ExecutionConfig). The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
jobId - The ID of the job. A random ID is generated if null is passed.
jobName - The name of the job.
-
JobGraph
public JobGraph(@Nullable org.apache.flink.api.common.JobID jobId, String jobName, JobVertex... vertices)
Constructs a new job graph with the given name, the given ExecutionConfig, the given jobId or a random one if null supplied, and the given job vertices. The ExecutionConfig will be serialized and can't be modified afterwards.
- Parameters:
jobId - The ID of the job. A random ID is generated if null is passed.
jobName - The name of the job.
vertices - The vertices to add to the graph.
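The null-handling rule for the job ID stated in the constructor descriptions can be sketched in plain Java. This is an illustration only: `java.util.UUID` stands in for Flink's `JobID`, and `JobIdDefaulting` and `resolveJobId` are hypothetical names, not part of the Flink API.

```java
import java.util.UUID;

// Hypothetical stand-in: UUID plays the role of org.apache.flink.api.common.JobID.
class JobIdDefaulting {

    // Returns the given ID, or a freshly generated random ID if null was passed.
    static UUID resolveJobId(UUID jobId) {
        return jobId != null ? jobId : UUID.randomUUID();
    }

    public static void main(String[] args) {
        UUID explicit = UUID.fromString("00000000-0000-0000-0000-000000000001");
        System.out.println(resolveJobId(explicit)); // the explicit ID is kept
        System.out.println(resolveJobId(null));     // a random ID is generated
    }
}
```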
-
-
Method Details
-
getJobID
public org.apache.flink.api.common.JobID getJobID()
Returns the ID of the job.
- Specified by:
getJobID in interface ExecutionPlan
- Returns:
- the ID of the job
-
setJobID
public void setJobID(org.apache.flink.api.common.JobID jobID)
Sets the ID of the job.
-
getName
Returns the name assigned to the job graph.
- Specified by:
getName in interface ExecutionPlan
- Returns:
- the name assigned to the job graph
-
isPartialResourceConfigured
public boolean isPartialResourceConfigured()
Description copied from interface: ExecutionPlan
Checks if partial resource configuration is specified.
- Specified by:
isPartialResourceConfigured in interface ExecutionPlan
- Returns:
- true if partial resource configuration is set; false otherwise
-
isEmpty
public boolean isEmpty()
Description copied from interface: ExecutionPlan
Checks if the execution plan is empty.
- Specified by:
isEmpty in interface ExecutionPlan
- Returns:
- true if the plan is empty; false otherwise
-
setJobConfiguration
public void setJobConfiguration(org.apache.flink.configuration.Configuration jobConfiguration)
-
getJobConfiguration
public org.apache.flink.configuration.Configuration getJobConfiguration()
Returns the configuration object for this job. Job-wide parameters should be set into that configuration object.
- Specified by:
getJobConfiguration in interface ExecutionPlan
- Returns:
- The configuration object for this job.
-
getSerializedExecutionConfig
public org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig()
Returns the ExecutionConfig.
- Specified by:
getSerializedExecutionConfig in interface ExecutionPlan
- Returns:
- ExecutionConfig
-
setJobType
-
getJobType
Description copied from interface: ExecutionPlan
Gets the type of the job.
- Specified by:
getJobType in interface ExecutionPlan
- Returns:
- the job type
-
setDynamic
public void setDynamic(boolean dynamic)
-
isDynamic
public boolean isDynamic()
Description copied from interface: ExecutionPlan
Checks if the execution plan is dynamic.
- Specified by:
isDynamic in interface ExecutionPlan
- Returns:
- true if the execution plan is dynamic; false otherwise
-
enableApproximateLocalRecovery
public void enableApproximateLocalRecovery(boolean enabled)
-
isApproximateLocalRecoveryEnabled
public boolean isApproximateLocalRecoveryEnabled()
-
setSavepointRestoreSettings
Sets the savepoint restore settings.
- Specified by:
setSavepointRestoreSettings in interface ExecutionPlan
- Parameters:
settings - The savepoint restore settings.
-
getSavepointRestoreSettings
Returns the configured savepoint restore settings.
- Specified by:
getSavepointRestoreSettings in interface ExecutionPlan
- Returns:
- The configured savepoint restore settings.
-
setExecutionConfig
public void setExecutionConfig(org.apache.flink.api.common.ExecutionConfig executionConfig) throws IOException
Sets the execution config. This method eagerly serializes the ExecutionConfig for future RPC transport. Further modification of the referenced ExecutionConfig object will not affect this serialized copy.
- Parameters:
executionConfig - The ExecutionConfig to be serialized.
- Throws:
IOException - Thrown if the serialization of the ExecutionConfig fails
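The effect of eager serialization (later changes to the original object do not reach the stored copy) can be demonstrated with standard Java serialization. This is a minimal sketch under stated assumptions: `DemoConfig` stands in for `ExecutionConfig`, a `byte[]` stands in for the `SerializedValue`, and all class and method names are hypothetical; Flink's own serialization mechanics may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-ins, not Flink classes: DemoConfig plays the role of
// ExecutionConfig, and the byte[] plays the role of the SerializedValue.
class EagerSerializationSketch {

    static class DemoConfig implements Serializable {
        private static final long serialVersionUID = 1L;
        int parallelism;

        DemoConfig(int parallelism) {
            this.parallelism = parallelism;
        }
    }

    private byte[] serializedConfig;

    // Serializes the config immediately, so the stored bytes are a snapshot.
    void setConfig(DemoConfig config) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(config);
        }
        serializedConfig = bytes.toByteArray();
    }

    // Deserializes a fresh copy from the stored snapshot.
    DemoConfig readConfig() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(serializedConfig))) {
            return (DemoConfig) in.readObject();
        }
    }

    // Sets a config, mutates the original afterwards, and returns the stored value.
    static int demo() {
        try {
            EagerSerializationSketch graph = new EagerSerializationSketch();
            DemoConfig config = new DemoConfig(4);
            graph.setConfig(config);
            config.parallelism = 16; // later change to the original object
            return graph.readConfig().parallelism; // still the value at set time
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 4
    }
}
```

A caller that needs a changed configuration to take effect would therefore have to call the setter again after modifying the object.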
-
setSerializedExecutionConfig
public void setSerializedExecutionConfig(org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> serializedExecutionConfig)
-
addVertex
Adds a new task vertex to the job graph if it is not already included.
- Parameters:
vertex - the new task vertex to be added
-
getVertices
Returns an Iterable to iterate all vertices registered with the job graph.
- Returns:
- an Iterable to iterate all vertices registered with the job graph
-
getVerticesAsArray
Returns an array of all job vertices that are registered with the job graph. The order in which the vertices appear in the array is not defined.
- Returns:
- an array of all job vertices that are registered with the job graph
-
getNumberOfVertices
public int getNumberOfVertices()
Returns the number of all vertices.
- Returns:
- The number of all vertices.
-
getSlotSharingGroups
-
getCoLocationGroups
Returns all CoLocationGroup instances associated with this JobGraph.
- Returns:
- The associated CoLocationGroup instances.
-
setSnapshotSettings
Sets the settings for asynchronous snapshots. A value of null means that snapshotting is not enabled.
- Parameters:
settings - The snapshot settings
-
getCheckpointingSettings
Gets the settings for asynchronous snapshots. This method returns null when checkpointing is not enabled.
- Specified by:
getCheckpointingSettings in interface ExecutionPlan
- Returns:
- The snapshot settings
-
findVertexByID
Searches for a vertex with a matching ID and returns it.
- Parameters:
id - the ID of the vertex to search for
- Returns:
- the vertex with the matching ID or null if no vertex with such ID could be found
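The lookup contract (a vertex for a known ID, null otherwise) together with the add-if-absent wording of addVertex can be sketched with an ID-keyed map. This is an illustration only: `VertexRegistrySketch` and its methods are hypothetical, String IDs and names replace Flink's vertex types, and silently ignoring a duplicate ID is an assumption of this sketch; the real class may reject duplicates instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the vertex bookkeeping; String IDs and names
// replace Flink's vertex ID and vertex types.
class VertexRegistrySketch {

    private final Map<String, String> verticesById = new LinkedHashMap<>();

    // Registers the vertex only if no vertex with this ID is present yet and
    // returns true if it was added. Assumption: duplicates are ignored here;
    // the real class may reject them instead.
    boolean addVertex(String id, String name) {
        return verticesById.putIfAbsent(id, name) == null;
    }

    // Returns the vertex with the matching ID, or null if there is none.
    String findVertexById(String id) {
        return verticesById.get(id);
    }

    // Returns the number of registered vertices.
    int getNumberOfVertices() {
        return verticesById.size();
    }

    public static void main(String[] args) {
        VertexRegistrySketch graph = new VertexRegistrySketch();
        graph.addVertex("v1", "source");
        graph.addVertex("v1", "duplicate"); // ignored, ID already present
        System.out.println(graph.findVertexById("v1"));      // prints source
        System.out.println(graph.findVertexById("missing")); // prints null
        System.out.println(graph.getNumberOfVertices());     // prints 1
    }
}
```

Because a missing ID yields null rather than an exception, callers need a null check before using the result.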
-
setClasspaths
Sets the classpaths required to run the job on a task manager.
- Parameters:
paths - paths of the directories/JAR files required to run the job on a task manager
-
getClasspaths
Description copied from interface: ExecutionPlan
Gets the classpath required for the job.
- Specified by:
getClasspaths in interface ExecutionPlan
- Returns:
- a list of classpath URLs
-
getMaximumParallelism
public int getMaximumParallelism()
Gets the maximum parallelism of all operations in this job graph.
- Specified by:
getMaximumParallelism in interface ExecutionPlan
- Returns:
- The maximum parallelism of this job graph
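Read literally, the value is the largest parallelism found among the graph's operations. A minimal sketch of that reduction follows; `MaxParallelismSketch` and `maximumParallelism` are hypothetical names, the per-vertex parallelisms are passed in as plain integers, and returning -1 for an empty graph is an assumption of this sketch rather than documented behavior.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Flink code.
class MaxParallelismSketch {

    // Returns the largest parallelism among the vertices, or -1 for an empty
    // graph (the -1 default is an assumption of this sketch).
    static int maximumParallelism(List<Integer> vertexParallelisms) {
        int max = -1;
        for (int parallelism : vertexParallelisms) {
            max = Math.max(max, parallelism);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(maximumParallelism(Arrays.asList(2, 8, 4))); // prints 8
    }
}
```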
-
getVerticesSortedTopologicallyFromSources
public List<JobVertex> getVerticesSortedTopologicallyFromSources() throws org.apache.flink.api.common.InvalidProgramException
- Throws:
org.apache.flink.api.common.InvalidProgramException
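The method name suggests an ordering in which every vertex appears after all of its inputs, starting at the sources. The page does not say how this is computed, so the following sketch swaps in Kahn's algorithm on a plain adjacency map as one way to produce such an order. All names are hypothetical, vertices are plain strings, and IllegalStateException stands in for InvalidProgramException as the signal that no valid order exists (assumed here to mean a cycle).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch using Kahn's algorithm; not Flink's implementation.
class TopologicalSortSketch {

    // edges maps each vertex to the vertices that consume its output.
    static List<String> sortFromSources(Map<String, List<String>> edges) {
        // Count the not yet emitted inputs of every vertex.
        Map<String, Integer> pendingInputs = new LinkedHashMap<>();
        for (String vertex : edges.keySet()) {
            pendingInputs.put(vertex, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String target : targets) {
                pendingInputs.merge(target, 1, Integer::sum);
            }
        }

        // Sources are the vertices without inputs.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : pendingInputs.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        // Emit a vertex once all of its inputs have been emitted.
        List<String> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            String vertex = ready.poll();
            sorted.add(vertex);
            for (String target : edges.getOrDefault(vertex, List.of())) {
                if (pendingInputs.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        // Vertices on a cycle never become ready.
        if (sorted.size() != pendingInputs.size()) {
            throw new IllegalStateException("The graph contains a cycle.");
        }
        return sorted;
    }

    // Sorts a small graph: source -> map -> sink, plus source -> sink.
    static List<String> demo() {
        Map<String, List<String>> edges = new LinkedHashMap<>();
        edges.put("sink", List.of());
        edges.put("map", List.of("sink"));
        edges.put("source", List.of("map", "sink"));
        return sortFromSources(edges);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [source, map, sink]
    }
}
```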
-
addJar
public void addJar(org.apache.flink.core.fs.Path jar)
Adds the path of a JAR file required to run the job on a task manager.
- Parameters:
jar - path of the JAR file required to run the job on a task manager
-
addJars
Adds the given jar files to the JobGraph via addJar(org.apache.flink.core.fs.Path).
- Parameters:
jarFilesToAttach - a list of the URLs of the jar files to attach to the job graph.
- Throws:
RuntimeException - if a jar URL is not valid.
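The delegation described here (each URL is converted and passed to the single-jar method, and an invalid URL surfaces as a RuntimeException) can be sketched with JDK types only. This is a sketch under stated assumptions: `java.net.URI` stands in for Flink's `Path`, and `AddJarsSketch` with its methods and exception message is hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: java.net.URI plays the role of org.apache.flink.core.fs.Path.
class AddJarsSketch {

    private final List<URI> userJars = new ArrayList<>();

    // Registers a single jar location.
    void addJar(URI jar) {
        userJars.add(jar);
    }

    // Converts every URL and delegates to addJar; a URL that cannot be
    // converted is reported as an unchecked exception.
    void addJars(List<URL> jarFilesToAttach) {
        for (URL jar : jarFilesToAttach) {
            try {
                addJar(jar.toURI());
            } catch (URISyntaxException e) {
                throw new RuntimeException("Invalid jar URL: " + jar, e);
            }
        }
    }

    // Attaches two jars and returns the registered locations.
    static List<URI> demo() {
        try {
            AddJarsSketch graph = new AddJarsSketch();
            graph.addJars(List.of(
                    URI.create("file:///tmp/job.jar").toURL(),
                    URI.create("file:///tmp/udfs.jar").toURL()));
            return graph.userJars;
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (URI jar : demo()) {
            System.out.println(jar.getPath());
        }
    }
}
```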
-
getUserJars
Gets the list of assigned user jar paths.
- Specified by:
getUserJars in interface ExecutionPlan
- Returns:
- The list of assigned user jar paths
-
addUserArtifact
public void addUserArtifact(String name, org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file)
Adds the path of a custom file required to run the job on a task manager.
- Parameters:
name - a name under which this artifact will be accessible through DistributedCache
file - path of a custom file required to run the job on a task manager
-
getUserArtifacts
public Map<String,org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts()
Gets the user artifacts assigned to this job, keyed by artifact name.
- Specified by:
getUserArtifacts in interface ExecutionPlan
- Returns:
- The map of assigned user artifacts
-
addUserJarBlobKey
Adds the BLOB referenced by the key to the JobGraph's dependencies.
- Specified by:
addUserJarBlobKey in interface ExecutionPlan
- Parameters:
key - BLOB key of the JAR file required to run the job on a task manager
-
hasUsercodeJarFiles
public boolean hasUsercodeJarFiles()
Checks whether the JobGraph has user code JAR files attached.
- Returns:
- True, if the JobGraph has user code JAR files attached, false otherwise.
-
getUserJarBlobKeys
Returns a set of BLOB keys referring to the JAR files required to run this job.
- Specified by:
getUserJarBlobKeys in interface ExecutionPlan
- Returns:
- set of BLOB keys referring to the JAR files required to run this job
-
toString
-
setUserArtifactBlobKey
Description copied from interface: ExecutionPlan
Sets a user artifact blob key for a specified user artifact.
- Specified by:
setUserArtifactBlobKey in interface ExecutionPlan
- Parameters:
entryName - the name of the user artifact
blobKey - the blob key corresponding to the user artifact
- Throws:
IOException - if an error occurs during the operation
-
setUserArtifactRemotePath
-
writeUserArtifactEntriesToConfiguration
public void writeUserArtifactEntriesToConfiguration()
Description copied from interface: ExecutionPlan
Writes user artifact entries to the job configuration.
- Specified by:
writeUserArtifactEntriesToConfiguration in interface ExecutionPlan
-
setJobStatusHooks
-
getJobStatusHooks
-
setInitialClientHeartbeatTimeout
public void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
-
getInitialClientHeartbeatTimeout
public long getInitialClientHeartbeatTimeout()
Description copied from interface: ExecutionPlan
Gets the initial client heartbeat timeout.
- Specified by:
getInitialClientHeartbeatTimeout in interface ExecutionPlan
- Returns:
- the timeout duration in milliseconds
-