Class StreamGraph

java.lang.Object
org.apache.flink.streaming.api.graph.StreamGraph
All Implemented Interfaces:
Serializable, org.apache.flink.api.dag.Pipeline, ExecutionPlan

@Internal public class StreamGraph extends Object implements org.apache.flink.api.dag.Pipeline, ExecutionPlan
Class representing the streaming topology. It contains all the information necessary to build the JobGraph for the execution.
  • Constructor Details

    • StreamGraph

      public StreamGraph(org.apache.flink.configuration.Configuration jobConfiguration, org.apache.flink.api.common.ExecutionConfig executionConfig, CheckpointConfig checkpointConfig, SavepointRestoreSettings savepointRestoreSettings)
  • Method Details

    • clear

      public void clear()
      Removes all registered nodes and related graph elements.
    • getExecutionConfig

      public org.apache.flink.api.common.ExecutionConfig getExecutionConfig()
    • getJobConfiguration

      public org.apache.flink.configuration.Configuration getJobConfiguration()
      Description copied from interface: ExecutionPlan
      Gets the job configuration.
      Specified by:
      getJobConfiguration in interface ExecutionPlan
      Returns:
      the job configuration
    • getCheckpointConfig

      public CheckpointConfig getCheckpointConfig()
    • getCheckpointingMode

      public org.apache.flink.core.execution.CheckpointingMode getCheckpointingMode()
    • getCheckpointingMode

      public static org.apache.flink.core.execution.CheckpointingMode getCheckpointingMode(CheckpointConfig checkpointConfig)
    • addJar

      public void addJar(org.apache.flink.core.fs.Path jar)
      Adds the path of a JAR file required to run the job on a task manager.
      Parameters:
      jar - path of the JAR file required to run the job on a task manager
    • getUserJars

      public List<org.apache.flink.core.fs.Path> getUserJars()
      Gets the list of assigned user jar paths.
      Specified by:
      getUserJars in interface ExecutionPlan
      Returns:
      The list of assigned user jar paths
    • createJobCheckpointingSettings

      public void createJobCheckpointingSettings()
    • setSavepointRestoreSettings

      public void setSavepointRestoreSettings(SavepointRestoreSettings savepointRestoreSettings)
      Description copied from interface: ExecutionPlan
      Sets the settings for restoring from a savepoint.
      Specified by:
      setSavepointRestoreSettings in interface ExecutionPlan
      Parameters:
      savepointRestoreSettings - the settings for savepoint restoration
    • getSerializedExecutionConfig

      public org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig()
      Description copied from interface: ExecutionPlan
      Gets the serialized execution configuration.
      Specified by:
      getSerializedExecutionConfig in interface ExecutionPlan
      Returns:
      The serialized execution configuration object
    • getSavepointRestoreSettings

      public SavepointRestoreSettings getSavepointRestoreSettings()
      Description copied from interface: ExecutionPlan
      Gets the settings for restoring from a savepoint.
      Specified by:
      getSavepointRestoreSettings in interface ExecutionPlan
      Returns:
      the savepoint restore settings
    • getJobName

      public String getJobName()
    • setJobName

      public void setJobName(String jobName)
    • getLineageGraph

      public LineageGraph getLineageGraph()
    • setLineageGraph

      public void setLineageGraph(LineageGraph lineageGraph)
    • setStateBackend

      public void setStateBackend(StateBackend backend)
    • getStateBackend

      @VisibleForTesting public StateBackend getStateBackend()
    • setCheckpointStorage

      public void setCheckpointStorage(CheckpointStorage checkpointStorage)
    • getTimerServiceProvider

      public InternalTimeServiceManager.Provider getTimerServiceProvider()
    • setTimerServiceProvider

      public void setTimerServiceProvider(InternalTimeServiceManager.Provider timerServiceProvider)
    • getGlobalStreamExchangeMode

      public GlobalStreamExchangeMode getGlobalStreamExchangeMode()
    • setGlobalStreamExchangeMode

      public void setGlobalStreamExchangeMode(GlobalStreamExchangeMode globalExchangeMode)
    • setSlotSharingGroupResource

      public void setSlotSharingGroupResource(Map<String,ResourceProfile> slotSharingGroupResources)
    • getSlotSharingGroupResource

      public Optional<ResourceProfile> getSlotSharingGroupResource(String groupId)
    • hasFineGrainedResource

      public boolean hasFineGrainedResource()
    • setAllVerticesInSameSlotSharingGroupByDefault

      public void setAllVerticesInSameSlotSharingGroupByDefault(boolean allVerticesInSameSlotSharingGroupByDefault)
      Sets whether to put all vertices into the same slot sharing group by default.
      Parameters:
      allVerticesInSameSlotSharingGroupByDefault - indicates whether to put all vertices into the same slot sharing group by default.
    • isAllVerticesInSameSlotSharingGroupByDefault

      public boolean isAllVerticesInSameSlotSharingGroupByDefault()
      Gets whether to put all vertices into the same slot sharing group by default.
      Returns:
      whether to put all vertices into the same slot sharing group by default.
    • isEnableCheckpointsAfterTasksFinish

      public boolean isEnableCheckpointsAfterTasksFinish()
    • setEnableCheckpointsAfterTasksFinish

      public void setEnableCheckpointsAfterTasksFinish(boolean enableCheckpointsAfterTasksFinish)
    • isChainingEnabled

      public boolean isChainingEnabled()
    • isChainingOfOperatorsWithDifferentMaxParallelismEnabled

      public boolean isChainingOfOperatorsWithDifferentMaxParallelismEnabled()
    • isIterative

      public boolean isIterative()
    • addSource

      public <IN, OUT> void addSource(Integer vertexID, @Nullable String slotSharingGroup, @Nullable String coLocationGroup, SourceOperatorFactory<OUT> operatorFactory, org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo, String operatorName)
    • addLegacySource

      public <IN, OUT> void addLegacySource(Integer vertexID, @Nullable String slotSharingGroup, @Nullable String coLocationGroup, StreamOperatorFactory<OUT> operatorFactory, org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo, String operatorName)
    • addSink

      public <IN, OUT> void addSink(Integer vertexID, @Nullable String slotSharingGroup, @Nullable String coLocationGroup, StreamOperatorFactory<OUT> operatorFactory, org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo, String operatorName)
    • addOperator

      public <IN, OUT> void addOperator(Integer vertexID, @Nullable String slotSharingGroup, @Nullable String coLocationGroup, StreamOperatorFactory<OUT> operatorFactory, org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo, String operatorName)
    • addCoOperator

      public <IN1, IN2, OUT> void addCoOperator(Integer vertexID, String slotSharingGroup, @Nullable String coLocationGroup, StreamOperatorFactory<OUT> taskOperatorFactory, org.apache.flink.api.common.typeinfo.TypeInformation<IN1> in1TypeInfo, org.apache.flink.api.common.typeinfo.TypeInformation<IN2> in2TypeInfo, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo, String operatorName)
    • addMultipleInputOperator

      public <OUT> void addMultipleInputOperator(Integer vertexID, String slotSharingGroup, @Nullable String coLocationGroup, StreamOperatorFactory<OUT> operatorFactory, List<org.apache.flink.api.common.typeinfo.TypeInformation<?>> inTypeInfos, org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo, String operatorName)
    • addNode

      protected StreamNode addNode(Integer vertexID, @Nullable String slotSharingGroup, @Nullable String coLocationGroup, Class<? extends TaskInvokable> vertexClass, @Nullable StreamOperatorFactory<?> operatorFactory, String operatorName)
    • addVirtualSideOutputNode

      public void addVirtualSideOutputNode(Integer originalId, Integer virtualId, org.apache.flink.util.OutputTag outputTag)
      Adds a new virtual node that is used to connect a downstream vertex to only the outputs with the selected side-output OutputTag.
      Parameters:
      originalId - ID of the node that should be connected to.
      virtualId - ID of the virtual node.
      outputTag - The selected side-output OutputTag.
    • addVirtualPartitionNode

      public void addVirtualPartitionNode(Integer originalId, Integer virtualId, StreamPartitioner<?> partitioner, StreamExchangeMode exchangeMode)
      Adds a new virtual node that is used to connect a downstream vertex to an input with a certain partitioning.

      When adding an edge from the virtual node to a downstream node, the connection will be made to the original node, but with the partitioning given here.

      Parameters:
      originalId - ID of the node that should be connected to.
      virtualId - ID of the virtual node.
      partitioner - The partitioner to use for the connection.
      exchangeMode - The exchange mode to use for the connection.
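
      The behaviour described for addVirtualPartitionNode can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not Flink's implementation: the class VirtualPartitionSketch, the string-valued partitioner labels, and the "forward" default are all hypothetical, and the exchange mode is left out for brevity. It shows how an edge added from a virtual node ends up connected to the original node while carrying the partitioning registered on the virtual node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of virtual partition nodes. Hypothetical; not Flink's implementation. */
final class VirtualPartitionSketch {

    /** A physical edge between two real nodes, labelled with a partitioning. */
    static final class Edge {
        final int sourceId;
        final int targetId;
        final String partitioner;

        Edge(int sourceId, int targetId, String partitioner) {
            this.sourceId = sourceId;
            this.targetId = targetId;
            this.partitioner = partitioner;
        }

        @Override
        public String toString() {
            return sourceId + " -> " + targetId + " [" + partitioner + "]";
        }
    }

    // virtualId -> id of the node it stands in for (may itself be virtual)
    private final Map<Integer, Integer> originalIds = new HashMap<>();
    // virtualId -> partitioning requested for that virtual node
    private final Map<Integer, String> partitioners = new HashMap<>();
    final List<Edge> edges = new ArrayList<>();

    void addVirtualPartitionNode(int originalId, int virtualId, String partitioner) {
        if (originalIds.containsKey(virtualId)) {
            throw new IllegalStateException("Virtual node " + virtualId + " already exists");
        }
        originalIds.put(virtualId, originalId);
        partitioners.put(virtualId, partitioner);
    }

    void addEdge(int upstreamId, int downstreamId) {
        String partitioner = null;
        int current = upstreamId;
        // Follow virtual nodes back to the real upstream node.
        while (originalIds.containsKey(current)) {
            if (partitioner == null) {
                // Design choice of this sketch: the virtual node nearest the consumer wins.
                partitioner = partitioners.get(current);
            }
            current = originalIds.get(current);
        }
        // "forward" is an assumed default label for edges without a virtual node.
        edges.add(new Edge(current, downstreamId, partitioner == null ? "forward" : partitioner));
    }

    public static void main(String[] args) {
        VirtualPartitionSketch graph = new VirtualPartitionSketch();
        graph.addVirtualPartitionNode(1, 100, "rebalance");
        graph.addEdge(100, 2); // recorded as an edge from real node 1
        graph.addEdge(1, 3);   // no virtual node involved
        graph.edges.forEach(System.out::println);
    }
}
```

      In this model the virtual node never appears in the edge list; it only records which real node to connect to and which partitioning to apply.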
    • getSlotSharingGroup

      public String getSlotSharingGroup(Integer id)
      Determines the slot sharing group of an operation across virtual nodes.
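
      One way to read "across virtual nodes" is that a lookup on a virtual node is forwarded to the node it stands in for until a real node is reached. The following is a minimal sketch of that idea only; SlotSharingLookupSketch, addRealNode, and addVirtualNode are hypothetical names, and the source does not describe how Flink performs the lookup internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy lookup of a slot sharing group through virtual nodes. Hypothetical model. */
final class SlotSharingLookupSketch {

    // virtualId -> id of the node it stands in for (may itself be virtual)
    private final Map<Integer, Integer> virtualToOriginal = new HashMap<>();
    // real node id -> slot sharing group name
    private final Map<Integer, String> groupOfRealNode = new HashMap<>();

    void addRealNode(int id, String slotSharingGroup) {
        groupOfRealNode.put(id, slotSharingGroup);
    }

    void addVirtualNode(int originalId, int virtualId) {
        virtualToOriginal.put(virtualId, originalId);
    }

    String getSlotSharingGroup(int id) {
        int current = id;
        // Skip over any chain of virtual nodes to reach the real node.
        while (virtualToOriginal.containsKey(current)) {
            current = virtualToOriginal.get(current);
        }
        if (!groupOfRealNode.containsKey(current)) {
            throw new IllegalArgumentException("Unknown node id " + current);
        }
        return groupOfRealNode.get(current);
    }

    public static void main(String[] args) {
        SlotSharingLookupSketch graph = new SlotSharingLookupSketch();
        graph.addRealNode(1, "default");
        graph.addVirtualNode(1, 100);   // virtual node in front of node 1
        graph.addVirtualNode(100, 101); // virtual node in front of another virtual node
        System.out.println(graph.getSlotSharingGroup(101));
    }
}
```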
    • addEdge

      public void addEdge(Integer upStreamVertexID, Integer downStreamVertexID, int typeNumber)
    • addEdge

      public void addEdge(Integer upStreamVertexID, Integer downStreamVertexID, int typeNumber, IntermediateDataSetID intermediateDataSetId)
    • setParallelism

      public void setParallelism(Integer vertexID, int parallelism)
    • isDynamic

      public boolean isDynamic()
      Description copied from interface: ExecutionPlan
      Checks if the execution plan is dynamic.
      Specified by:
      isDynamic in interface ExecutionPlan
      Returns:
      true if the execution plan is dynamic; false otherwise
    • getCheckpointingSettings

      public JobCheckpointingSettings getCheckpointingSettings()
      Description copied from interface: ExecutionPlan
      Gets the settings for job checkpointing.
      Specified by:
      getCheckpointingSettings in interface ExecutionPlan
      Returns:
      the checkpointing settings
    • isEmpty

      public boolean isEmpty()
      Description copied from interface: ExecutionPlan
      Checks if the execution plan is empty.
      Specified by:
      isEmpty in interface ExecutionPlan
      Returns:
      true if the plan is empty; false otherwise
    • setParallelism

      public void setParallelism(Integer vertexId, int parallelism, boolean parallelismConfigured)
    • setDynamic

      public void setDynamic(boolean dynamic)
    • setMaxParallelism

      public void setMaxParallelism(int vertexID, int maxParallelism)
    • setResources

      public void setResources(int vertexID, org.apache.flink.api.common.operators.ResourceSpec minResources, org.apache.flink.api.common.operators.ResourceSpec preferredResources)
    • setManagedMemoryUseCaseWeights

      public void setManagedMemoryUseCaseWeights(int vertexID, Map<org.apache.flink.core.memory.ManagedMemoryUseCase,Integer> operatorScopeUseCaseWeights, Set<org.apache.flink.core.memory.ManagedMemoryUseCase> slotScopeUseCases)
    • setOneInputStateKey

      public void setOneInputStateKey(Integer vertexID, org.apache.flink.api.java.functions.KeySelector<?,?> keySelector, org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer)
    • setTwoInputStateKey

      public void setTwoInputStateKey(Integer vertexID, org.apache.flink.api.java.functions.KeySelector<?,?> keySelector1, org.apache.flink.api.java.functions.KeySelector<?,?> keySelector2, org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer)
    • setMultipleInputStateKey

      public void setMultipleInputStateKey(Integer vertexID, List<org.apache.flink.api.java.functions.KeySelector<?,?>> keySelectors, org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer)
    • setBufferTimeout

      public void setBufferTimeout(Integer vertexID, long bufferTimeout)
    • setSerializers

      public void setSerializers(Integer vertexID, org.apache.flink.api.common.typeutils.TypeSerializer<?> in1, org.apache.flink.api.common.typeutils.TypeSerializer<?> in2, org.apache.flink.api.common.typeutils.TypeSerializer<?> out)
    • setInputFormat

      public void setInputFormat(Integer vertexID, org.apache.flink.api.common.io.InputFormat<?,?> inputFormat)
    • setOutputFormat

      public void setOutputFormat(Integer vertexID, org.apache.flink.api.common.io.OutputFormat<?> outputFormat)
    • setTransformationUID

      public void setTransformationUID(Integer nodeId, String transformationId)
    • getStreamNode

      public StreamNode getStreamNode(Integer vertexID)
    • getVertexIDs

      protected Collection<? extends Integer> getVertexIDs()
    • getStreamEdges

      @VisibleForTesting public List<StreamEdge> getStreamEdges(int sourceId)
    • getStreamEdges

      public List<StreamEdge> getStreamEdges(int sourceId, int targetId)
    • getStreamEdgesOrThrow

      @VisibleForTesting @Deprecated public List<StreamEdge> getStreamEdgesOrThrow(int sourceId, int targetId)
      Deprecated.
    • getSourceIDs

      public Collection<Integer> getSourceIDs()
    • getSinkIDs

      public Collection<Integer> getSinkIDs()
    • getStreamNodes

      public Collection<StreamNode> getStreamNodes()
    • getBrokerID

      public String getBrokerID(Integer vertexID)
    • getLoopTimeout

      public long getLoopTimeout(Integer vertexID)
    • getSourceVertex

      public StreamNode getSourceVertex(StreamEdge edge)
    • getTargetVertex

      public StreamNode getTargetVertex(StreamEdge edge)
    • getJobGraph

      @VisibleForTesting public JobGraph getJobGraph()
      Gets the assembled JobGraph with a random JobID.
    • getJobGraph

      public JobGraph getJobGraph(ClassLoader userClassLoader)
    • getJobGraph

      public JobGraph getJobGraph(ClassLoader userClassLoader, @Nullable org.apache.flink.api.common.JobID jobID)
      Gets the assembled JobGraph with a specified JobID.
    • getStreamingPlanAsJSON

      public String getStreamingPlanAsJSON()
    • setJobType

      public void setJobType(JobType jobType)
    • getName

      public String getName()
      Description copied from interface: ExecutionPlan
      Gets the name of the job.
      Specified by:
      getName in interface ExecutionPlan
      Returns:
      the job name
    • getJobType

      public JobType getJobType()
      Description copied from interface: ExecutionPlan
      Gets the type of the job.
      Specified by:
      getJobType in interface ExecutionPlan
      Returns:
      the job type
    • isAutoParallelismEnabled

      public boolean isAutoParallelismEnabled()
    • setAutoParallelismEnabled

      public void setAutoParallelismEnabled(boolean autoParallelismEnabled)
    • getVertexDescriptionMode

      public org.apache.flink.configuration.PipelineOptions.VertexDescriptionMode getVertexDescriptionMode()
    • setVertexDescriptionMode

      public void setVertexDescriptionMode(org.apache.flink.configuration.PipelineOptions.VertexDescriptionMode mode)
    • setVertexNameIncludeIndexPrefix

      public void setVertexNameIncludeIndexPrefix(boolean includePrefix)
    • isVertexNameIncludeIndexPrefix

      public boolean isVertexNameIncludeIndexPrefix()
    • registerJobStatusHook

      public void registerJobStatusHook(org.apache.flink.core.execution.JobStatusHook hook)
      Registers the JobStatusHook.
    • getJobStatusHooks

      public List<org.apache.flink.core.execution.JobStatusHook> getJobStatusHooks()
    • setSupportsConcurrentExecutionAttempts

      public void setSupportsConcurrentExecutionAttempts(Integer vertexId, boolean supportsConcurrentExecutionAttempts)
    • setAttribute

      public void setAttribute(Integer vertexId, org.apache.flink.api.common.attribute.Attribute attribute)
    • setJobId

      public void setJobId(org.apache.flink.api.common.JobID jobId)
    • getJobID

      public org.apache.flink.api.common.JobID getJobID()
      Description copied from interface: ExecutionPlan
      Gets the unique identifier of the job.
      Specified by:
      getJobID in interface ExecutionPlan
      Returns:
      the job id
    • setClasspath

      public void setClasspath(List<URL> paths)
      Sets the classpath required to run the job on a task manager.
      Parameters:
      paths - paths of the directories/JAR files required to run the job on a task manager
    • getClasspath

      public List<URL> getClasspath()
    • addJars

      public void addJars(List<URL> jarFilesToAttach)
      Adds the given jar files to the JobGraph via JobGraph.addJar(org.apache.flink.core.fs.Path).
      Parameters:
      jarFilesToAttach - a list of the URLs of the jar files to attach to the JobGraph.
      Throws:
      RuntimeException - if a jar URL is not valid.
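
      The contract above (a list of URLs in, a RuntimeException if one of them is not valid) can be sketched with the JDK alone. This is an illustration under assumptions, not the Flink code: JarUrlSketch, toJarUris, and parse are hypothetical names, and java.net.URI stands in for org.apache.flink.core.fs.Path. It relies on the fact that a java.net.URL may hold a string that is not a legal URI, in which case URL.toURI() throws the checked URISyntaxException.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy conversion of jar URLs that rethrows invalid URLs unchecked. Hypothetical model. */
final class JarUrlSketch {

    /** Converts each URL to a URI, wrapping the checked exception in a RuntimeException. */
    static List<URI> toJarUris(List<URL> jarUrls) {
        List<URI> result = new ArrayList<>();
        for (URL url : jarUrls) {
            try {
                result.add(url.toURI());
            } catch (URISyntaxException e) {
                throw new RuntimeException("Invalid jar URL: " + url, e);
            }
        }
        return result;
    }

    /** Helper for the demo: builds a URL without forcing callers to handle a checked exception. */
    static URL parse(String spec) {
        try {
            return new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toJarUris(Arrays.asList(parse("file:/tmp/job.jar"))));
        try {
            // The space makes this a valid URL object but not a valid URI.
            toJarUris(Arrays.asList(parse("file:/tmp/my job.jar")));
        } catch (RuntimeException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```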
    • getUserJarBlobKeys

      public List<PermanentBlobKey> getUserJarBlobKeys()
      Returns a list of BLOB keys referring to the JAR files required to run this job.
      Specified by:
      getUserJarBlobKeys in interface ExecutionPlan
      Returns:
      list of BLOB keys referring to the JAR files required to run this job
    • getClasspaths

      public List<URL> getClasspaths()
      Description copied from interface: ExecutionPlan
      Gets the classpath required for the job.
      Specified by:
      getClasspaths in interface ExecutionPlan
      Returns:
      a list of classpath URLs
    • addUserArtifact

      public void addUserArtifact(String name, org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file)
    • getUserArtifacts

      public Map<String,org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts()
      Description copied from interface: ExecutionPlan
      Gets the user artifacts associated with the job.
      Specified by:
      getUserArtifacts in interface ExecutionPlan
      Returns:
      a map of user artifacts
    • addUserJarBlobKey

      public void addUserJarBlobKey(PermanentBlobKey key)
      Description copied from interface: ExecutionPlan
      Adds a blob key corresponding to a user JAR.
      Specified by:
      addUserJarBlobKey in interface ExecutionPlan
      Parameters:
      key - the blob key to add
    • setUserArtifactBlobKey

      public void setUserArtifactBlobKey(String entryName, PermanentBlobKey blobKey) throws IOException
      Description copied from interface: ExecutionPlan
      Sets a user artifact blob key for a specified user artifact.
      Specified by:
      setUserArtifactBlobKey in interface ExecutionPlan
      Parameters:
      entryName - the name of the user artifact
      blobKey - the blob key corresponding to the user artifact
      Throws:
      IOException - if an error occurs during the operation
    • writeUserArtifactEntriesToConfiguration

      public void writeUserArtifactEntriesToConfiguration()
      Description copied from interface: ExecutionPlan
      Writes user artifact entries to the job configuration.
      Specified by:
      writeUserArtifactEntriesToConfiguration in interface ExecutionPlan
    • getMaximumParallelism

      public int getMaximumParallelism()
      Description copied from interface: ExecutionPlan
      Gets the maximum parallelism level for the job.
      Specified by:
      getMaximumParallelism in interface ExecutionPlan
      Returns:
      the maximum parallelism
    • setInitialClientHeartbeatTimeout

      public void setInitialClientHeartbeatTimeout(long initialClientHeartbeatTimeout)
    • getInitialClientHeartbeatTimeout

      public long getInitialClientHeartbeatTimeout()
      Description copied from interface: ExecutionPlan
      Gets the initial client heartbeat timeout.
      Specified by:
      getInitialClientHeartbeatTimeout in interface ExecutionPlan
      Returns:
      the timeout duration in milliseconds
    • isPartialResourceConfigured

      public boolean isPartialResourceConfigured()
      Description copied from interface: ExecutionPlan
      Checks if partial resource configuration is specified.
      Specified by:
      isPartialResourceConfigured in interface ExecutionPlan
      Returns:
      true if partial resource configuration is set; false otherwise
    • serializeUserDefinedInstances

      public void serializeUserDefinedInstances() throws IOException
      Throws:
      IOException
    • deserializeUserDefinedInstances

      public void deserializeUserDefinedInstances(ClassLoader userClassLoader, Executor serializationExecutor) throws Exception
      Throws:
      Exception
    • getStreamNodesSortedTopologicallyFromSources

      public List<StreamNode> getStreamNodesSortedTopologicallyFromSources() throws org.apache.flink.api.common.InvalidProgramException
      Throws:
      org.apache.flink.api.common.InvalidProgramException
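
      The method name indicates an ordering in which every node appears after the nodes that feed it, starting from the sources. The source does not say which algorithm Flink uses or exactly when InvalidProgramException is thrown, so the following is only a sketch of one standard approach (Kahn's algorithm) on plain integer IDs. TopologicalSortSketch and sortFromSources are hypothetical names, and IllegalStateException is a stand-in for the Flink exception in the cyclic case.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Kahn's algorithm on integer node IDs. Hypothetical model, not Flink's implementation. */
final class TopologicalSortSketch {

    /** outEdges maps a node ID to the IDs of its direct downstream nodes. */
    static List<Integer> sortFromSources(Map<Integer, List<Integer>> outEdges) {
        // Count incoming edges for every node that appears as a key or as a target.
        Map<Integer, Integer> inDegree = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> entry : outEdges.entrySet()) {
            inDegree.putIfAbsent(entry.getKey(), 0);
            for (Integer target : entry.getValue()) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }

        // Nodes without incoming edges are the sources; visit them in ascending ID order.
        Deque<Integer> ready = new ArrayDeque<>();
        for (Map.Entry<Integer, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<Integer> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            Integer node = ready.poll();
            sorted.add(node);
            for (Integer target : outEdges.getOrDefault(node, Collections.emptyList())) {
                // A node becomes ready once all of its inputs have been emitted.
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        if (sorted.size() != inDegree.size()) {
            // Some nodes never became ready, so the graph contains a cycle.
            throw new IllegalStateException("Graph contains a cycle");
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> outEdges = new HashMap<>();
        outEdges.put(1, Arrays.asList(2, 3)); // source 1 feeds nodes 2 and 3
        outEdges.put(2, Arrays.asList(4));
        outEdges.put(3, Arrays.asList(4));    // node 4 is the sink
        System.out.println(sortFromSources(outEdges));
    }
}
```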
    • serializeAndSaveWatermarkDeclarations

      public void serializeAndSaveWatermarkDeclarations()
    • getSerializedWatermarkDeclarations

      public byte[] getSerializedWatermarkDeclarations()
      Gets the serialized watermark declarations. The returned value may be null.
    • toString

      public String toString()
      Overrides:
      toString in class Object