Class BeamPythonFunctionRunner
java.lang.Object
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner
- All Implemented Interfaces:
AutoCloseable, PythonFunctionRunner
- Direct Known Subclasses:
BeamDataStreamPythonFunctionRunner, BeamTablePythonFunctionRunner
@Internal
public abstract class BeamPythonFunctionRunner
extends Object
implements PythonFunctionRunner
A BeamPythonFunctionRunner used to execute Python functions.
-
Field Summary
Fields (modifier and type, field, description)
protected final FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor
protected static final org.slf4j.Logger LOG
protected org.apache.beam.sdk.fn.data.FnDataReceiver<org.apache.beam.sdk.util.WindowedValue<byte[]>> mainInputReceiver
    The receiver which forwards the input elements to a remote environment for processing.
protected final FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor
protected LinkedBlockingQueue<org.apache.flink.api.java.tuple.Tuple2<String,byte[]>> resultBuffer
    Buffers the Python function execution result which has still not been processed.
protected final Map<String,FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors
-
Constructor Summary
Constructors
BeamPythonFunctionRunner(org.apache.flink.runtime.execution.Environment environment, String taskName, ProcessPythonEnvironmentManager environmentManager, FlinkMetricContainer flinkMetricContainer, org.apache.flink.runtime.state.KeyedStateBackend<?> keyedStateBackend, org.apache.flink.runtime.state.OperatorStateBackend operatorStateBackend, org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer, org.apache.flink.api.common.typeutils.TypeSerializer<?> namespaceSerializer, TimerRegistration timerRegistration, org.apache.flink.runtime.memory.MemoryManager memoryManager, double managedMemoryFraction, FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor, FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor, Map<String, FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors)
-
Method Summary
Methods (modifier and type, method, description)
protected abstract void buildTransforms(org.apache.beam.model.pipeline.v1.RunnerApi.Components.Builder componentsBuilder)
void close()
    Tears down the Python function runner.
org.apache.beam.runners.fnexecution.control.JobBundleFactory createJobBundleFactory(org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.Struct pipelineOptions)
void drainUnregisteredTimers()
void flush()
    Forces the current bundle of elements to finish processing.
protected abstract Optional<org.apache.beam.model.pipeline.v1.RunnerApi.Coder> getOptionalTimerCoderProto()
protected abstract List<org.apache.beam.runners.core.construction.graph.TimerReference> getTimers(org.apache.beam.model.pipeline.v1.RunnerApi.Components components)
void notifyNoMoreResults()
    Interrupts the progress of takeResult.
void open(org.apache.flink.configuration.ReadableConfig config)
    Prepares the Python function runner, for example by preparing the Python execution environment.
pollResult()
    Retrieves the Python function result.
void process(byte[] data)
    Executes the Python function with the input byte array.
void processTimer(byte[] timerData)
    Sends the triggered timer to the Python function.
protected void startBundle()
takeResult()
    Retrieves the Python function result, waiting if necessary until an element becomes available.
-
Field Details
-
LOG
protected static final org.slf4j.Logger LOG -
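inputCoderDescriptor
protected final FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor
-
outputCoderDescriptor
protected final FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor
-
sideOutputCoderDescriptors
protected final Map<String,FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors
-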
resultBuffer
@VisibleForTesting protected transient LinkedBlockingQueue<org.apache.flink.api.java.tuple.Tuple2<String,byte[]>> resultBuffer
Buffers the Python function execution results that have not yet been processed.
-
mainInputReceiver
@VisibleForTesting protected transient org.apache.beam.sdk.fn.data.FnDataReceiver<org.apache.beam.sdk.util.WindowedValue<byte[]>> mainInputReceiver
The receiver which forwards the input elements to a remote environment for processing.
-
-
Constructor Details
-
BeamPythonFunctionRunner
public BeamPythonFunctionRunner(org.apache.flink.runtime.execution.Environment environment, String taskName, ProcessPythonEnvironmentManager environmentManager, @Nullable FlinkMetricContainer flinkMetricContainer, @Nullable org.apache.flink.runtime.state.KeyedStateBackend<?> keyedStateBackend, @Nullable org.apache.flink.runtime.state.OperatorStateBackend operatorStateBackend, @Nullable org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer, @Nullable org.apache.flink.api.common.typeutils.TypeSerializer<?> namespaceSerializer, @Nullable TimerRegistration timerRegistration, org.apache.flink.runtime.memory.MemoryManager memoryManager, double managedMemoryFraction, FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor, FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor, Map<String, FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors)
-
-
Method Details
-
open
public void open(org.apache.flink.configuration.ReadableConfig config) throws Exception
Description copied from interface: PythonFunctionRunner
Prepares the Python function runner, for example by preparing the Python execution environment.
- Specified by:
open in interface PythonFunctionRunner
- Throws:
Exception
-
close
public void close() throws Exception
Description copied from interface: PythonFunctionRunner
Tears down the Python function runner.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface PythonFunctionRunner
- Throws:
Exception
-
process
public void process(byte[] data) throws Exception
Description copied from interface: PythonFunctionRunner
Executes the Python function with the input byte array.
- Specified by:
process in interface PythonFunctionRunner
- Parameters:
data - the byte array data.
- Throws:
Exception
-
drainUnregisteredTimers
public void drainUnregisteredTimers()
- Specified by:
drainUnregisteredTimers in interface PythonFunctionRunner
-
processTimer
public void processTimer(byte[] timerData) throws Exception
Description copied from interface: PythonFunctionRunner
Sends the triggered timer to the Python function.
- Specified by:
processTimer in interface PythonFunctionRunner
- Throws:
Exception
-
startBundle
@VisibleForTesting protected void startBundle() -
pollResult
Description copied from interface: PythonFunctionRunner
Retrieves the Python function result.
- Specified by:
pollResult in interface PythonFunctionRunner
- Returns:
the head of the Python function result buffer, or null if the result buffer is empty. f0 means the byte array buffer which stores the Python function result. f1 means the length of the Python function result byte array.
- Throws:
Exception
-
takeResult
Description copied from interface: PythonFunctionRunner
Retrieves the Python function result, waiting if necessary until an element becomes available.
- Specified by:
takeResult in interface PythonFunctionRunner
- Returns:
the head of the Python function result buffer. f0 means the byte array buffer which stores the Python function result. f1 means the length of the Python function result byte array.
- Throws:
Exception
-
flush
public void flush() throws Exception
Description copied from interface: PythonFunctionRunner
Forces the current bundle of elements to finish processing. It flushes the data cached in the data buffer for processing and retrieves the state mutations (if any) made by the Python function. The call blocks until all of the outputs produced by this bundle have been received.
- Specified by:
flush in interface PythonFunctionRunner
- Throws:
Exception
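The call order implied by the method descriptions (open, process, flush, pollResult, close) can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: SketchRunner, UpperCaseRunner and RunnerLifecycleSketch are stand-ins written for this example, not Flink types, and the fake runner simply upper-cases its inputs at flush time instead of calling a Python process. Checked exceptions are omitted for brevity, whereas the documented methods declare throws Exception.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in mirroring the documented method names; not a Flink type. */
interface SketchRunner extends AutoCloseable {
    void open();
    void process(byte[] data);
    void flush();
    byte[] pollResult(); // returns null when the result buffer is empty
    @Override void close(); // narrowed so that try-with-resources needs no throws clause
}

/** In-memory fake: "executes" the function by upper-casing each input at flush time. */
final class UpperCaseRunner implements SketchRunner {
    private final List<byte[]> bundle = new ArrayList<>();
    private final LinkedBlockingQueue<byte[]> resultBuffer = new LinkedBlockingQueue<>();
    private boolean opened;

    @Override public void open() { opened = true; }

    @Override public void process(byte[] data) {
        if (!opened) {
            throw new IllegalStateException("open() must be called before process()");
        }
        bundle.add(data); // inputs are only cached until the bundle is flushed
    }

    @Override public void flush() {
        // Finish the current bundle: every cached input produces one buffered result.
        for (byte[] in : bundle) {
            String s = new String(in, StandardCharsets.UTF_8).toUpperCase(Locale.ROOT);
            resultBuffer.add(s.getBytes(StandardCharsets.UTF_8));
        }
        bundle.clear();
    }

    @Override public byte[] pollResult() { return resultBuffer.poll(); }

    @Override public void close() {
        opened = false;
        bundle.clear();
        resultBuffer.clear();
    }
}

final class RunnerLifecycleSketch {
    /** Drives one bundle through the fake runner and collects its results in order. */
    static List<String> run(List<String> inputs) {
        List<String> out = new ArrayList<>();
        try (SketchRunner runner = new UpperCaseRunner()) {
            runner.open();
            for (String in : inputs) {
                runner.process(in.getBytes(StandardCharsets.UTF_8));
            }
            runner.flush();
            byte[] result;
            while ((result = runner.pollResult()) != null) { // drain until empty
                out.add(new String(result, StandardCharsets.UTF_8));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a", "b"))); // prints [A, B]
    }
}
```

Because the class implements AutoCloseable, try-with-resources is a natural fit for guaranteeing the tear-down step even when processing fails part-way through a bundle.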
-
notifyNoMoreResults
public void notifyNoMoreResults()
Interrupts the progress of takeResult.
-
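How a blocking takeResult can be woken up is not described on this page. One common pattern, shown here purely as an assumption, is to enqueue a sentinel element that the consumer recognizes by identity. The class BlockingResultBufferSketch and its members are hypothetical and built only on java.util.concurrent.LinkedBlockingQueue; they do not reproduce the actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of a blocking result buffer; not Flink's implementation. */
final class BlockingResultBufferSketch {
    // Assumption: a sentinel compared by identity signals "no more results".
    private static final byte[] NO_MORE_RESULTS = new byte[0];

    private final LinkedBlockingQueue<byte[]> buffer = new LinkedBlockingQueue<>();

    void offerResult(byte[] result) { buffer.add(result); }

    /** Wakes up a consumer that is blocked in takeResult(). */
    void notifyNoMoreResults() { buffer.add(NO_MORE_RESULTS); }

    /** Blocks until an element is available; returns null once the sentinel is seen. */
    byte[] takeResult() throws InterruptedException {
        byte[] head = buffer.take();
        return head == NO_MORE_RESULTS ? null : head;
    }

    /** Runs a consumer thread against the buffer and reports what it observed. */
    static String demo() {
        BlockingResultBufferSketch sketch = new BlockingResultBufferSketch();
        StringBuilder seen = new StringBuilder();
        Thread consumer = new Thread(() -> {
            try {
                byte[] result;
                while ((result = sketch.takeResult()) != null) {
                    seen.append(new String(result, StandardCharsets.UTF_8));
                }
                seen.append("|done");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        sketch.offerResult("x".getBytes(StandardCharsets.UTF_8));
        sketch.offerResult("y".getBytes(StandardCharsets.UTF_8));
        sketch.notifyNoMoreResults(); // releases the consumer after the queued results
        try {
            consumer.join(); // join() makes the consumer's writes to 'seen' visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints xy|done
    }
}
```

The sentinel travels through the same FIFO queue as the results, so the consumer sees every result enqueued before the notification and only then stops waiting.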
buildTransforms
protected abstract void buildTransforms(org.apache.beam.model.pipeline.v1.RunnerApi.Components.Builder componentsBuilder) -
getTimers
protected abstract List<org.apache.beam.runners.core.construction.graph.TimerReference> getTimers(org.apache.beam.model.pipeline.v1.RunnerApi.Components components) -
getOptionalTimerCoderProto
protected abstract Optional<org.apache.beam.model.pipeline.v1.RunnerApi.Coder> getOptionalTimerCoderProto() -
createJobBundleFactory
@VisibleForTesting public org.apache.beam.runners.fnexecution.control.JobBundleFactory createJobBundleFactory(org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.Struct pipelineOptions) throws Exception
- Throws:
Exception
-