Class BeamPythonFunctionRunner

java.lang.Object
org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner
All Implemented Interfaces:
AutoCloseable, PythonFunctionRunner
Direct Known Subclasses:
BeamDataStreamPythonFunctionRunner, BeamTablePythonFunctionRunner

@Internal public abstract class BeamPythonFunctionRunner extends Object implements PythonFunctionRunner
An abstract PythonFunctionRunner, built on Apache Beam, that executes Python functions.
  • Field Details

    • LOG

      protected static final org.slf4j.Logger LOG
    • inputCoderDescriptor

      protected final FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor
    • outputCoderDescriptor

      protected final FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor
    • sideOutputCoderDescriptors

      protected final Map<String,FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors
    • resultBuffer

      @VisibleForTesting protected transient LinkedBlockingQueue<org.apache.flink.api.java.tuple.Tuple2<String,byte[]>> resultBuffer
      Buffers Python function execution results that have not yet been processed.
    • mainInputReceiver

      @VisibleForTesting protected transient org.apache.beam.sdk.fn.data.FnDataReceiver<org.apache.beam.sdk.util.WindowedValue<byte[]>> mainInputReceiver
      The receiver which forwards the input elements to a remote environment for processing.
  • Constructor Details

    • BeamPythonFunctionRunner

      public BeamPythonFunctionRunner(org.apache.flink.runtime.execution.Environment environment, String taskName, ProcessPythonEnvironmentManager environmentManager, @Nullable FlinkMetricContainer flinkMetricContainer, @Nullable org.apache.flink.runtime.state.KeyedStateBackend<?> keyedStateBackend, @Nullable org.apache.flink.runtime.state.OperatorStateBackend operatorStateBackend, @Nullable org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer, @Nullable org.apache.flink.api.common.typeutils.TypeSerializer<?> namespaceSerializer, @Nullable TimerRegistration timerRegistration, org.apache.flink.runtime.memory.MemoryManager memoryManager, double managedMemoryFraction, FlinkFnApi.CoderInfoDescriptor inputCoderDescriptor, FlinkFnApi.CoderInfoDescriptor outputCoderDescriptor, Map<String,FlinkFnApi.CoderInfoDescriptor> sideOutputCoderDescriptors)
  • Method Details

    • open

      public void open(org.apache.flink.configuration.ReadableConfig config) throws Exception
      Description copied from interface: PythonFunctionRunner
      Prepares the Python function runner, for example by setting up the Python execution environment.
      Specified by:
      open in interface PythonFunctionRunner
      Throws:
      Exception
    • close

      public void close() throws Exception
      Description copied from interface: PythonFunctionRunner
      Tears down the Python function runner.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface PythonFunctionRunner
      Throws:
      Exception
    • process

      public void process(byte[] data) throws Exception
      Description copied from interface: PythonFunctionRunner
      Executes the Python function with the input byte array.
      Specified by:
      process in interface PythonFunctionRunner
      Parameters:
      data - the byte array data.
      Throws:
      Exception
    • drainUnregisteredTimers

      public void drainUnregisteredTimers()
      Specified by:
      drainUnregisteredTimers in interface PythonFunctionRunner
    • processTimer

      public void processTimer(byte[] timerData) throws Exception
      Description copied from interface: PythonFunctionRunner
      Sends the triggered timer to the Python function.
      Specified by:
      processTimer in interface PythonFunctionRunner
      Throws:
      Exception
    • startBundle

      @VisibleForTesting protected void startBundle()
    • pollResult

      public org.apache.flink.api.java.tuple.Tuple3<String,byte[],Integer> pollResult() throws Exception
      Description copied from interface: PythonFunctionRunner
      Retrieves the Python function result.
      Specified by:
      pollResult in interface PythonFunctionRunner
      Returns:
      the head of the Python function result buffer, or null if the result buffer is empty. Per the Tuple3<String,byte[],Integer> return type, f1 is the byte array that stores the Python function result and f2 is the length of that result; f0 is a String, presumably identifying the output the result belongs to.
      Throws:
      Exception
    • takeResult

      public org.apache.flink.api.java.tuple.Tuple3<String,byte[],Integer> takeResult() throws Exception
      Description copied from interface: PythonFunctionRunner
      Retrieves the Python function result, waiting if necessary until an element becomes available.
      Specified by:
      takeResult in interface PythonFunctionRunner
      Returns:
      the head of the Python function result buffer. Per the Tuple3<String,byte[],Integer> return type, f1 is the byte array that stores the Python function result and f2 is the length of that result; f0 is a String, presumably identifying the output the result belongs to.
      Throws:
      Exception
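      The difference between pollResult (returns null when nothing is buffered) and takeResult (waits for an element) can be illustrated with a plain LinkedBlockingQueue, the same queue type as the resultBuffer field. This is a minimal sketch, not Flink code: ResultBufferSketch, Result, add and demo are hypothetical names, and the real methods return a Tuple3 rather than this simplified pair.

      ```java
      import java.util.concurrent.LinkedBlockingQueue;

      // Hypothetical stand-in for the runner's result buffer; not part of Flink.
      final class ResultBufferSketch {
          // Simplified pair standing in for Tuple2<String, byte[]>.
          static final class Result {
              final String outputId;
              final byte[] payload;

              Result(String outputId, byte[] payload) {
                  this.outputId = outputId;
                  this.payload = payload;
              }
          }

          private final LinkedBlockingQueue<Result> buffer = new LinkedBlockingQueue<>();

          // Called by whatever produces results (assumed: a separate thread).
          void add(String outputId, byte[] payload) {
              buffer.add(new Result(outputId, payload));
          }

          // Non-blocking: returns null immediately when the buffer is empty.
          Result pollResult() {
              return buffer.poll();
          }

          // Blocking: waits until an element becomes available.
          Result takeResult() throws InterruptedException {
              return buffer.take();
          }

          // Runs both calls once and reports what happened.
          static String demo() {
              ResultBufferSketch sketch = new ResultBufferSketch();
              StringBuilder log = new StringBuilder();
              // Nothing has been produced yet, so poll returns null.
              log.append(sketch.pollResult() == null ? "empty" : "unexpected");
              Thread producer = new Thread(() -> sketch.add("main", new byte[] {1, 2, 3}));
              producer.start();
              try {
                  // take() waits for the producer thread to add its element.
                  Result r = sketch.takeResult();
                  log.append(",").append(r.outputId).append(":").append(r.payload.length);
                  producer.join();
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                  throw new IllegalStateException(e);
              }
              return log.toString();
          }

          public static void main(String[] args) {
              System.out.println(demo());
          }
      }
      ```

      A caller that must not stall its own thread would use the polling form and handle null; the blocking form suits a dedicated consumer loop.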
    • flush

      public void flush() throws Exception
      Description copied from interface: PythonFunctionRunner
      Forces the current bundle of elements to finish processing. It flushes the data cached in the data buffer for processing and retrieves the state mutations (if any) made by the Python function. The call blocks until all of the outputs produced by this bundle have been received.
      Specified by:
      flush in interface PythonFunctionRunner
      Throws:
      Exception
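      The call order implied by the methods above (open, process, flush, pollResult, close) can be sketched with an in-memory stand-in. Everything here is an assumption made for illustration: RunnerLifecycleSketch is a hypothetical class, the upper-casing step merely stands in for the Python function, and results appear only at flush, which is a simplification rather than a statement about when the real runner emits output.

      ```java
      import java.nio.charset.StandardCharsets;
      import java.util.ArrayDeque;
      import java.util.ArrayList;
      import java.util.List;
      import java.util.Locale;
      import java.util.Queue;

      // Hypothetical in-memory stand-in for a function runner; not part of Flink.
      final class RunnerLifecycleSketch implements AutoCloseable {
          private final List<byte[]> pendingBundle = new ArrayList<>();
          private final Queue<byte[]> results = new ArrayDeque<>();
          private boolean open;

          void open() {
              open = true;
          }

          // Caches the element in the current bundle.
          void process(byte[] data) {
              if (!open) {
                  throw new IllegalStateException("runner is not open");
              }
              pendingBundle.add(data);
          }

          // Finishes the bundle: every cached element is "executed" and its output buffered.
          void flush() {
              for (byte[] element : pendingBundle) {
                  String text = new String(element, StandardCharsets.UTF_8);
                  results.add(text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
              }
              pendingBundle.clear();
          }

          // Returns the next buffered result, or null when none is left.
          byte[] pollResult() {
              return results.poll();
          }

          @Override
          public void close() {
              open = false;
              pendingBundle.clear();
              results.clear();
          }

          static List<String> demo() {
              List<String> out = new ArrayList<>();
              try (RunnerLifecycleSketch runner = new RunnerLifecycleSketch()) {
                  runner.open();
                  runner.process("a".getBytes(StandardCharsets.UTF_8));
                  runner.process("b".getBytes(StandardCharsets.UTF_8));
                  runner.flush();
                  byte[] result;
                  while ((result = runner.pollResult()) != null) {
                      out.add(new String(result, StandardCharsets.UTF_8));
                  }
              }
              return out;
          }

          public static void main(String[] args) {
              System.out.println(demo());
          }
      }
      ```

      Using try-with-resources mirrors the fact that the runner implements AutoCloseable, so close runs even if processing fails.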
    • notifyNoMoreResults

      public void notifyNoMoreResults()
      Interrupts a takeResult call that is waiting for a result.
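      One common way to release a consumer blocked on a queue is to enqueue a sentinel element. The sketch below shows that general technique only; it is an assumption that it resembles what this class does, and NoMoreResultsSketch, NO_MORE_RESULTS, addResult and demo are hypothetical names.

      ```java
      import java.util.concurrent.LinkedBlockingQueue;

      // Hypothetical illustration of the sentinel technique; not part of Flink.
      final class NoMoreResultsSketch {
          // Sentinel instance, recognized by identity rather than by content.
          private static final byte[] NO_MORE_RESULTS = new byte[0];

          private final LinkedBlockingQueue<byte[]> buffer = new LinkedBlockingQueue<>();

          void addResult(byte[] result) {
              buffer.add(result);
          }

          // Wakes up a consumer that is blocked in takeResult.
          void notifyNoMoreResults() {
              buffer.add(NO_MORE_RESULTS);
          }

          // Blocks until an element arrives; returns null once the sentinel is seen.
          byte[] takeResult() throws InterruptedException {
              byte[] result = buffer.take();
              return result == NO_MORE_RESULTS ? null : result;
          }

          // Counts the results consumed before the sentinel ends the loop.
          static int demo() {
              NoMoreResultsSketch sketch = new NoMoreResultsSketch();
              Thread producer = new Thread(() -> {
                  sketch.addResult(new byte[] {1});
                  sketch.addResult(new byte[] {2});
                  sketch.notifyNoMoreResults();
              });
              producer.start();
              int count = 0;
              try {
                  while (sketch.takeResult() != null) {
                      count++;
                  }
                  producer.join();
              } catch (InterruptedException e) {
                  Thread.currentThread().interrupt();
                  throw new IllegalStateException(e);
              }
              return count;
          }

          public static void main(String[] args) {
              System.out.println(demo());
          }
      }
      ```

      Without the sentinel, the consumer loop would wait indefinitely after the last real result.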
    • buildTransforms

      protected abstract void buildTransforms(org.apache.beam.model.pipeline.v1.RunnerApi.Components.Builder componentsBuilder)
    • getTimers

      protected abstract List<org.apache.beam.runners.core.construction.graph.TimerReference> getTimers(org.apache.beam.model.pipeline.v1.RunnerApi.Components components)
    • getOptionalTimerCoderProto

      protected abstract Optional<org.apache.beam.model.pipeline.v1.RunnerApi.Coder> getOptionalTimerCoderProto()
    • createJobBundleFactory

      @VisibleForTesting public org.apache.beam.runners.fnexecution.control.JobBundleFactory createJobBundleFactory(org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.Struct pipelineOptions) throws Exception
      Throws:
      Exception