Class AbstractPythonScalarFunctionOperator

java.lang.Object
org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
All Implemented Interfaces:
Serializable, org.apache.flink.api.common.state.CheckpointListener, org.apache.flink.streaming.api.operators.BoundedOneInput, org.apache.flink.streaming.api.operators.Input<org.apache.flink.table.data.RowData>, org.apache.flink.streaming.api.operators.KeyContext, org.apache.flink.streaming.api.operators.KeyContextHandler, org.apache.flink.streaming.api.operators.OneInputStreamOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>, org.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>, org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.CheckpointedStreamOperator, org.apache.flink.streaming.api.operators.YieldingOperator<org.apache.flink.table.data.RowData>
Direct Known Subclasses:
ArrowPythonScalarFunctionOperator, PythonScalarFunctionOperator

@Internal public abstract class AbstractPythonScalarFunctionOperator extends AbstractStatelessFunctionOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>
Base class for all stream operators that execute Python ScalarFunctions. The Python ScalarFunctions run in a separate Python execution environment.

The input rows are assumed to have the following format:

+------------------+--------------+
| forwarded fields | extra fields |
+------------------+--------------+

The Python UDFs may take their inputs directly from columns of the input row or from the execution results of Java UDFs:
1) Columns of the input row are referenced through the 'forwarded fields'.
2) Java UDFs are computed first, and their execution results are referenced through the 'extra fields'.

The output rows have the following format:

+------------------+-------------------------+
| forwarded fields | scalar function results |
+------------------+-------------------------+
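The input and output row layouts described above can be illustrated with a minimal sketch. It uses plain `Object[]` arrays in place of Flink's `RowData`, and it assumes the forwarded fields are the leading columns of the input row, as the diagram suggests. The class `RowLayoutSketch` and its methods are hypothetical names chosen for illustration and are not part of the Flink API.

```java
import java.util.Arrays;

// Illustrative only: plain arrays stand in for RowData; names are hypothetical.
public class RowLayoutSketch {

    // Takes the leading 'forwarded fields' out of an input row that is laid out
    // as | forwarded fields | extra fields |.
    public static Object[] forwardedPart(Object[] input, int forwardedCount) {
        return Arrays.copyOfRange(input, 0, forwardedCount);
    }

    // Builds an output row laid out as | forwarded fields | scalar function results |.
    public static Object[] join(Object[] forwarded, Object[] results) {
        Object[] out = Arrays.copyOf(forwarded, forwarded.length + results.length);
        System.arraycopy(results, 0, out, forwarded.length, results.length);
        return out;
    }

    public static void main(String[] args) {
        // Two forwarded fields (1, "a") and one extra field (10), e.g. a Java UDF result.
        Object[] input = {1, "a", 10};
        Object[] forwarded = forwardedPart(input, 2);

        // Stand-in for the value a Python scalar function would return.
        Object[] results = {20};

        System.out.println(Arrays.toString(join(forwarded, results)));
    }
}
```

In the real operator the extra fields are consumed as function arguments and do not appear in the output, which is why only the forwarded part is joined with the results here.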

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected org.apache.flink.table.data.utils.JoinedRowData
    reuseJoinedRow
    The reused JoinedRowData that holds the execution result.
    protected StreamRecordRowDataWrappingCollector
    rowDataWrapper
    The collector used to collect records.
    protected final org.apache.flink.table.functions.python.PythonFunctionInfo[]
    scalarFunctions
    The Python ScalarFunctions to be executed.

    Fields inherited from class org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator

    bais, baisWrapper, baos, baosWrapper, forwardedInputQueue, inputType, udfInputType, udfOutputType

    Fields inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator

    pythonFunctionRunner

    Fields inherited from class org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator

    bundleFinishedCallback, config, elementCount, lastFinishBundleTime, maxBundleSize, systemEnvEnabled

    Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator

    combinedWatermark, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
  • Constructor Summary

    Constructors
    Constructor
    Description
    AbstractPythonScalarFunctionOperator(org.apache.flink.configuration.Configuration config, org.apache.flink.table.functions.python.PythonFunctionInfo[] scalarFunctions, org.apache.flink.table.types.logical.RowType inputType, org.apache.flink.table.types.logical.RowType udfInputType, org.apache.flink.table.types.logical.RowType udfOutputType, org.apache.flink.table.runtime.generated.GeneratedProjection udfInputGeneratedProjection, org.apache.flink.table.runtime.generated.GeneratedProjection forwardedFieldGeneratedProjection)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    bufferInput(org.apache.flink.table.data.RowData input)
    Buffers the specified input, which is later combined with the user-defined function execution result to construct the operator result.
    FlinkFnApi.UserDefinedFunctions
    createUserDefinedFunctionsProto()
    Gets the proto representation of the Python user-defined functions to be executed.
    org.apache.flink.table.data.RowData
    getFunctionInput(org.apache.flink.table.data.RowData element)
     
    String
    getFunctionUrn()
     
    org.apache.flink.table.functions.python.PythonEnv
    getPythonEnv()
    Returns the PythonEnv used to create PythonEnvironmentManager.
    void
    open()
     

    Methods inherited from class org.apache.flink.table.runtime.operators.python.AbstractOneInputPythonFunctionOperator

    endInput

    Methods inherited from class org.apache.flink.streaming.api.operators.python.process.AbstractExternalPythonFunctionOperator

    close, createPythonEnvironmentManager, drainUnregisteredTimers, emitResult, emitResults, invokeFinishBundle

    Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator

    beforeInitializeStateHandler, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, initializeState, isAsyncStateProcessingEnabled, isUsingCustomRawKeyedState, notifyCheckpointAborted, notifyCheckpointComplete, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark1, processWatermark1, processWatermark2, processWatermark2, processWatermarkStatus, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, setup, snapshotState, snapshotState, useSplittableTimers

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener

    notifyCheckpointAborted, notifyCheckpointComplete

    Methods inherited from interface org.apache.flink.streaming.api.operators.Input

    processLatencyMarker, processRecordAttributes, processWatermark, processWatermark, processWatermarkStatus

    Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext

    getCurrentKey, setCurrentKey

    Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler

    hasKeyContext

    Methods inherited from interface org.apache.flink.streaming.api.operators.OneInputStreamOperator

    setKeyContextElement

    Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator

    close, finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
  • Field Details

    • scalarFunctions

      protected final org.apache.flink.table.functions.python.PythonFunctionInfo[] scalarFunctions
      The Python ScalarFunctions to be executed.
    • rowDataWrapper

      protected transient StreamRecordRowDataWrappingCollector rowDataWrapper
      The collector used to collect records.
    • reuseJoinedRow

      protected transient org.apache.flink.table.data.utils.JoinedRowData reuseJoinedRow
      The reused JoinedRowData that holds the execution result.
  • Constructor Details

    • AbstractPythonScalarFunctionOperator

      public AbstractPythonScalarFunctionOperator(org.apache.flink.configuration.Configuration config, org.apache.flink.table.functions.python.PythonFunctionInfo[] scalarFunctions, org.apache.flink.table.types.logical.RowType inputType, org.apache.flink.table.types.logical.RowType udfInputType, org.apache.flink.table.types.logical.RowType udfOutputType, org.apache.flink.table.runtime.generated.GeneratedProjection udfInputGeneratedProjection, org.apache.flink.table.runtime.generated.GeneratedProjection forwardedFieldGeneratedProjection)
  • Method Details

    • open

      public void open() throws Exception
      Specified by:
      open in interface org.apache.flink.streaming.api.operators.StreamOperator<org.apache.flink.table.data.RowData>
      Overrides:
      open in class AbstractStatelessFunctionOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>
      Throws:
      Exception
    • getPythonEnv

      public org.apache.flink.table.functions.python.PythonEnv getPythonEnv()
      Description copied from class: AbstractExternalPythonFunctionOperator
      Returns the PythonEnv used to create PythonEnvironmentManager.
      Specified by:
      getPythonEnv in class AbstractExternalPythonFunctionOperator<org.apache.flink.table.data.RowData>
    • createUserDefinedFunctionsProto

      public FlinkFnApi.UserDefinedFunctions createUserDefinedFunctionsProto()
      Gets the proto representation of the Python user-defined functions to be executed.
      Specified by:
      createUserDefinedFunctionsProto in class AbstractStatelessFunctionOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>
    • getFunctionUrn

      public String getFunctionUrn()
      Specified by:
      getFunctionUrn in class AbstractStatelessFunctionOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>
    • bufferInput

      public void bufferInput(org.apache.flink.table.data.RowData input)
      Description copied from class: AbstractStatelessFunctionOperator
      Buffers the specified input, which is later combined with the user-defined function execution result to construct the operator result.
      Specified by:
      bufferInput in class AbstractStatelessFunctionOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>
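The buffering idea described for bufferInput can be sketched as a FIFO queue of forwarded fields that is drained as function results arrive. This is a simplified illustration and not the operator's actual code: it uses `Object[]` instead of `RowData`, it assumes results come back in the same order the inputs were buffered, and the class `BufferAndJoinSketch` with its `onResult` and `emitted` methods is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Illustrative only: a queue of forwarded fields joined with results in arrival order.
public class BufferAndJoinSketch {

    // Forwarded fields waiting for their function result (oldest first).
    private final Queue<Object[]> forwardedQueue = new ArrayDeque<>();

    // Output rows produced so far.
    private final List<Object[]> emitted = new ArrayList<>();

    // Remembers the forwarded fields of an input row until its result is available.
    public void bufferInput(Object[] forwarded) {
        forwardedQueue.add(forwarded);
    }

    // Pairs the next function result with the oldest buffered input (assumed ordering)
    // and emits | forwarded fields | scalar function results |.
    public void onResult(Object[] udfResult) {
        Object[] forwarded = forwardedQueue.remove();
        Object[] out = Arrays.copyOf(forwarded, forwarded.length + udfResult.length);
        System.arraycopy(udfResult, 0, out, forwarded.length, udfResult.length);
        emitted.add(out);
    }

    public List<Object[]> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        BufferAndJoinSketch op = new BufferAndJoinSketch();
        op.bufferInput(new Object[] {1, "a"});
        op.bufferInput(new Object[] {2, "b"});
        op.onResult(new Object[] {20});
        op.onResult(new Object[] {40});
        for (Object[] row : op.emitted()) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Buffering lets the operator send only the function arguments to the Python process while keeping the forwarded fields on the Java side until the matching result is ready.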
    • getFunctionInput

      public org.apache.flink.table.data.RowData getFunctionInput(org.apache.flink.table.data.RowData element)
      Specified by:
      getFunctionInput in class AbstractStatelessFunctionOperator<org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData,org.apache.flink.table.data.RowData>