Interface OpFusionCodegenSpec
-
Method Summary
Modifier and Type | Method | Description
 | doEndInputConsume(int inputId) | The endInput method does the cleanup work for the operator's corresponding input. For example, the HashAgg operator needs to flush data and the HashJoin build side needs to build its hash table, so each operator implements its cleanup logic in this method.
void | doEndInputProduce(CodeGeneratorContext codegenCtx) | Generates the Java source code that does the operator's cleanup work. Only the leaf operator in the operator DAG needs to generate this code; middle operators normally just call `endInputProduce` on their input, unless the operator has some specific logic.
 | doProcessConsume(int inputId, List<GeneratedExpression> inputVars, GeneratedExpression row) | The process method is responsible for the operator's data processing logic, so each operator needs to implement this method to generate the code that processes a row.
void | doProcessProduce(CodeGeneratorContext codegenCtx) | Generates the Java source code to process rows. Only the leaf operator in the operator DAG needs to generate the code that produces rows; middle operators normally just call OpFusionCodegenSpecGenerator.processProduce(CodeGeneratorContext) on their input, unless the operator has some specific logic.
CodeGeneratorContext | getCodeGeneratorContext() | Every operator needs one CodeGeneratorContext to store the context needed during operator fusion codegen.
ExprCodeGenerator | getExprCodeGenerator() | Gets the ExprCodeGenerator used by this operator during operator fusion codegen.
Class<? extends org.apache.flink.table.data.RowData> | getInputRowDataClass(int inputId) | The RowData type that the current operator needs for the given inputId. This is used to notify the upstream operator to wrap rows in the proper RowData class before calling the doProcessConsume method.
void | setup(OpFusionContext opFusionContext) | Initializes the operator spec.
 | usedInputColumns(int inputId) | The subset of column indexes that should be evaluated before this operator.
String | variablePrefix() | Prefix used in the current operator's variable names.
-
Method Details
-
setup
Initializes the operator spec. Sets access to the context. This method must be called before doProduce and doConsume related methods. -
variablePrefix
String variablePrefix() Prefix used in the current operator's variable names. -
usedInputColumns
The subset of column indexes that should be evaluated before this operator. This is used to insert code that accesses the columns actually used by the current operator before doProcessConsume() is called.
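The idea of evaluating only the used columns can be sketched as follows. This is a minimal illustration, not Flink's implementation: the class `UsedColumnsSketch`, the helper `genFieldAccess`, the `field_N` naming scheme, and the assumption that every column is an `int` read through `getInt` are all hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: emits field-access code only for the columns an operator uses.
class UsedColumnsSketch {

    // Returns one generated local-variable declaration per used column index.
    // Columns that are not listed in usedInputColumns get no access code at all.
    static String genFieldAccess(String rowTerm, Set<Integer> usedInputColumns) {
        StringBuilder code = new StringBuilder();
        // Sort the indexes so the generated code is deterministic.
        for (int idx : new TreeSet<>(usedInputColumns)) {
            code.append("int field_").append(idx)
                .append(" = ").append(rowTerm)
                .append(".getInt(").append(idx).append(");\n");
        }
        return code.toString();
    }

    public static void main(String[] args) {
        // Assumption: the operator reads only columns 0 and 2 of its input row.
        System.out.print(genFieldAccess("in1", Set.of(0, 2)));
    }
}
```

In this sketch, a column that the operator never reads produces no generated statement, which is the saving the used-column subset is meant to enable.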
-
getInputRowDataClass
The RowData type that the current operator needs for the given inputId. This is used to notify the upstream operator to wrap rows in the proper RowData class before calling the doProcessConsume method. For example, the HashJoin build side needs BinaryRowData. -
getCodeGeneratorContext
CodeGeneratorContext getCodeGeneratorContext() Every operator needs one CodeGeneratorContext to store the context needed during operator fusion codegen. -
getExprCodeGenerator
ExprCodeGenerator getExprCodeGenerator() Gets the ExprCodeGenerator used by this operator during operator fusion codegen. -
doProcessProduce
void doProcessProduce(CodeGeneratorContext codegenCtx) Generates the Java source code to process rows. Only the leaf operator in the operator DAG needs to generate the code that produces rows; middle operators normally just call OpFusionCodegenSpecGenerator.processProduce(CodeGeneratorContext) on their input, unless the operator has some specific logic. The leaf operator produces the row first and then calls the OpFusionContext.processConsume(List) method to consume it. The code generated by the leaf operator is saved in fusionCtx, so this method has no return value.
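The produce/consume flow described here can be illustrated with a small stand-alone model. It is only a sketch under stated assumptions: `Ctx`, `Op`, `Leaf`, `Filter`, and `Sink` are hypothetical stand-ins rather than Flink classes, every operator has a single input and output, and generation is assumed to start at the last operator in the chain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the produce/consume pattern; none of these types are Flink's.
class ProduceConsumeSketch {

    /** Stand-in for a codegen context: it only collects generated statements. */
    static class Ctx {
        final List<String> body = new ArrayList<>();
    }

    /** Stand-in for one operator in a linear fusion chain. */
    abstract static class Op {
        Op input;   // upstream operator, null for the leaf
        Op output;  // downstream operator, null for the last operator

        // Middle operators generate nothing here; they delegate to their input.
        void processProduce(Ctx ctx) {
            input.processProduce(ctx);
        }

        // Returns the generated code that handles one row.
        abstract String processConsume(String row);

        // Asks the downstream operator, if any, for its row-handling code.
        String consumeDownstream(String row) {
            return output == null ? "" : output.processConsume(row);
        }
    }

    static class Leaf extends Op {
        @Override
        void processProduce(Ctx ctx) {
            // The leaf produces the row first, then lets downstream operators consume it.
            // The result is stored in the context, which is why nothing is returned.
            ctx.body.add("while (reader.hasNext()) { Row r = reader.next(); "
                    + consumeDownstream("r") + "}");
        }

        @Override
        String processConsume(String row) {
            throw new UnsupportedOperationException("a leaf has no input to consume");
        }
    }

    static class Filter extends Op {
        @Override
        String processConsume(String row) {
            return "if (" + row + ".getInt(0) > 0) { " + consumeDownstream(row) + "} ";
        }
    }

    static class Sink extends Op {
        @Override
        String processConsume(String row) {
            return "collect(" + row + "); ";
        }
    }

    // Wires Leaf -> Filter -> Sink and returns the fused code.
    static String generate() {
        Leaf leaf = new Leaf();
        Filter filter = new Filter();
        Sink sink = new Sink();
        leaf.output = filter;
        filter.input = leaf;
        filter.output = sink;
        sink.input = filter;

        Ctx ctx = new Ctx();
        sink.processProduce(ctx); // delegates upstream until the leaf emits code
        return String.join("\n", ctx.body);
    }

    public static void main(String[] args) {
        System.out.println(generate());
    }
}
```

The produce calls travel upstream to the leaf, while the consume calls travel downstream from it, so the three operators end up in one generated loop.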
-
doProcessConsume
The process method is responsible for the operator's data processing logic, so each operator needs to implement this method to generate the code that processes a row. It should only be called from OpFusionCodegenSpecGenerator.processConsume(List, String). Note: An operator can consume rows either as RowData (row) or as a list of variables (inputVars).
- Parameters:
inputId - The input ID, numbered starting from 1; `1` indicates the first input.
inputVars - field variables of the current input.
row - row variable of the current input.
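A two-input operator might dispatch on the 1-based inputId and use either form of the input, roughly as sketched below. The sketch simplifies the parameters to plain strings instead of GeneratedExpression, and the generated statements (`hashTable.put`, `probe`) and the choice of which input is the build side are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of a join-like operator that consumes its two inputs differently.
class ProcessConsumeSketch {

    // inputId is 1-based: 1 is the first input, 2 is the second.
    static String doProcessConsume(int inputId, List<String> inputVars, String row) {
        switch (inputId) {
            case 1:
                // Assumed build side: consumes the input as a whole row.
                return "hashTable.put(" + row + ");";
            case 2:
                // Assumed probe side: consumes the input as individual field variables.
                return "probe(" + String.join(", ", inputVars) + ");";
            default:
                throw new IllegalArgumentException("Unexpected inputId: " + inputId);
        }
    }

    public static void main(String[] args) {
        System.out.println(doProcessConsume(1, List.of("f0", "f1"), "buildRow"));
        System.out.println(doProcessConsume(2, List.of("f0", "f1"), "probeRow"));
    }
}
```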
-
doEndInputProduce
void doEndInputProduce(CodeGeneratorContext codegenCtx) Generates the Java source code that does the operator's cleanup work. Only the leaf operator in the operator DAG needs to generate this code; middle operators normally just call `endInputProduce` on their input, unless the operator has some specific logic. The code generated by the leaf operator is saved in fusionCtx, so this method has no return value.
-
doEndInputConsume
The endInput method does the cleanup work for the operator's corresponding input. For example, the HashAgg operator needs to flush data and the HashJoin build side needs to build its hash table, so each operator implements its cleanup logic in this method. For blocking operators such as HashAgg, the OpFusionContext.processConsume(List, String) method needs to be called first to consume the data, followed by the `endInputConsume` method to do the cleanup work of the downstream operators. For pipeline operators such as Project, you only need to call the `endInputConsume` method.
- Parameters:
inputId - The input ID, numbered starting from 1; `1` indicates the first input.
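The difference between blocking and pipeline operators at end of input can be sketched as below. This is an illustration only: the `Downstream` interface stands in for the fusion context, and the generated statements (`buffered`, `emit`, `close`) are hypothetical.

```java
// Hypothetical sketch contrasting end-of-input handling for blocking and pipeline operators.
class EndInputSketch {

    /** Stand-in for the downstream side of the fusion context. */
    interface Downstream {
        String processConsume(String row); // code that consumes one row downstream
        String endInputConsume();          // code that runs downstream cleanup
    }

    // Blocking operator (HashAgg-like): emit the buffered rows downstream first,
    // then trigger the downstream cleanup.
    static String blockingEndInput(Downstream out) {
        return "for (Row r : buffered) { " + out.processConsume("r") + " }\n"
                + out.endInputConsume();
    }

    // Pipeline operator (Project-like): nothing is buffered, so only propagate cleanup.
    static String pipelineEndInput(Downstream out) {
        return out.endInputConsume();
    }

    public static void main(String[] args) {
        Downstream out = new Downstream() {
            @Override
            public String processConsume(String row) {
                return "emit(" + row + ");";
            }

            @Override
            public String endInputConsume() {
                return "close();";
            }
        };
        System.out.println(blockingEndInput(out));
        System.out.println(pipelineEndInput(out));
    }
}
```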
-