Class FlinkRuntimeFilterProgram

java.lang.Object
org.apache.flink.table.planner.plan.optimize.program.FlinkRuntimeFilterProgram

public class FlinkRuntimeFilterProgram extends Object
Planner program that tries to inject runtime filters into suitable joins to improve join performance.

We build the runtime filter in two phases. First, each subtask on the build side builds a local filter from its local data and sends it to a global aggregation node. Then the global aggregation node merges the received local filters into a single global filter and sends that global filter to all probe-side subtasks. To implement this, the program adds BatchPhysicalLocalRuntimeFilterBuilder, BatchPhysicalGlobalRuntimeFilterBuilder and BatchPhysicalRuntimeFilter nodes to the physical plan.
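
The two phases described above can be illustrated with a minimal, standalone sketch. It is not Flink's implementation: the class name TwoPhaseRuntimeFilterSketch, its methods, the filter size, and the use of a Bloom-filter-style java.util.BitSet are all assumptions made for illustration, since this page does not state which filter structure the operators use.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical illustration of a two-phase (local, then global) filter build. */
public class TwoPhaseRuntimeFilterSketch {

    // Assumed filter size in bits; a real planner would size this from statistics.
    static final int NUM_BITS = 1 << 12;

    /** Phase 1: one build-side subtask builds a filter from its local join keys. */
    static BitSet buildLocalFilter(long[] localKeys) {
        BitSet bits = new BitSet(NUM_BITS);
        for (long key : localKeys) {
            bits.set(hash1(key));
            bits.set(hash2(key));
        }
        return bits;
    }

    /** Phase 2: the aggregation node merges all local filters with a bitwise OR. */
    static BitSet buildGlobalFilter(List<BitSet> localFilters) {
        BitSet global = new BitSet(NUM_BITS);
        for (BitSet local : localFilters) {
            global.or(local);
        }
        return global;
    }

    /** Probe side: false means the key is definitely absent from the build side. */
    static boolean mightContain(BitSet filter, long key) {
        return filter.get(hash1(key)) && filter.get(hash2(key));
    }

    // Two simple, arbitrary hash functions chosen only for this sketch.
    static int hash1(long key) {
        return Math.floorMod(Long.hashCode(key * 0x9E3779B97F4A7C15L), NUM_BITS);
    }

    static int hash2(long key) {
        return Math.floorMod(
                Long.hashCode(Long.rotateLeft(key, 31) * 0xC2B2AE3D27D4EB4FL), NUM_BITS);
    }

    public static void main(String[] args) {
        // Two build-side subtasks, each seeing part of the dim table's join keys (x).
        BitSet local1 = buildLocalFilter(new long[] {1L, 2L, 3L});
        BitSet local2 = buildLocalFilter(new long[] {3L, 4L});
        BitSet global = buildGlobalFilter(Arrays.asList(local1, local2));

        // Probe-side rows (fact.a) are dropped before the join if the filter rejects them.
        for (long a : new long[] {2L, 4L, 99L}) {
            System.out.println("a=" + a + " passes filter: " + mightContain(global, a));
        }
    }
}
```

Because the global filter is the union of the local ones, a key seen by any build-side subtask always passes the probe-side check. Keys that never appeared on the build side are usually rejected, but may occasionally pass as false positives, which the join itself then eliminates.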

For example, for the following query:

SELECT * FROM fact, dim WHERE x = a AND z = 2

The original physical plan:


 Calc(select=[a, b, c, x, y, CAST(2 AS BIGINT) AS z])
 +- HashJoin(joinType=[InnerJoin], where=[=(x, a)], select=[a, b, c, x, y], build=[right])
    :- Exchange(distribution=[hash[a]])
    :  +- TableSourceScan(table=[[fact]], fields=[a, b, c])
    +- Exchange(distribution=[hash[x]])
       +- Calc(select=[x, y], where=[=(z, 2)])
          +- TableSourceScan(table=[[dim, filter=[]]], fields=[x, y, z])
 

The optimized physical plan:


 Calc(select=[a, b, c, x, y, CAST(2 AS BIGINT) AS z])
 +- HashJoin(joinType=[InnerJoin], where=[=(x, a)], select=[a, b, c, x, y], build=[right])
    :- Exchange(distribution=[hash[a]])
    :  +- RuntimeFilter(select=[a])
    :     :- Exchange(distribution=[broadcast])
    :     :  +- GlobalRuntimeFilterBuilder
    :     :     +- Exchange(distribution=[single])
    :     :        +- LocalRuntimeFilterBuilder(select=[x])
    :     :           +- Calc(select=[x, y], where=[=(z, 2)])
    :     :              +- TableSourceScan(table=[[dim, filter=[]]], fields=[x, y, z])
    :     +- TableSourceScan(table=[[fact]], fields=[a, b, c])
    +- Exchange(distribution=[hash[x]])
       +- Calc(select=[x, y], where=[=(z, 2)])
          +- TableSourceScan(table=[[dim, filter=[]]], fields=[x, y, z])

 
  • Constructor Details

    • FlinkRuntimeFilterProgram

      public FlinkRuntimeFilterProgram()
  • Method Details

    • optimize

      public org.apache.calcite.rel.RelNode optimize(org.apache.calcite.rel.RelNode root, BatchOptimizeContext context)
    • isSuitableJoinType

      public static boolean isSuitableJoinType(org.apache.calcite.rel.core.JoinRelType joinType)