Class ScanReuser

java.lang.Object
org.apache.flink.table.planner.plan.reuse.ScanReuser

public class ScanReuser extends Object
Reuse sources.

When projection or metadata push-down is applied, the generated sources cannot be reused because their digests differ. To make such sources reusable, this class does the following:

  • Find instances of the same source, ignoring their projection and metadata push-down.
  • Union the projections of the different instances of the same source and create a new instance.
  • Generate a separate Calc node for each original instance.
  • Replace the original instances with these Calc nodes over the new instance.

For example, the following plan:


 Calc(select=[a, b, c])
 +- Join(joinType=[InnerJoin], where=[(a = a0)], select=[a, b, a0, c])
    :- Exchange(distribution=[hash[a]])
    :  +- TableSourceScan(table=[[MyTable, project=[a, b]]], fields=[a, b])
    +- Exchange(distribution=[hash[a]])
       +- TableSourceScan(table=[[MyTable, project=[a, c]]], fields=[a, c])
 

Unified to:


 Calc(select=[a, b, c])
 +- Join(joinType=[InnerJoin], where=[(a = a0)], select=[a, b, a0, c])
    :- Exchange(distribution=[hash[a]])
    :  +- Calc(select=[a, b])
    :     +- TableSourceScan(table=[[MyTable, project=[a, b, c]]], fields=[a, b, c])
    +- Exchange(distribution=[hash[a]])
       +- Calc(select=[a, c])
          +- TableSourceScan(table=[[MyTable, project=[a, b, c]]], fields=[a, b, c])
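
The union-and-reproject steps listed above can be illustrated with a minimal, self-contained sketch. It is not the ScanReuser implementation: the class ProjectionUnionSketch and its methods unionProjections and calcIndices are hypothetical names, and fields are modelled as plain strings rather than planner nodes.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of the Flink planner API.
public class ProjectionUnionSketch {

    /** Unions the projected fields of all scans, preserving first-seen order. */
    public static List<String> unionProjections(List<List<String>> projections) {
        Set<String> union = new LinkedHashSet<>();
        for (List<String> projection : projections) {
            union.addAll(projection);
        }
        return new ArrayList<>(union);
    }

    /** Returns the indices that a Calc over the unified scan must select
     *  to reproduce the fields of one original scan. */
    public static List<Integer> calcIndices(List<String> unified, List<String> original) {
        List<Integer> indices = new ArrayList<>();
        for (String field : original) {
            indices.add(unified.indexOf(field));
        }
        return indices;
    }

    public static void main(String[] args) {
        // The two projections from the example plan.
        List<List<String>> scans = List.of(List.of("a", "b"), List.of("a", "c"));
        List<String> unified = unionProjections(scans);
        System.out.println("unified scan fields: " + unified);
        for (List<String> scan : scans) {
            System.out.println("Calc over " + scan + " reads indices " + calcIndices(unified, scan));
        }
    }
}
```

For the two projections in the example, the unified field list is [a, b, c] and the two Calc nodes read indices [0, 1] and [0, 2], which corresponds to Calc(select=[a, b]) and Calc(select=[a, c]) in the unified plan.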
 

This class does not reuse all sources; sources with the same digest are reused by SubplanReuser.

NOTE: This class does not optimize overlapping expressions such as "$0.child" and "$0"; it keeps both, whereas PushProjectIntoTableSourceScanRule reduces them to the single projection "$0". Merging them here would make the subsequent rewrite for watermark push-down very troublesome: it would require not only adjusting the indices but also generating getters for the nested fields. Connectors must therefore handle projections that contain both "$0.child" and "$0".
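
To make the connector-side consequence concrete, the sketch below shows one way a reader could evaluate overlapping projection paths against a nested row. Everything in it is an assumption for illustration: the class NestedProjectionSketch is hypothetical, rows are modelled as Object[] arrays, each projection is an index path in an int[][] (similar in shape to the paths passed to projection push-down), and "child" is assumed to be field 0 of the nested row.

```java
import java.util.Arrays;

// Hypothetical illustration only; not a Flink connector API.
public class NestedProjectionSketch {

    /** Follows one index path (for example {0, 0} for "$0.child") into a nested row. */
    public static Object project(Object[] row, int[] path) {
        Object current = row;
        for (int index : path) {
            current = ((Object[]) current)[index];
        }
        return current;
    }

    /** Applies every path independently, so overlapping paths such as
     *  "$0.child" and "$0" both appear in the output. */
    public static Object[] projectAll(Object[] row, int[][] paths) {
        Object[] out = new Object[paths.length];
        for (int i = 0; i < paths.length; i++) {
            out[i] = project(row, paths[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        Object[] nested = {"child-value", 42};   // assumed layout: child is field 0
        Object[] row = {nested, "other"};
        int[][] paths = {{0, 0}, {0}};           // "$0.child" and "$0"
        Object[] projected = projectAll(row, paths);
        System.out.println(projected[0]);
        System.out.println(Arrays.toString((Object[]) projected[1]));
    }
}
```

Evaluating each path independently, rather than deduplicating overlapping ones, keeps the output positions aligned with the requested projection list.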

  • Constructor Details

    • ScanReuser

      public ScanReuser(FlinkContext flinkContext, FlinkTypeFactory flinkTypeFactory)
  • Method Details

    • reuseDuplicatedScan

      public List<org.apache.calcite.rel.RelNode> reuseDuplicatedScan(List<org.apache.calcite.rel.RelNode> relNodes)