Class AsyncIntervalJoinOperator<K,T1,T2,OUT>

Type Parameters:
K - The type of the key based on which we join elements.
T1 - The type of the elements in the left stream.
T2 - The type of the elements in the right stream.
OUT - The output type created by the user-defined function.
All Implemented Interfaces:
Serializable, org.apache.flink.api.common.state.CheckpointListener, KeyContext, KeyContextHandler, org.apache.flink.streaming.api.operators.OutputTypeConfigurable<OUT>, StreamOperator<OUT>, StreamOperatorStateHandler.CheckpointedStreamOperator, Triggerable<K,String>, TwoInputStreamOperator<T1,T2,OUT>, UserFunctionProvider<ProcessJoinFunction<T1,T2,OUT>>, YieldingOperator<OUT>, AsyncStateProcessing, AsyncStateProcessingOperator

@Internal public class AsyncIntervalJoinOperator<K,T1,T2,OUT> extends AbstractAsyncStateUdfStreamOperator<OUT,ProcessJoinFunction<T1,T2,OUT>> implements TwoInputStreamOperator<T1,T2,OUT>, Triggerable<K,String>
An operator to execute time-bounded stream inner joins. This is the async state access version of IntervalJoinOperator.

Given a configurable lower and upper bound, this operator emits exactly those pairs (t1, t2) where t2.ts ∈ [t1.ts + lowerBound, t1.ts + upperBound]. Here t1 is an element of the left stream, t2 is an element of the right stream, and ts is the element's timestamp. Both the lower and the upper bound can be configured to be either inclusive or exclusive.
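The join condition above can be written as a small predicate. The following is a minimal sketch, not the operator's actual code: the class name IntervalBoundsSketch and the method withinBounds are hypothetical, and the arguments mirror the constructor parameters lowerBound, upperBound, lowerBoundInclusive and upperBoundInclusive.

```java
// Hypothetical helper for illustration only; it is not part of Flink.
final class IntervalBoundsSketch {

    /**
     * Returns true if a right element with timestamp rightTs may be joined with
     * a left element with timestamp leftTs under the given bounds.
     * Overflow of the subtraction is ignored in this sketch.
     */
    static boolean withinBounds(
            long leftTs,
            long rightTs,
            long lowerBound,
            long upperBound,
            boolean lowerBoundInclusive,
            boolean upperBoundInclusive) {
        long diff = rightTs - leftTs;
        boolean aboveLower = lowerBoundInclusive ? diff >= lowerBound : diff > lowerBound;
        boolean belowUpper = upperBoundInclusive ? diff <= upperBound : diff < upperBound;
        return aboveLower && belowUpper;
    }

    public static void main(String[] args) {
        // Right element exactly on the upper bound: joined only if the bound is inclusive.
        System.out.println(withinBounds(10, 12, -2, 2, true, true));  // true
        System.out.println(withinBounds(10, 12, -2, 2, true, false)); // false
    }
}
```

With bounds [-2, 2], a right element is a candidate when its timestamp lies at most two time units before or after the left element's timestamp.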

As soon as elements are joined they are passed to a user-defined ProcessJoinFunction.

The basic idea of this implementation is as follows: Whenever we receive an element at processElement1(StreamRecord) (a.k.a. the left side), we add it to the left buffer. We then check the right buffer to see whether there are any elements that can be joined. If there are, they are joined and passed to the aforementioned function. The same happens the other way around when receiving an element on the right side.

Whenever a pair of elements is emitted, it is assigned the larger of the two elements' timestamps.
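The buffering scheme and the timestamp rule can be illustrated with a simplified model. This sketch assumes a single key, in-memory TreeMap buffers, String payloads and inclusive bounds only. The class TwoBufferJoinSketch and its methods are hypothetical and do not reflect how the operator stores state or accesses it asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified single-key model of the two-buffer join; not the Flink implementation.
final class TwoBufferJoinSketch {
    // Each buffer maps a timestamp to the values received with that timestamp.
    private final TreeMap<Long, List<String>> leftBuffer = new TreeMap<>();
    private final TreeMap<Long, List<String>> rightBuffer = new TreeMap<>();
    private final long lowerBound;
    private final long upperBound;

    // Stands in for the calls to the user-defined function.
    final List<String> output = new ArrayList<>();

    TwoBufferJoinSketch(long lowerBound, long upperBound) {
        this.lowerBound = lowerBound;
        this.upperBound = upperBound;
    }

    void processLeft(long ts, String value) {
        leftBuffer.computeIfAbsent(ts, k -> new ArrayList<>()).add(value);
        // Matching right elements satisfy right.ts in [ts + lowerBound, ts + upperBound].
        probe(rightBuffer, ts + lowerBound, ts + upperBound, ts, value, true);
    }

    void processRight(long ts, String value) {
        rightBuffer.computeIfAbsent(ts, k -> new ArrayList<>()).add(value);
        // Mirrored condition: left.ts in [ts - upperBound, ts - lowerBound].
        probe(leftBuffer, ts - upperBound, ts - lowerBound, ts, value, false);
    }

    private void probe(
            TreeMap<Long, List<String>> otherBuffer,
            long from, long to, long ts, String value, boolean newElementIsLeft) {
        if (from > to) {
            return; // empty interval; subMap would reject it
        }
        for (Map.Entry<Long, List<String>> e : otherBuffer.subMap(from, true, to, true).entrySet()) {
            // The emitted pair carries the larger of the two timestamps.
            long joinedTs = Math.max(ts, e.getKey());
            for (String other : e.getValue()) {
                String left = newElementIsLeft ? value : other;
                String right = newElementIsLeft ? other : value;
                output.add("(" + left + "," + right + ")@" + joinedTs);
            }
        }
    }

    public static void main(String[] args) {
        TwoBufferJoinSketch join = new TwoBufferJoinSketch(-2, 2);
        join.processLeft(10, "a");
        join.processRight(12, "x"); // joins with a
        join.processRight(13, "y"); // no left element within [11, 15]
        join.processLeft(11, "b");  // joins with x and y
        System.out.println(join.output); // [(a,x)@12, (b,x)@12, (b,y)@13]
    }
}
```

Because every arriving element probes the opposite buffer, each matching pair is produced exactly once, regardless of which side arrives first.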

To prevent the element buffers from growing indefinitely, a cleanup timer is registered per element. This timer marks the point at which an element can no longer take part in a join and can be removed from the state.
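One plausible way to choose the cleanup time follows from the join condition: a left element with timestamp ts can only match right elements up to ts + upperBound, and a right element can only match left elements whose timestamp is at least ts - upperBound, so it stays relevant until ts - lowerBound. The sketch below encodes that reasoning. The class CleanupTimeSketch and both methods are hypothetical, and the rule is an assumption for illustration rather than a statement of the operator's exact logic.

```java
// Hypothetical helper for illustration only; it is not part of Flink.
final class CleanupTimeSketch {

    /** Event time after which a buffered left element can no longer be joined. */
    static long leftCleanupTime(long ts, long upperBound) {
        // Never schedule the cleanup before the element's own timestamp.
        return upperBound > 0 ? ts + upperBound : ts;
    }

    /** Event time after which a buffered right element can no longer be joined. */
    static long rightCleanupTime(long ts, long lowerBound) {
        // A negative lower bound lets later left elements still match this right element.
        return lowerBound < 0 ? ts - lowerBound : ts;
    }

    public static void main(String[] args) {
        // Bounds [-2, 2]: both sides are kept for two time units past their timestamp.
        System.out.println(leftCleanupTime(10, 2));   // 12
        System.out.println(rightCleanupTime(12, -2)); // 14
    }
}
```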

  • Constructor Details

    • AsyncIntervalJoinOperator

      public AsyncIntervalJoinOperator(long lowerBound, long upperBound, boolean lowerBoundInclusive, boolean upperBoundInclusive, org.apache.flink.util.OutputTag<T1> leftLateDataOutputTag, org.apache.flink.util.OutputTag<T2> rightLateDataOutputTag, org.apache.flink.api.common.typeutils.TypeSerializer<T1> leftTypeSerializer, org.apache.flink.api.common.typeutils.TypeSerializer<T2> rightTypeSerializer, ProcessJoinFunction<T1,T2,OUT> udf)
      Creates a new AsyncIntervalJoinOperator.
      Parameters:
      lowerBound - The lower bound for evaluating if elements should be joined
      upperBound - The upper bound for evaluating if elements should be joined
      lowerBoundInclusive - Whether or not to include elements where the timestamp matches the lower bound
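      upperBoundInclusive - Whether or not to include elements where the timestamp matches the upper bound
      leftLateDataOutputTag - The output tag for late elements of the left stream
      rightLateDataOutputTag - The output tag for late elements of the right stream
      leftTypeSerializer - The type serializer for elements of the left stream
      rightTypeSerializer - The type serializer for elements of the right stream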
      udf - A user-defined ProcessJoinFunction that gets called whenever two elements of T1 and T2 are joined
  • Method Details