Class KeyedStream<T,KEY>
Type Parameters:
T - The type of the elements in the Keyed Stream.
KEY - The type of the key in the Keyed Stream.
KeyedStream represents a DataStream on which operator state is partitioned by
key using a provided KeySelector. Typical operations supported by a DataStream
are also possible on a KeyedStream, with the exception of partitioning methods such as
shuffle, forward and keyBy.
Reduce-style operations, such as reduce(org.apache.flink.api.common.functions.ReduceFunction<T>) and sum(int), work on elements that have
the same key.
-
Nested Class Summary
Nested Classes
static class KeyedStream.IntervalJoin - Perform a join over a time interval.
static class KeyedStream.IntervalJoined - IntervalJoined is a container for two streams that have keys for both sides as well as the time boundaries over which elements should be joined.
Nested classes/interfaces inherited from class org.apache.flink.streaming.api.datastream.DataStream
DataStream.Collector<T>
-
Field Summary
Fields inherited from class org.apache.flink.streaming.api.datastream.DataStream
environment, transformation
-
Constructor Summary
Constructors
KeyedStream(DataStream<T> dataStream, org.apache.flink.api.java.functions.KeySelector<T, KEY> keySelector) - Creates a new KeyedStream using the given KeySelector to partition operator state by key.
KeyedStream(DataStream<T> dataStream, org.apache.flink.api.java.functions.KeySelector<T, KEY> keySelector, org.apache.flink.api.common.typeinfo.TypeInformation<KEY> keyType) - Creates a new KeyedStream using the given KeySelector to partition operator state by key.
-
Method Summary
Modifier and Type, Method - Description
addSink(SinkFunction<T> sinkFunction) - Adds the given sink to this DataStream.
protected SingleOutputStreamOperator<T> aggregate(AggregationFunction<T> aggregate)
asQueryableState(String queryableStateName) - Deprecated. The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
asQueryableState(String queryableStateName, org.apache.flink.api.common.state.ReducingStateDescriptor<T> stateDescriptor) - Deprecated. The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
asQueryableState(String queryableStateName, org.apache.flink.api.common.state.ValueStateDescriptor<T> stateDescriptor) - Deprecated. The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
countWindow(long size) - Windows this KeyedStream into tumbling count windows.
countWindow(long size, long slide) - Windows this KeyedStream into sliding count windows.
protected <R> SingleOutputStreamOperator<R> doTransform(String operatorName, org.apache.flink.api.common.typeinfo.TypeInformation<R> outTypeInfo, StreamOperatorFactory<R> operatorFactory)
enableAsyncState() - Enable the async state processing for following keyed processing function.
<R> SingleOutputStreamOperator<R> flatMap(org.apache.flink.api.common.functions.FlatMapFunction<T, R> flatMapper, org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType) - Applies a FlatMap transformation on a DataStream.
fullWindowPartition() - Collect records from each partition into a separate full window.
getKeySelector() - Gets the key selector that can get the key by which the stream is partitioned from the elements.
org.apache.flink.api.common.typeinfo.TypeInformation<KEY> getKeyType() - Gets the type of the key by which the stream is partitioned.
<T1> KeyedStream.IntervalJoin<T,T1,KEY> intervalJoin(KeyedStream<T1, KEY> otherStream) - Join elements of this KeyedStream with elements of another KeyedStream over a time interval that can be specified with KeyedStream.IntervalJoin.between(Duration, Duration).
max(int positionToMax) - Applies an aggregation that gives the current maximum of the data stream at the given position by the given key.
max(String field) - Applies an aggregation that gives the current maximum of the data stream at the given field expression by the given key.
maxBy(int positionToMaxBy) - Applies an aggregation that gives the current element with the maximum value at the given position by the given key.
maxBy(int positionToMaxBy, boolean first) - Applies an aggregation that gives the current element with the maximum value at the given position by the given key.
maxBy(String positionToMaxBy) - Applies an aggregation that gives the current element with the maximum value at the given position by the given key.
maxBy(String field, boolean first) - Applies an aggregation that gives the current maximum element of the data stream by the given field expression by the given key.
min(int positionToMin) - Applies an aggregation that gives the current minimum of the data stream at the given position by the given key.
min(String field) - Applies an aggregation that gives the current minimum of the data stream at the given field expression by the given key.
minBy(int positionToMinBy) - Applies an aggregation that gives the current element with the minimum value at the given position by the given key.
minBy(int positionToMinBy, boolean first) - Applies an aggregation that gives the current element with the minimum value at the given position by the given key.
minBy(String positionToMinBy) - Applies an aggregation that gives the current element with the minimum value at the given position by the given key.
minBy(String field, boolean first) - Applies an aggregation that gives the current minimum element of the data stream by the given field expression by the given key.
<R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) - Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.
<R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction, org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType) - Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.
reduce(org.apache.flink.api.common.functions.ReduceFunction<T> reducer) - Applies a reduce transformation on the data stream grouped by the given key position.
protected DataStream<T> setConnectionType(StreamPartitioner<T> partitioner) - Internal function for setting the partitioner for the DataStream.
sum(int positionToSum) - Applies an aggregation that gives a rolling sum of the data stream at the given position grouped by the given key.
sum(String field) - Applies an aggregation that gives the current sum of the data stream at the given field by the given key.
<W extends Window> WindowedStream<T,KEY,W> window(WindowAssigner<? super T, W> assigner) - Windows this data stream to a WindowedStream, which evaluates windows over a key grouped stream.
Methods inherited from class org.apache.flink.streaming.api.datastream.DataStream
assignTimestampsAndWatermarks, broadcast, broadcast, clean, coGroup, collectAsync, collectAsync, connect, connect, countWindowAll, countWindowAll, executeAndCollect, executeAndCollect, executeAndCollect, executeAndCollect, filter, flatMap, forward, getExecutionConfig, getExecutionEnvironment, getId, getMinResources, getParallelism, getPreferredResources, getTransformation, getType, global, join, keyBy, keyBy, keyBy, map, map, partitionCustom, print, print, printToErr, printToErr, process, process, project, rebalance, rescale, shuffle, sinkTo, sinkTo, transform, transform, union, windowAll, writeToSocket, writeUsingOutputFormat
-
Constructor Details
-
KeyedStream
public KeyedStream(DataStream<T> dataStream, org.apache.flink.api.java.functions.KeySelector<T, KEY> keySelector)
Creates a new KeyedStream using the given KeySelector to partition operator state by key.
Parameters:
dataStream - Base stream of data
keySelector - Function for determining state partitions
-
KeyedStream
public KeyedStream(DataStream<T> dataStream, org.apache.flink.api.java.functions.KeySelector<T, KEY> keySelector, org.apache.flink.api.common.typeinfo.TypeInformation<KEY> keyType)
Creates a new KeyedStream using the given KeySelector to partition operator state by key.
Parameters:
dataStream - Base stream of data
keySelector - Function for determining state partitions
-
-
Method Details
-
getKeySelector
Gets the key selector that can get the key by which the stream is partitioned from the elements.
Returns:
The key selector for the key.
-
getKeyType
Gets the type of the key by which the stream is partitioned.
Returns:
The type of the key by which the stream is partitioned.
-
setConnectionType
Description copied from class: DataStream
Internal function for setting the partitioner for the DataStream.
Overrides:
setConnectionType in class DataStream<T>
Parameters:
partitioner - Partitioner to set.
Returns:
The modified DataStream.
-
doTransform
protected <R> SingleOutputStreamOperator<R> doTransform(String operatorName, org.apache.flink.api.common.typeinfo.TypeInformation<R> outTypeInfo, StreamOperatorFactory<R> operatorFactory)
Overrides:
doTransform in class DataStream<T>
-
addSink
Description copied from class: DataStream
Adds the given sink to this DataStream. Only streams with sinks added will be executed once the StreamExecutionEnvironment.execute() method is called.
Overrides:
addSink in class DataStream<T>
Parameters:
sinkFunction - The object containing the sink's invoke function.
Returns:
The closed DataStream.
-
process
@PublicEvolving public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction)
Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.
The function will be called for every element in the input streams and can produce zero or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction) function, this function can also query the time and set timers. When reacting to the firing of set timers the function can directly emit elements and/or register yet more timers.
Type Parameters:
R - The type of elements emitted by the KeyedProcessFunction.
Parameters:
keyedProcessFunction - The KeyedProcessFunction that is called for each element in the stream.
Returns:
The transformed DataStream.
-
process
@Internal public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction, org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType)
Applies the given KeyedProcessFunction on the input stream, thereby creating a transformed output stream.
The function will be called for every element in the input streams and can produce zero or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction) function, this function can also query the time and set timers. When reacting to the firing of set timers the function can directly emit elements and/or register yet more timers.
Type Parameters:
R - The type of elements emitted by the KeyedProcessFunction.
Parameters:
keyedProcessFunction - The KeyedProcessFunction that is called for each element in the stream.
outputType - TypeInformation for the result type of the function.
Returns:
The transformed DataStream.
-
flatMap
public <R> SingleOutputStreamOperator<R> flatMap(org.apache.flink.api.common.functions.FlatMapFunction<T, R> flatMapper, org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType)
Description copied from class: DataStream
Applies a FlatMap transformation on a DataStream. The transformation calls a FlatMapFunction for each element of the DataStream. Each FlatMapFunction call can return any number of elements including none. The user can also extend RichFlatMapFunction to gain access to other features provided by the RichFunction interface.
Overrides:
flatMap in class DataStream<T>
Type Parameters:
R - output type
Parameters:
flatMapper - The FlatMapFunction that is called for each element of the DataStream
outputType - TypeInformation for the result type of the function.
Returns:
The transformed DataStream.
-
intervalJoin
@PublicEvolving public <T1> KeyedStream.IntervalJoin<T,T1,KEY> intervalJoin(KeyedStream<T1, KEY> otherStream)
Join elements of this KeyedStream with elements of another KeyedStream over a time interval that can be specified with KeyedStream.IntervalJoin.between(Duration, Duration).
Type Parameters:
T1 - Type parameter of elements in the other stream
Parameters:
otherStream - The other keyed stream to join this keyed stream with
Returns:
An instance of KeyedStream.IntervalJoin with this keyed stream and the other keyed stream
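The matching rule of an interval join can be illustrated without Flink. The following is a minimal plain-Java sketch, not Flink code: the Event class, the join helper and the sample data are hypothetical, and it assumes that both bounds are inclusive and are relative to the timestamp of the element from this stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of interval-join matching; not Flink code. */
final class IntervalJoinSketch {

    /** Hypothetical event type: a key, an event timestamp in milliseconds, and a label. */
    static final class Event {
        final String key;
        final long timestamp;
        final String label;

        Event(String key, long timestamp, String label) {
            this.key = key;
            this.timestamp = timestamp;
            this.label = label;
        }
    }

    /**
     * Pairs every left event with every right event that has the same key and whose
     * timestamp lies in [left.timestamp + lowerBound, left.timestamp + upperBound].
     * Inclusive bounds are an assumption of this sketch.
     */
    static List<String> join(List<Event> left, List<Event> right, long lowerBound, long upperBound) {
        List<String> joined = new ArrayList<>();
        for (Event a : left) {
            for (Event b : right) {
                boolean sameKey = a.key.equals(b.key);
                boolean inInterval = b.timestamp >= a.timestamp + lowerBound
                        && b.timestamp <= a.timestamp + upperBound;
                if (sameKey && inInterval) {
                    joined.add(a.label + "+" + b.label);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<Event> left = Arrays.asList(new Event("k1", 10, "a1"), new Event("k2", 10, "a2"));
        List<Event> right = Arrays.asList(
                new Event("k1", 8, "b1"), new Event("k1", 13, "b2"), new Event("k2", 9, "b3"));
        // Bounds of -2 ms and +1 ms relative to each left timestamp.
        System.out.println(join(left, right, -2, 1)); // prints [a1+b1, a2+b3]
    }
}
```

In this sketch b2 is dropped because its timestamp 13 lies outside [8, 11], and elements with different keys are never paired.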
-
countWindow
Windows this KeyedStream into tumbling count windows.
Parameters:
size - The size of the windows in number of elements.
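As a rough illustration of tumbling count windows on a keyed stream, the plain-Java sketch below keeps one buffer per key and emits it once it holds size elements. It is not Flink code: the class name, the parallel-array input and the choice to never emit a partially filled buffer are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of per-key tumbling count windows; not Flink code. */
final class TumblingCountWindowSketch {

    /** Emits "key=[values]" each time a key has collected exactly {@code size} elements. */
    static List<String> windows(String[] keys, int[] values, int size) {
        Map<String, List<Integer>> buffers = new HashMap<>();
        List<String> fired = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            List<Integer> buffer = buffers.computeIfAbsent(keys[i], k -> new ArrayList<>());
            buffer.add(values[i]);
            if (buffer.size() == size) {
                fired.add(keys[i] + "=" + buffer); // snapshot the window contents as text
                buffer.clear();                    // the next window for this key starts empty
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", "a", "b", "a"};
        int[] values = {1, 10, 2, 3, 20, 4};
        System.out.println(windows(keys, values, 2)); // prints [a=[1, 2], b=[10, 20], a=[3, 4]]
    }
}
```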
-
countWindow
Windows this KeyedStream into sliding count windows.
Parameters:
size - The size of the windows in number of elements.
slide - The slide interval in number of elements.
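The interplay of size and slide can be sketched in plain Java. This is not Flink code, and the firing rule is an assumption of the sketch: for each key, a window is emitted after every slide elements and contains at most the last size elements of that key, so early windows may be only partially filled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of per-key sliding count windows; not Flink code. */
final class SlidingCountWindowSketch {

    /** Emits "key=[values]" after every {@code slide} elements of a key, keeping the last {@code size}. */
    static List<String> windows(String[] keys, int[] values, int size, int slide) {
        Map<String, Deque<Integer>> buffers = new HashMap<>();
        Map<String, Integer> counts = new HashMap<>();
        List<String> fired = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            Deque<Integer> buffer = buffers.computeIfAbsent(keys[i], k -> new ArrayDeque<>());
            buffer.addLast(values[i]);
            if (buffer.size() > size) {
                buffer.removeFirst(); // keep only the most recent `size` elements
            }
            int seen = counts.merge(keys[i], 1, Integer::sum);
            if (seen % slide == 0) {
                fired.add(keys[i] + "=" + new ArrayList<>(buffer));
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "a", "a", "b", "a", "a"};
        int[] values = {1, 2, 3, 100, 4, 5};
        System.out.println(windows(keys, values, 3, 2)); // prints [a=[1, 2], a=[2, 3, 4]]
    }
}
```

Key b never fires here because it has seen only one element, fewer than the slide of 2.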
-
window
@PublicEvolving public <W extends Window> WindowedStream<T,KEY,W> window(WindowAssigner<? super T, W> assigner)
Windows this data stream to a WindowedStream, which evaluates windows over a key grouped stream. Elements are put into windows by a WindowAssigner. The grouping of elements is done both by key and by window.
A Trigger can be defined to specify when windows are evaluated. However, WindowAssigners have a default Trigger that is used if a Trigger is not specified.
Parameters:
assigner - The WindowAssigner that assigns elements to windows.
Returns:
The trigger windows data stream.
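Grouping "both by key and by window" can be pictured as bucketing each element under a (key, window) pair. The plain-Java sketch below is not Flink code: it assumes a hypothetical tumbling assigner that places a non-negative timestamp into the window starting at timestamp - (timestamp % windowSize), and it ignores triggers entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of grouping by key and by window; not Flink code. */
final class KeyedTumblingWindowSketch {

    /** Returns the values of each (key, window start) bucket, labelled "key@windowStart". */
    static Map<String, List<Integer>> assign(
            String[] keys, long[] timestamps, int[] values, long windowSize) {
        Map<String, List<Integer>> panes = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            // Assumed tumbling assignment; timestamps are taken to be non-negative.
            long windowStart = timestamps[i] - (timestamps[i] % windowSize);
            String pane = keys[i] + "@" + windowStart;
            panes.computeIfAbsent(pane, p -> new ArrayList<>()).add(values[i]);
        }
        return panes;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "a", "b", "a"};
        long[] timestamps = {1, 4, 2, 6};
        int[] values = {1, 2, 3, 4};
        // prints {a@0=[1, 2], b@0=[3], a@5=[4]}
        System.out.println(assign(keys, timestamps, values, 5));
    }
}
```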
-
reduce
public SingleOutputStreamOperator<T> reduce(org.apache.flink.api.common.functions.ReduceFunction<T> reducer)
Applies a reduce transformation on the data stream grouped by the given key position. The ReduceFunction will receive input values based on the key value. Only input values with the same key will go to the same reducer.
Parameters:
reducer - The ReduceFunction that will be called for every element of the input values with the same key.
Returns:
The transformed DataStream.
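A keyed reduce can be thought of as one running value per key that is combined with each new element of that key. The plain-Java sketch below is not Flink code: the class, the parallel-array input and the choice to emit the updated value after every element are assumptions made for illustration, and a java.util.function.BinaryOperator stands in for the ReduceFunction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Plain-Java illustration of a per-key rolling reduce; not Flink code. */
final class RollingReduceSketch {

    /** Emits, for every input element, the reduced value of all elements seen so far for its key. */
    static List<Integer> reduce(String[] keys, int[] values, BinaryOperator<Integer> reducer) {
        Map<String, Integer> state = new HashMap<>(); // one running value per key
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            Integer previous = state.get(keys[i]);
            Integer current = previous == null
                    ? Integer.valueOf(values[i])           // first element of a key passes through
                    : reducer.apply(previous, values[i]);  // later elements are combined
            state.put(keys[i], current);
            out.add(current);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", "b", "a"};
        int[] values = {1, 10, 2, 20, 3};
        System.out.println(reduce(keys, values, Integer::sum)); // prints [1, 10, 3, 30, 6]
    }
}
```

With Integer::sum as the reducer the sketch behaves like a rolling sum per key; passing Math::min or Math::max gives a rolling minimum or maximum instead.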
-
sum
Applies an aggregation that gives a rolling sum of the data stream at the given position grouped by the given key. An independent aggregate is kept per key.
Parameters:
positionToSum - The field position in the data points to sum. This is applicable to Tuple types, basic and primitive array types, Scala case classes, and primitive types (which are considered as having one field).
Returns:
The transformed DataStream.
-
sum
Applies an aggregation that gives the current sum of the data stream at the given field by the given key. An independent aggregate is kept per key.
Parameters:
field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
Returns:
The transformed DataStream.
-
min
Applies an aggregation that gives the current minimum of the data stream at the given position by the given key. An independent aggregate is kept per key.
Parameters:
positionToMin - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
Returns:
The transformed DataStream.
-
min
Applies an aggregation that gives the current minimum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy".
Parameters:
field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
Returns:
The transformed DataStream.
-
max
Applies an aggregation that gives the current maximum of the data stream at the given position by the given key. An independent aggregate is kept per key.
Parameters:
positionToMax - The field position in the data points to maximize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
Returns:
The transformed DataStream.
-
max
Applies an aggregation that gives the current maximum of the data stream at the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy".
Parameters:
field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
Returns:
The transformed DataStream.
-
minBy
Applies an aggregation that gives the current minimum element of the data stream by the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy".
Parameters:
field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
first - If true, then in case of field equality the first object will be returned.
Returns:
The transformed DataStream.
-
maxBy
Applies an aggregation that gives the current maximum element of the data stream by the given field expression by the given key. An independent aggregate is kept per key. A field expression is either the name of a public field or a getter method with parentheses of the DataStream's underlying type. A dot can be used to drill down into objects, as in "field1.fieldxy".
Parameters:
field - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
first - If true, then in case of field equality the first object will be returned.
Returns:
The transformed DataStream.
-
minBy
Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If several elements have the minimum value at the given position, the operator returns the first one by default.
Parameters:
positionToMinBy - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
Returns:
The transformed DataStream.
-
minBy
Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If several elements have the minimum value at the given position, the operator returns the first one by default.
Parameters:
positionToMinBy - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
Returns:
The transformed DataStream.
-
minBy
Applies an aggregation that gives the current element with the minimum value at the given position by the given key. An independent aggregate is kept per key. If several elements have the minimum value at the given position, the operator returns either the first or last one, depending on the parameter set.
Parameters:
positionToMinBy - The field position in the data points to minimize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
first - If true, then the operator returns the first element with the minimal value, otherwise it returns the last.
Returns:
The transformed DataStream.
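The effect of the first flag on ties can be shown with a small plain-Java sketch. It is not Flink code: elements are modelled as int arrays standing in for tuples, all elements are assumed to share one key so the keying step is omitted, and the sketch emits the current minimum element after every input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of minBy tie-breaking for a single key; not Flink code. */
final class MinBySketch {

    /** Emits, after each element, the element with the smallest value at {@code position}. */
    static List<String> minBy(List<int[]> elements, int position, boolean first) {
        List<String> out = new ArrayList<>();
        int[] current = null;
        for (int[] element : elements) {
            boolean smaller = current != null && element[position] < current[position];
            boolean tie = current != null && element[position] == current[position];
            // On a tie, keep the earlier element when first is true, else take the later one.
            if (current == null || smaller || (tie && !first)) {
                current = element;
            }
            out.add(Arrays.toString(current));
        }
        return out;
    }

    public static void main(String[] args) {
        List<int[]> elements = Arrays.asList(
                new int[] {5, 1}, new int[] {3, 2}, new int[] {3, 3}, new int[] {4, 4});
        System.out.println(minBy(elements, 0, true));  // prints [[5, 1], [3, 2], [3, 2], [3, 2]]
        System.out.println(minBy(elements, 0, false)); // prints [[5, 1], [3, 2], [3, 3], [3, 3]]
    }
}
```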
-
maxBy
Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If several elements have the maximum value at the given position, the operator returns the first one by default.
Parameters:
positionToMaxBy - The field position in the data points to maximize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
Returns:
The transformed DataStream.
-
maxBy
Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If several elements have the maximum value at the given position, the operator returns the first one by default.
Parameters:
positionToMaxBy - In case of a POJO, Scala case class, or Tuple type, the name of the (public) field on which to perform the aggregation. Additionally, a dot can be used to drill down into nested objects, as in "field1.fieldxy". Furthermore "*" can be specified in case of a basic type (which is considered as having only one field).
Returns:
The transformed DataStream.
-
maxBy
Applies an aggregation that gives the current element with the maximum value at the given position by the given key. An independent aggregate is kept per key. If several elements have the maximum value at the given position, the operator returns either the first or last one, depending on the parameter set.
Parameters:
positionToMaxBy - The field position in the data points to maximize. This is applicable to Tuple types, Scala case classes, and primitive types (which are considered as having one field).
first - If true, then the operator returns the first element with the maximum value, otherwise it returns the last.
Returns:
The transformed DataStream.
-
aggregate
-
fullWindowPartition
Collect records from each partition into a separate full window. The window emission will be triggered at the end of inputs. For this keyed data stream (each record has a key), a partition contains exactly the records that share the same key.
Overrides:
fullWindowPartition in class DataStream<T>
Returns:
The full windowed data stream on partition.
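Conceptually, a full window per partition on a keyed stream amounts to collecting every record under its key and handing the collections over only once the input has ended. The plain-Java sketch below is not Flink code; the class name and the parallel-array input are hypothetical, and returning the map after the loop stands in for emission at end of input.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration of one full window per key; not Flink code. */
final class FullWindowPartitionSketch {

    /** Collects all values of each key; nothing is emitted before the whole input is consumed. */
    static Map<String, List<Integer>> collect(String[] keys, int[] values) {
        Map<String, List<Integer>> partitions = new LinkedHashMap<>();
        for (int i = 0; i < keys.length; i++) {
            partitions.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
        }
        return partitions; // reached only at the end of the (bounded) input
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a"};
        int[] values = {1, 2, 3};
        System.out.println(collect(keys, values)); // prints {a=[1, 3], b=[2]}
    }
}
```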
-
asQueryableState
@PublicEvolving @Deprecated public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName)
Deprecated. The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
Publishes the keyed stream as a queryable ValueState instance.
Parameters:
queryableStateName - Name under which to publish the queryable state instance
Returns:
Queryable state instance
-
asQueryableState
@PublicEvolving @Deprecated public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName, org.apache.flink.api.common.state.ValueStateDescriptor<T> stateDescriptor)
Deprecated. The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
Publishes the keyed stream as a queryable ValueState instance.
Parameters:
queryableStateName - Name under which to publish the queryable state instance
stateDescriptor - State descriptor to create state instance from
Returns:
Queryable state instance
-
asQueryableState
@PublicEvolving @Deprecated public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName, org.apache.flink.api.common.state.ReducingStateDescriptor<T> stateDescriptor)
Deprecated. The Queryable State feature is deprecated since Flink 1.18, and will be removed in a future Flink major version.
Publishes the keyed stream as a queryable ReducingState instance.
Parameters:
queryableStateName - Name under which to publish the queryable state instance
stateDescriptor - State descriptor to create state instance from
Returns:
Queryable state instance
-
enableAsyncState
Enables async state processing for the following keyed processing function. This also requires that only State V2 APIs are used in the function.
Returns:
The configured KeyedStream itself.
-