Interface KStream<K,V>
-
- Type Parameters:
K - Type of keys
V - Type of values

@Evolving public interface KStream<K,V>

KStream is an abstraction of a record stream of KeyValue pairs, i.e., each record is an independent entity/event in the real world. For example, a user X might buy two items I1 and I2, and thus there might be two records <K:I1>, <K:I2> in the stream.

A KStream is either defined from one or multiple Kafka topics that are consumed message by message, or the result of a KStream transformation. A KTable can also be converted into a KStream.

A KStream can be transformed record by record, joined with another KStream, KTable, or GlobalKTable, or aggregated into a KTable. The Kafka Streams DSL can be mixed and matched with the Processor API (PAPI) (cf. Topology) via process(...), transform(...), and transformValues(...).
- See Also:
KTable, KGroupedStream, StreamsBuilder.stream(String)
-
Method Summary
Modifier and Type, Method, Description

KStream<K,V>[] branch(Predicate<? super K,? super V>... predicates)
    Creates an array of KStream from this stream by branching the records in the original stream based on the supplied predicates.
KStream<K,V> filter(Predicate<? super K,? super V> predicate)
    Create a new KStream that consists of all records of this stream which satisfy the given predicate.
KStream<K,V> filterNot(Predicate<? super K,? super V> predicate)
    Create a new KStream that consists of all records of this stream which do not satisfy the given predicate.
<KR,VR> KStream<KR,VR> flatMap(KeyValueMapper<? super K,? super V,? extends java.lang.Iterable<? extends KeyValue<? extends KR,? extends VR>>> mapper)
    Transform each record of the input stream into zero or more records in the output stream (both key and value type can be altered arbitrarily).
<VR> KStream<K,VR> flatMapValues(ValueMapper<? super V,? extends java.lang.Iterable<? extends VR>> mapper)
    Create a new KStream by transforming the value of each record in this stream into zero or more values with the same key in the new stream.
<VR> KStream<K,VR> flatMapValues(ValueMapperWithKey<? super K,? super V,? extends java.lang.Iterable<? extends VR>> mapper)
    Create a new KStream by transforming the value of each record in this stream into zero or more values with the same key in the new stream.
void foreach(ForeachAction<? super K,? super V> action)
    Perform an action on each record of KStream.
<KR> KGroupedStream<KR,V> groupBy(KeyValueMapper<? super K,? super V,KR> selector)
    Group the records of this KStream on a new key that is selected using the provided KeyValueMapper and default serializers and deserializers.
<KR> KGroupedStream<KR,V> groupBy(KeyValueMapper<? super K,? super V,KR> selector, org.apache.kafka.common.serialization.Serde<KR> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
<KR> KGroupedStream<KR,V> groupBy(KeyValueMapper<? super K,? super V,KR> selector, Serialized<KR,V> serialized)
    Group the records of this KStream on a new key that is selected using the provided KeyValueMapper and Serdes as specified by Serialized.
KGroupedStream<K,V> groupByKey()
    Group the records by their current key into a KGroupedStream while preserving the original values and default serializers and deserializers.
KGroupedStream<K,V> groupByKey(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
KGroupedStream<K,V> groupByKey(Serialized<K,V> serialized)
    Group the records by their current key into a KGroupedStream while preserving the original values and using the serializers as defined by Serialized.
<GK,GV,RV> KStream<K,RV> join(GlobalKTable<GK,GV> globalKTable, KeyValueMapper<? super K,? super V,? extends GK> keyValueMapper, ValueJoiner<? super V,? super GV,? extends RV> joiner)
    Join records of this stream with GlobalKTable's records using non-windowed inner equi join.
<VO,VR> KStream<K,VR> join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows)
    Join records of this stream with another KStream's records using windowed inner equi join with default serializers and deserializers.
<VO,VR> KStream<K,VR> join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> thisValueSerde, org.apache.kafka.common.serialization.Serde<VO> otherValueSerde)
<VO,VR> KStream<K,VR> join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, Joined<K,V,VO> joined)
    Join records of this stream with another KStream's records using windowed inner equi join with default serializers and deserializers.
<VT,VR> KStream<K,VR> join(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner)
    Join records of this stream with KTable's records using non-windowed inner equi join with default serializers and deserializers.
<VT,VR> KStream<K,VR> join(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
<VT,VR> KStream<K,VR> join(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, Joined<K,V,VT> joined)
    Join records of this stream with KTable's records using non-windowed inner equi join with default serializers and deserializers.
<GK,GV,RV> KStream<K,RV> leftJoin(GlobalKTable<GK,GV> globalKTable, KeyValueMapper<? super K,? super V,? extends GK> keyValueMapper, ValueJoiner<? super V,? super GV,? extends RV> valueJoiner)
    Join records of this stream with GlobalKTable's records using non-windowed left equi join.
<VO,VR> KStream<K,VR> leftJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows)
    Join records of this stream with another KStream's records using windowed left equi join with default serializers and deserializers.
<VO,VR> KStream<K,VR> leftJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> thisValSerde, org.apache.kafka.common.serialization.Serde<VO> otherValueSerde)
<VO,VR> KStream<K,VR> leftJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, Joined<K,V,VO> joined)
    Join records of this stream with another KStream's records using windowed left equi join with default serializers and deserializers.
<VT,VR> KStream<K,VR> leftJoin(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner)
    Join records of this stream with KTable's records using non-windowed left equi join with default serializers and deserializers.
<VT,VR> KStream<K,VR> leftJoin(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
<VT,VR> KStream<K,VR> leftJoin(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, Joined<K,V,VT> joined)
    Join records of this stream with KTable's records using non-windowed left equi join with default serializers and deserializers.
<KR,VR> KStream<KR,VR> map(KeyValueMapper<? super K,? super V,? extends KeyValue<? extends KR,? extends VR>> mapper)
    Transform each record of the input stream into a new record in the output stream (both key and value type can be altered arbitrarily).
<VR> KStream<K,VR> mapValues(ValueMapper<? super V,? extends VR> mapper)
    Transform the value of each input record into a new value (with possible new type) of the output record.
<VR> KStream<K,VR> mapValues(ValueMapperWithKey<? super K,? super V,? extends VR> mapper)
    Transform the value of each input record into a new value (with possible new type) of the output record.
KStream<K,V> merge(KStream<K,V> stream)
    Merge this stream and the given stream into one larger stream.
<VO,VR> KStream<K,VR> outerJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows)
    Join records of this stream with another KStream's records using windowed outer equi join with default serializers and deserializers.
<VO,VR> KStream<K,VR> outerJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> thisValueSerde, org.apache.kafka.common.serialization.Serde<VO> otherValueSerde)
<VO,VR> KStream<K,VR> outerJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, Joined<K,V,VO> joined)
    Join records of this stream with another KStream's records using windowed outer equi join with default serializers and deserializers.
KStream<K,V> peek(ForeachAction<? super K,? super V> action)
    Perform an action on each record of KStream.
void print()
    Deprecated. Use print(Printed).
void print(java.lang.String label)
    Deprecated.
void print(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
void print(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String label)
void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper)
    Deprecated.
void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper, java.lang.String label)
void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String label)
void print(Printed<K,V> printed)
    Print the records of this KStream using the options provided by Printed.
void process(ProcessorSupplier<? super K,? super V> processorSupplier, java.lang.String... stateStoreNames)
    Process all records in this stream, one record at a time, by applying a Processor (provided by the given ProcessorSupplier).
<KR> KStream<KR,V> selectKey(KeyValueMapper<? super K,? super V,? extends KR> mapper)
    Set a new key (with possibly new type) for each input record.
KStream<K,V> through(java.lang.String topic)
    Materialize this stream to a topic and create a new KStream from the topic using default serializers and deserializers and producer's DefaultPartitioner.
KStream<K,V> through(java.lang.String topic, Produced<K,V> produced)
    Materialize this stream to a topic and create a new KStream from the topic using the Produced instance for configuration of the key serde, value serde, and StreamPartitioner.
KStream<K,V> through(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String topic)
    Deprecated.
KStream<K,V> through(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
    Deprecated.
KStream<K,V> through(StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
    Deprecated.
void to(java.lang.String topic)
    Materialize this stream to a topic using default serializers specified in the config and producer's DefaultPartitioner.
void to(java.lang.String topic, Produced<K,V> produced)
    Materialize this stream to a topic using the provided Produced instance.
void to(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String topic)
    Deprecated.
void to(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
    Deprecated.
void to(StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
    Deprecated.
<K1,V1> KStream<K1,V1> transform(TransformerSupplier<? super K,? super V,KeyValue<K1,V1>> transformerSupplier, java.lang.String... stateStoreNames)
    Transform each record of the input stream into zero or more records in the output stream (both key and value type can be altered arbitrarily).
<VR> KStream<K,VR> transformValues(ValueTransformerSupplier<? super V,? extends VR> valueTransformerSupplier, java.lang.String... stateStoreNames)
    Transform the value of each input record into a new value (with possible new type) of the output record.
<VR> KStream<K,VR> transformValues(ValueTransformerWithKeySupplier<? super K,? super V,? extends VR> valueTransformerSupplier, java.lang.String... stateStoreNames)
    Transform the value of each input record into a new value (with possible new type) of the output record.
void writeAsText(java.lang.String filePath)
    Deprecated.
void writeAsText(java.lang.String filePath, java.lang.String label)
    Deprecated.
void writeAsText(java.lang.String filePath, java.lang.String label, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
void writeAsText(java.lang.String filePath, java.lang.String label, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
void writeAsText(java.lang.String filePath, java.lang.String label, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
void writeAsText(java.lang.String filePath, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
    Deprecated.
void writeAsText(java.lang.String filePath, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
    Deprecated.
void writeAsText(java.lang.String filePath, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
    Deprecated.
-
Method Detail
-
filter
KStream<K,V> filter(Predicate<? super K,? super V> predicate)
Create a new KStream that consists of all records of this stream which satisfy the given predicate. All records that do not satisfy the predicate are dropped. This is a stateless record-by-record operation.
- Parameters:
predicate - a filter Predicate that is applied to each record
- Returns:
a KStream that contains only those records that satisfy the given predicate
- See Also:
filterNot(Predicate)
-
filterNot
KStream<K,V> filterNot(Predicate<? super K,? super V> predicate)
Create a new KStream that consists of all records of this stream which do not satisfy the given predicate. All records that do satisfy the predicate are dropped. This is a stateless record-by-record operation.
- Parameters:
predicate - a filter Predicate that is applied to each record
- Returns:
a KStream that contains only those records that do not satisfy the given predicate
- See Also:
filter(Predicate)
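
The filter and filterNot descriptions above are complementary: for one predicate, every record ends up in exactly one of the two results. The following is a minimal plain-Java sketch of that behavior. It models records as java.util.Map.Entry values in a List and does not use the Kafka Streams API; the class name FilterSketch and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Plain-Java model of filter/filterNot semantics; NOT the Kafka Streams API.
final class FilterSketch {

    // Keeps the records for which the predicate returns true (models filter).
    static <K, V> List<Map.Entry<K, V>> filter(
            List<Map.Entry<K, V>> records, BiPredicate<K, V> predicate) {
        List<Map.Entry<K, V>> kept = new ArrayList<>();
        for (Map.Entry<K, V> r : records) {
            if (predicate.test(r.getKey(), r.getValue())) {
                kept.add(r);
            }
        }
        return kept;
    }

    // Keeps the records for which the predicate returns false (models filterNot).
    static <K, V> List<Map.Entry<K, V>> filterNot(
            List<Map.Entry<K, V>> records, BiPredicate<K, V> predicate) {
        return filter(records, predicate.negate());
    }

    public static void main(String[] args) {
        // Hypothetical purchase amounts keyed by user name.
        List<Map.Entry<String, Integer>> purchases = List.of(
                Map.entry("alice", 5), Map.entry("bob", 50), Map.entry("alice", 120));
        BiPredicate<String, Integer> isLarge = (user, amount) -> amount >= 50;
        System.out.println("filter:    " + filter(purchases, isLarge));
        System.out.println("filterNot: " + filterNot(purchases, isLarge));
    }
}
```

Each record is inspected on its own, with no state carried between records, which is what "stateless record-by-record operation" refers to.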
-
selectKey
<KR> KStream<KR,V> selectKey(KeyValueMapper<? super K,? super V,? extends KR> mapper)
Set a new key (with possibly new type) for each input record. The provided KeyValueMapper is applied to each input record and computes a new key for it. Thus, an input record <K,V> can be transformed into an output record <K':V>. This is a stateless record-by-record operation.

For example, you can use this transformation to set a key for a key-less input record <null,V> by extracting a key from the value within your KeyValueMapper. The example below computes the new key as the length of the value string.

    KStream<Byte[], String> keyLessStream = builder.stream("key-less-topic");
    KStream<Integer, String> keyedStream = keyLessStream.selectKey(new KeyValueMapper<Byte[], String, Integer>() {
        public Integer apply(Byte[] key, String value) {
            return value.length();
        }
    });

Setting a new key might result in an internal data redistribution if a key based operator (like an aggregation or join) is applied to the result KStream.
- Type Parameters:
KR - the new key type of the result stream
- Parameters:
mapper - a KeyValueMapper that computes a new key for each record
- Returns:
a KStream that contains records with new key (possibly of different type) and unmodified value
- See Also:
map(KeyValueMapper), flatMap(KeyValueMapper), mapValues(ValueMapper), mapValues(ValueMapperWithKey), flatMapValues(ValueMapper), flatMapValues(ValueMapperWithKey)
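
The redistribution remark for selectKey follows from key-based partitioning: once the key changes, the partition a record belongs to may change too. Below is a minimal plain-Java sketch of that idea. It does not use the Kafka Streams API. SelectKeySketch and partitionFor are hypothetical names, and partitionFor uses hashCode() as a simplified stand-in for a real partitioner.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of selectKey; NOT the Kafka Streams API.
final class SelectKeySketch {

    // Replaces each record's key with the mapper's result; the value is unchanged.
    static <K, V, KR> List<Map.Entry<KR, V>> selectKey(
            List<Map.Entry<K, V>> records, BiFunction<K, V, KR> mapper) {
        List<Map.Entry<KR, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> r : records) {
            KR newKey = mapper.apply(r.getKey(), r.getValue());
            out.add(new AbstractMap.SimpleImmutableEntry<>(newKey, r.getValue()));
        }
        return out;
    }

    // Simplified assumption: a partition is derived from the key's hash code,
    // so equal keys always map to the same partition.
    static int partitionFor(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        // Key-less input records <null,V>, as in the example above.
        List<Map.Entry<String, String>> keyLess = new ArrayList<>();
        keyLess.add(new AbstractMap.SimpleImmutableEntry<String, String>(null, "a"));
        keyLess.add(new AbstractMap.SimpleImmutableEntry<String, String>(null, "bb"));
        keyLess.add(new AbstractMap.SimpleImmutableEntry<String, String>(null, "cc"));

        // New key = length of the value string.
        BiFunction<String, String, Integer> byLength = (key, value) -> value.length();
        for (Map.Entry<Integer, String> r : selectKey(keyLess, byLength)) {
            System.out.println(r + " -> partition " + partitionFor(r.getKey(), 4));
        }
    }
}
```

In this model, records that receive the same new key map to the same partition number regardless of where they were before, which is why a downstream key-based operator may need the data to be moved.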
-
map
<KR,VR> KStream<KR,VR> map(KeyValueMapper<? super K,? super V,? extends KeyValue<? extends KR,? extends VR>> mapper)
Transform each record of the input stream into a new record in the output stream (both key and value type can be altered arbitrarily). The provided KeyValueMapper is applied to each input record and computes a new output record. Thus, an input record <K,V> can be transformed into an output record <K':V'>. This is a stateless record-by-record operation (cf. transform(TransformerSupplier, String...) for stateful record transformation).

The example below normalizes the String key to upper-case letters and counts the number of tokens of the value string.

    KStream<String, String> inputStream = builder.stream("topic");
    KStream<String, Integer> outputStream = inputStream.map(new KeyValueMapper<String, String, KeyValue<String, Integer>>() {
        public KeyValue<String, Integer> apply(String key, String value) {
            return new KeyValue<>(key.toUpperCase(), value.split(" ").length);
        }
    });

The provided KeyValueMapper must return a KeyValue type and must not return null.

Mapping records might result in an internal data redistribution if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. mapValues(ValueMapper))
- Type Parameters:
KR - the key type of the result stream
VR - the value type of the result stream
- Parameters:
mapper - a KeyValueMapper that computes a new output record
- Returns:
a KStream that contains records with new key and value (possibly both of different type)
- See Also:
selectKey(KeyValueMapper), flatMap(KeyValueMapper), mapValues(ValueMapper), mapValues(ValueMapperWithKey), flatMapValues(ValueMapper), flatMapValues(ValueMapperWithKey), transform(TransformerSupplier, String...), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...)
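
The one-in, one-out contract of map, including the rule that the mapper must not return null, can be modeled in plain Java as follows. This is a sketch under stated assumptions, not the Kafka Streams implementation: records are modeled as java.util.Map.Entry values, MapSketch is a hypothetical class name, and the null check is only this sketch's way of mirroring the documented rule.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.BiFunction;

// Plain-Java model of map semantics; NOT the Kafka Streams API.
final class MapSketch {

    // Produces exactly one output record per input record; key and value may both change.
    static <K, V, KR, VR> List<Map.Entry<KR, VR>> map(
            List<Map.Entry<K, V>> records, BiFunction<K, V, Map.Entry<KR, VR>> mapper) {
        List<Map.Entry<KR, VR>> out = new ArrayList<>();
        for (Map.Entry<K, V> r : records) {
            Map.Entry<KR, VR> mapped = mapper.apply(r.getKey(), r.getValue());
            // Mirrors the documented rule that the mapper must not return null.
            out.add(Objects.requireNonNull(mapped, "mapper must not return null"));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> input = new ArrayList<>();
        input.add(new AbstractMap.SimpleImmutableEntry<>("user", "hello kafka streams"));

        // Same logic as the example above: upper-case the key, count tokens in the value.
        BiFunction<String, String, Map.Entry<String, Integer>> mapper =
                (key, value) -> new AbstractMap.SimpleImmutableEntry<>(
                        key.toUpperCase(), value.split(" ").length);

        System.out.println(map(input, mapper));
    }
}
```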
-
mapValues
<VR> KStream<K,VR> mapValues(ValueMapper<? super V,? extends VR> mapper)
Transform the value of each input record into a new value (with possible new type) of the output record. The provided ValueMapper is applied to each input record value and computes a new value for it. Thus, an input record <K,V> can be transformed into an output record <K:V'>. This is a stateless record-by-record operation (cf. transformValues(ValueTransformerSupplier, String...) for stateful value transformation).

The example below counts the number of tokens of the value string.

    KStream<String, String> inputStream = builder.stream("topic");
    KStream<String, Integer> outputStream = inputStream.mapValues(new ValueMapper<String, Integer>() {
        public Integer apply(String value) {
            return value.split(" ").length;
        }
    });

Setting a new value preserves data co-location with respect to the key. Thus, no internal data redistribution is required if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. map(KeyValueMapper))
- Type Parameters:
VR - the value type of the result stream
- Parameters:
mapper - a ValueMapper that computes a new output value
- Returns:
a KStream that contains records with unmodified key and new values (possibly of different type)
- See Also:
selectKey(KeyValueMapper), map(KeyValueMapper), flatMap(KeyValueMapper), flatMapValues(ValueMapper), flatMapValues(ValueMapperWithKey), transform(TransformerSupplier, String...), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...)
-
mapValues
<VR> KStream<K,VR> mapValues(ValueMapperWithKey<? super K,? super V,? extends VR> mapper)
Transform the value of each input record into a new value (with possible new type) of the output record. The provided ValueMapperWithKey is applied to each input record value and computes a new value for it. Thus, an input record <K,V> can be transformed into an output record <K:V'>. This is a stateless record-by-record operation (cf. transformValues(ValueTransformerSupplier, String...) for stateful value transformation).

The example below counts the number of tokens of key and value strings.

    KStream<String, String> inputStream = builder.stream("topic");
    KStream<String, Integer> outputStream = inputStream.mapValues(new ValueMapperWithKey<String, String, Integer>() {
        public Integer apply(String readOnlyKey, String value) {
            return readOnlyKey.split(" ").length + value.split(" ").length;
        }
    });

Note that the key is read-only and should not be modified, as this can lead to corrupt partitioning. So, setting a new value preserves data co-location with respect to the key. Thus, no internal data redistribution is required if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. map(KeyValueMapper))
- Type Parameters:
VR - the value type of the result stream
- Parameters:
mapper - a ValueMapperWithKey that computes a new output value
- Returns:
a KStream that contains records with unmodified key and new values (possibly of different type)
- See Also:
selectKey(KeyValueMapper), map(KeyValueMapper), flatMap(KeyValueMapper), flatMapValues(ValueMapper), flatMapValues(ValueMapperWithKey), transform(TransformerSupplier, String...), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...)
-
flatMap
<KR,VR> KStream<KR,VR> flatMap(KeyValueMapper<? super K,? super V,? extends java.lang.Iterable<? extends KeyValue<? extends KR,? extends VR>>> mapper)
Transform each record of the input stream into zero or more records in the output stream (both key and value type can be altered arbitrarily). The provided KeyValueMapper is applied to each input record and computes zero or more output records. Thus, an input record <K,V> can be transformed into output records <K':V'>, <K'':V''>, .... This is a stateless record-by-record operation (cf. transform(TransformerSupplier, String...) for stateful record transformation).

The example below splits input records <null:String> containing sentences as values into their words and emits a record <word:1> for each word.

    KStream<byte[], String> inputStream = builder.stream("topic");
    KStream<String, Integer> outputStream = inputStream.flatMap(new KeyValueMapper<byte[], String, Iterable<KeyValue<String, Integer>>>() {
        public Iterable<KeyValue<String, Integer>> apply(byte[] key, String value) {
            String[] tokens = value.split(" ");
            List<KeyValue<String, Integer>> result = new ArrayList<>(tokens.length);

            for (String token : tokens) {
                result.add(new KeyValue<>(token, 1));
            }

            return result;
        }
    });

The provided KeyValueMapper must return an Iterable (e.g., any Collection type) and the return value must not be null.

Flat-mapping records might result in an internal data redistribution if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. flatMapValues(ValueMapper))
- Type Parameters:
KR - the key type of the result stream
VR - the value type of the result stream
- Parameters:
mapper - a KeyValueMapper that computes the new output records
- Returns:
a KStream that contains more or fewer records with new key and value (possibly of different type)
- See Also:
selectKey(KeyValueMapper), map(KeyValueMapper), mapValues(ValueMapper), mapValues(ValueMapperWithKey), flatMapValues(ValueMapper), flatMapValues(ValueMapperWithKey), transform(TransformerSupplier, String...), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...)
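
The sentence-splitting example for flatMap can be tried without a Kafka cluster by modeling the stream as a list. The sketch below is plain Java and does not use the Kafka Streams API. FlatMapSketch and its method names are hypothetical, and input records are reduced to their sentence values because the keys are null in the example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java model of the flatMap word-splitting example; NOT the Kafka Streams API.
final class FlatMapSketch {

    // Models the mapper: one sentence becomes one <word:1> record per token.
    static List<Map.Entry<String, Integer>> splitIntoWords(String sentence) {
        String[] tokens = sentence.split(" ");
        List<Map.Entry<String, Integer>> result = new ArrayList<>(tokens.length);
        for (String token : tokens) {
            result.add(new AbstractMap.SimpleImmutableEntry<>(token, 1));
        }
        return result;
    }

    // Models flatMap: the mapper's results for all inputs are concatenated in input order.
    static List<Map.Entry<String, Integer>> flatMap(List<String> sentences) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String sentence : sentences) {
            out.addAll(splitIntoWords(sentence));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(flatMap(List.of("hello world", "kafka")));
    }
}
```

Because every output record gets a new key (the word), this is the kind of operation after which a key-based operator may need the data to be redistributed.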
-
flatMapValues
<VR> KStream<K,VR> flatMapValues(ValueMapper<? super V,? extends java.lang.Iterable<? extends VR>> mapper)
Create a new KStream by transforming the value of each record in this stream into zero or more values with the same key in the new stream. Transform the value of each input record into zero or more records with the same (unmodified) key in the output stream (value type can be altered arbitrarily). The provided ValueMapper is applied to each input record and computes zero or more output values. Thus, an input record <K,V> can be transformed into output records <K:V'>, <K:V''>, .... This is a stateless record-by-record operation (cf. transformValues(ValueTransformerSupplier, String...) for stateful value transformation).

The example below splits input records <null:String> containing sentences as values into their words.

    KStream<byte[], String> inputStream = builder.stream("topic");
    KStream<byte[], String> outputStream = inputStream.flatMapValues(new ValueMapper<String, Iterable<String>>() {
        public Iterable<String> apply(String value) {
            return Arrays.asList(value.split(" "));
        }
    });

The provided ValueMapper must return an Iterable (e.g., any Collection type) and the return value must not be null.

Splitting a record into multiple records with the same key preserves data co-location with respect to the key. Thus, no internal data redistribution is required if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. flatMap(KeyValueMapper))
- Type Parameters:
VR - the value type of the result stream
- Parameters:
mapper - a ValueMapper that computes the new output values
- Returns:
a KStream that contains more or fewer records with unmodified keys and new values (possibly of different type)
- See Also:
selectKey(KeyValueMapper), map(KeyValueMapper), flatMap(KeyValueMapper), mapValues(ValueMapper), mapValues(ValueMapperWithKey), transform(TransformerSupplier, String...), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...)
-
flatMapValues
<VR> KStream<K,VR> flatMapValues(ValueMapperWithKey<? super K,? super V,? extends java.lang.Iterable<? extends VR>> mapper)
Create a new KStream by transforming the value of each record in this stream into zero or more values with the same key in the new stream. Transform the value of each input record into zero or more records with the same (unmodified) key in the output stream (value type can be altered arbitrarily). The provided ValueMapperWithKey is applied to each input record and computes zero or more output values. Thus, an input record <K,V> can be transformed into output records <K:V'>, <K:V''>, .... This is a stateless record-by-record operation (cf. transformValues(ValueTransformerSupplier, String...) for stateful value transformation).

The example below splits input records <Integer:String>, with key=1, containing sentences as values into their words.

    KStream<Integer, String> inputStream = builder.stream("topic");
    KStream<Integer, String> outputStream = inputStream.flatMapValues(new ValueMapperWithKey<Integer, String, Iterable<String>>() {
        public Iterable<String> apply(Integer readOnlyKey, String value) {
            if (readOnlyKey == 1) {
                return Arrays.asList(value.split(" "));
            } else {
                return Arrays.asList(value);
            }
        }
    });

The provided ValueMapperWithKey must return an Iterable (e.g., any Collection type) and the return value must not be null.

Note that the key is read-only and should not be modified, as this can lead to corrupt partitioning. So, splitting a record into multiple records with the same key preserves data co-location with respect to the key. Thus, no internal data redistribution is required if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. flatMap(KeyValueMapper))
- Type Parameters:
VR - the value type of the result stream
- Parameters:
mapper - a ValueMapperWithKey that computes the new output values
- Returns:
a KStream that contains more or fewer records with unmodified keys and new values (possibly of different type)
- See Also:
selectKey(KeyValueMapper), map(KeyValueMapper), flatMap(KeyValueMapper), mapValues(ValueMapper), mapValues(ValueMapperWithKey), transform(TransformerSupplier, String...), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...)
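
The key-preserving behavior of flatMapValues with a read-only key can be modeled in plain Java as below. This is a sketch under stated assumptions and does not use the Kafka Streams API: records are modeled as java.util.Map.Entry values, and FlatMapValuesSketch is a hypothetical class name.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of flatMapValues with a read-only key; NOT the Kafka Streams API.
final class FlatMapValuesSketch {

    // Emits one output record per value returned by the mapper, always reusing
    // the input record's key. An empty Iterable yields no output record.
    static <K, V, VR> List<Map.Entry<K, VR>> flatMapValues(
            List<Map.Entry<K, V>> records, BiFunction<K, V, Iterable<VR>> mapper) {
        List<Map.Entry<K, VR>> out = new ArrayList<>();
        for (Map.Entry<K, V> r : records) {
            for (VR newValue : mapper.apply(r.getKey(), r.getValue())) {
                out.add(new AbstractMap.SimpleImmutableEntry<>(r.getKey(), newValue));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> input = new ArrayList<>();
        input.add(new AbstractMap.SimpleImmutableEntry<>(1, "a b c"));
        input.add(new AbstractMap.SimpleImmutableEntry<>(2, "d e"));

        // Same rule as the example above: only values with key 1 are split.
        BiFunction<Integer, String, Iterable<String>> splitOnlyKeyOne = (readOnlyKey, value) -> {
            if (readOnlyKey == 1) {
                return Arrays.asList(value.split(" "));
            } else {
                return Collections.singletonList(value);
            }
        };

        System.out.println(flatMapValues(input, splitOnlyKeyOne));
    }
}
```

The mapper only reads the key and never returns one, so in this model the output key cannot differ from the input key. That is the property that keeps records co-located.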
-
print
@Deprecated void print()
Deprecated. Use print(Printed).

Print the records of this stream to System.out. This function will use the generated name of the parent processor node to label the key/value pairs printed to the console.

The default serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.

Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
-
print
@Deprecated void print(java.lang.String label)
Deprecated.

Print the records of this stream to System.out. This function will use the given name to label the key/value pairs printed to the console.

The default serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.

Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
label - the name used to label the key/value pairs printed to the console
-
print
@Deprecated void print(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated.

Print the records of this stream to System.out. This function will use the generated name of the parent processor node to label the key/value pairs printed to the console.

The provided serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.

Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
keySerde - key serde used to deserialize key if type is byte[]
valSerde - value serde used to deserialize value if type is byte[]
-
print
@Deprecated void print(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String label)
Print the records of this stream to System.out.

The provided serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.

Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
keySerde - key serde used to deserialize key if type is byte[]
valSerde - value serde used to deserialize value if type is byte[]
label - the name used to label the key/value pairs printed to the console
-
print
@Deprecated void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper)
Deprecated.

Print the customized output to System.out.

The default serde will be used to deserialize the key or value if its type is byte[]. The user-provided KeyValueMapper, which customizes the output, is used to print to System.out.

The example below shows how to customize the output data.

    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };

The KeyValueMapper's mapped value type must be String.
- Parameters:
mapper - a KeyValueMapper that computes output type String.
-
print
@Deprecated void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper, java.lang.String label)
Print the customized output to System.out.

The default serde will be used to deserialize the key or value if its type is byte[]. The user-provided KeyValueMapper, which customizes the output, is used to print to System.out.

The example below shows how to customize the output data.

    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };

The KeyValueMapper's mapped value type must be String.
- Parameters:
mapper - a KeyValueMapper that computes output type String.
label - the name used to label the printed output.
-
print
@Deprecated void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated. Print the customized output with System.out.
The user-provided KeyValueMapper customizes the output that is printed with System.out. The provided serde will be used to deserialize the key or value if its type is byte[].
The example below shows how to customize the output data.
    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };
The provided KeyValueMapper's mapped value type must be String.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
mapper - a KeyValueMapper that computes output type String.
keySerde - a Serde used to deserialize the key if its type is byte[].
valSerde - a Serde used to deserialize the value if its type is byte[].
-
print
@Deprecated void print(KeyValueMapper<? super K,? super V,java.lang.String> mapper, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String label)
Print the customized output with System.out.
The user-provided KeyValueMapper customizes the output that is printed with System.out. The provided serde will be used to deserialize the key or value if its type is byte[].
The example below shows how to customize the output data.
    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };
The provided KeyValueMapper's mapped value type must be String.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
mapper - a KeyValueMapper that computes output type String.
keySerde - a Serde used to deserialize the key if its type is byte[].
valSerde - a Serde used to deserialize the value if its type is byte[].
label - the name used to label the printed output.
-
print
void print(Printed<K,V> printed)
Print the records of this KStream using the options provided by Printed.
- Parameters:
printed - options for printing
-
merge
KStream<K,V> merge(KStream<K,V> stream)
Merge this stream and the given stream into one larger stream.
There is no ordering guarantee between records from this KStream and records from the provided KStream in the merged stream. Relative order is preserved within each input stream, though (i.e., records within one input stream are processed in order).
- Parameters:
stream - a stream which is to be merged into this stream
- Returns:
a merged stream containing all records from this and the provided KStream
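The ordering guarantee of merge can be modelled in plain Java. The sketch below uses no Kafka classes: `MergeModel`, `merge`, and `isSubsequence` are hypothetical names, and the random interleaving merely stands in for whatever order records happen to arrive in. It shows that any interleaving is allowed as long as each input's own order survives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Plain-Java model of merge ordering; it does not use any Kafka classes. */
final class MergeModel {

    /** Interleaves two inputs in an arbitrary order while keeping each input's internal order. */
    static <T> List<T> merge(List<T> left, List<T> right, Random rnd) {
        List<T> out = new ArrayList<>(left.size() + right.size());
        int i = 0;
        int j = 0;
        while (i < left.size() || j < right.size()) {
            // Take from the left input if the right is exhausted, or by coin flip if both have records.
            boolean takeLeft = j >= right.size() || (i < left.size() && rnd.nextBoolean());
            out.add(takeLeft ? left.get(i++) : right.get(j++));
        }
        return out;
    }

    /** Returns true if the records of input appear in merged in the same relative order. */
    static <T> boolean isSubsequence(List<T> input, List<T> merged) {
        int i = 0;
        for (T record : merged) {
            if (i < input.size() && input.get(i).equals(record)) {
                i++;
            }
        }
        return i == input.size();
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("a1", "a2", "a3");
        List<String> right = Arrays.asList("b1", "b2");
        List<String> merged = merge(left, right, new Random());
        System.out.println(merged + " keeps left order: " + isSubsequence(left, merged));
    }
}
```

The cross-stream order differs from run to run, which is exactly what the missing ordering guarantee permits. Only the per-input check is stable.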
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath)
Deprecated. Write the records of this stream to a file at the given path. This function will use the generated name of the parent processor node to label the key/value pairs printed to the file.
The default serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
filePath - name of the file to write to
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, java.lang.String label)
Deprecated. Write the records of this stream to a file at the given path. This function will use the given name to label the key/value pairs printed to the file.
The default serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
filePath - name of the file to write to
label - the name used to label the key/value pairs written to the file
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated. Write the records of this stream to a file at the given path. This function will use the generated name of the parent processor node to label the key/value pairs printed to the file.
The provided serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
filePath - name of the file to write to
keySerde - key serde used to deserialize the key if its type is byte[]
valSerde - value serde used to deserialize the value if its type is byte[]
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, java.lang.String label, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Write the records of this stream to a file at the given path. This function will use the given name to label the key/value pairs printed to the file.
The provided serde will be used to deserialize the key or value in case the type is byte[] before calling toString() on the deserialized object.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
filePath - name of the file to write to
label - the name used to label the key/value pairs written to the file
keySerde - key serde used to deserialize the key if its type is byte[]
valSerde - value serde used to deserialize the value if its type is byte[]
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
Deprecated. Write the customized output to a given file path.
The user-provided KeyValueMapper customizes the output that is written to the file. This function will use the default name of the stream to label records.
The default key and value serde will be used to deserialize byte[] records before calling toString().
The example below shows how to customize the output data.
    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };
The KeyValueMapper's mapped value type must be String.
- Parameters:
filePath - path of the file to write to.
mapper - a KeyValueMapper that computes output type String.
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, java.lang.String label, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
Write the customized output to a given file path.
The user-provided KeyValueMapper customizes the output that is written to the file. This function will use the given name to label records.
The default key and value serde will be used to deserialize byte[] records before calling toString().
The example below shows how to customize the output data.
    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };
The KeyValueMapper's mapped value type must be String.
- Parameters:
filePath - path of the file to write to.
label - the name used to label records written to file.
mapper - a KeyValueMapper that computes output type String.
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
Deprecated. Write the customized output to a given file path.
The user-provided KeyValueMapper customizes the output that is written to the file. This function will use the default name of the stream to label records.
The given key and value serde will be used to deserialize byte[] records before calling toString().
The example below shows how to customize the output data.
    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };
The KeyValueMapper's mapped value type must be String.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
filePath - path of the file to write to.
keySerde - key serde used to deserialize the key if its type is byte[].
valSerde - value serde used to deserialize the value if its type is byte[].
mapper - a KeyValueMapper that computes output type String.
-
writeAsText
@Deprecated void writeAsText(java.lang.String filePath, java.lang.String label, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, KeyValueMapper<? super K,? super V,java.lang.String> mapper)
Write the customized output to a given file path.
The user-provided KeyValueMapper customizes the output that is written to the file. This function will use the given name to label records.
The given key and value serde will be used to deserialize byte[] records before calling toString().
The example below shows how to customize the output data.
    final KeyValueMapper<Integer, String, String> mapper = new KeyValueMapper<Integer, String, String>() {
        public String apply(Integer key, String value) {
            return String.format("(%d, %s)", key, value);
        }
    };
The KeyValueMapper's mapped value type must be String.
Implementors will need to override toString() for keys and values that are not of type String, Integer, etc. to get meaningful information.
- Parameters:
filePath - path of the file to write to.
label - the name used to label records written to file.
keySerde - key serde used to deserialize the key if its type is byte[].
valSerde - value serde used to deserialize the value if its type is byte[].
mapper - a KeyValueMapper that computes output type String.
-
foreach
void foreach(ForeachAction<? super K,? super V> action)
Perform an action on each record of KStream. This is a stateless record-by-record operation (cf. process(ProcessorSupplier, String...)). Note that this is a terminal operation that returns void.
- Parameters:
action - an action to perform on each record
- See Also:
process(ProcessorSupplier, String...)
-
peek
KStream<K,V> peek(ForeachAction<? super K,? super V> action)
Perform an action on each record of KStream. This is a stateless record-by-record operation (cf. process(ProcessorSupplier, String...)).
Peek is a non-terminal operation that triggers a side effect (such as logging or statistics collection) and returns an unchanged stream.
Note that since this operation is stateless, it may execute multiple times for a single record in failure cases.
- Parameters:
action - an action to perform on each record
- See Also:
process(ProcessorSupplier, String...)
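The difference between a terminal foreach and a non-terminal peek has a close analogue in `java.util.stream`, which the sketch below uses purely as an analogy. It does not touch Kafka Streams, and the names `PeekModel`, `peekThenMap`, and `forEachRecord` are hypothetical. The peek step observes each element and passes it on unchanged, so further operators can follow; the forEach step ends the pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Analogy built on java.util.stream; it does not use any Kafka classes. */
final class PeekModel {

    /** Non-terminal: records a side effect per element, then continues with a downstream operator. */
    static List<String> peekThenMap(List<String> records, List<String> log) {
        return records.stream()
                .peek(r -> log.add("seen " + r))   // side effect only, element passes through unchanged
                .map(String::toUpperCase)           // downstream processing is still possible
                .collect(Collectors.toList());
    }

    /** Terminal: performs the action and returns nothing to chain on. */
    static void forEachRecord(List<String> records, List<String> log) {
        records.stream().forEach(r -> log.add("handled " + r));
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println(peekThenMap(Arrays.asList("a", "b"), log));
        forEachRecord(Arrays.asList("c"), log);
        System.out.println(log);
    }
}
```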
-
branch
KStream<K,V>[] branch(Predicate<? super K,? super V>... predicates)
Creates an array of KStream from this stream by branching the records in the original stream based on the supplied predicates. Each record is evaluated against the supplied predicates, and predicates are evaluated in order. Each stream in the result array corresponds position-wise (index) to the predicate in the supplied predicates. The branching happens on first-match: A record in the original stream is assigned to the corresponding result stream for the first predicate that evaluates to true, and is assigned to this stream only. A record will be dropped if none of the predicates evaluate to true. This is a stateless record-by-record operation.
- Parameters:
predicates - the ordered list of Predicate instances
- Returns:
multiple distinct substreams of this KStream
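The first-match rule described above can be modelled in plain Java. This is a sketch under stated assumptions, not the Kafka Streams implementation: `BranchModel` and its methods are hypothetical names, `java.util.function.BiPredicate` stands in for `Predicate`, and lists stand in for streams.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Plain-Java model of first-match branching; it does not use any Kafka classes. */
final class BranchModel {

    /** Routes each record to the output list whose index equals that of its first matching predicate. */
    static <K, V> List<List<Map.Entry<K, V>>> branch(
            List<Map.Entry<K, V>> records, List<BiPredicate<K, V>> predicates) {
        List<List<Map.Entry<K, V>>> out = new ArrayList<>();
        for (int i = 0; i < predicates.size(); i++) {
            out.add(new ArrayList<>());
        }
        for (Map.Entry<K, V> record : records) {
            for (int i = 0; i < predicates.size(); i++) {
                if (predicates.get(i).test(record.getKey(), record.getValue())) {
                    out.get(i).add(record);
                    break; // first match only: later predicates are not consulted
                }
            }
            // A record that matched no predicate is dropped.
        }
        return out;
    }

    /** Fixed example: branch 0 takes values above 10, branch 1 takes the remaining positive values. */
    static List<List<Map.Entry<String, Integer>>> demo() {
        List<Map.Entry<String, Integer>> records = new ArrayList<>();
        records.add(new SimpleEntry<String, Integer>("a", 15));
        records.add(new SimpleEntry<String, Integer>("b", 5));
        records.add(new SimpleEntry<String, Integer>("c", -3));
        records.add(new SimpleEntry<String, Integer>("d", 20));

        List<BiPredicate<String, Integer>> predicates = new ArrayList<>();
        predicates.add((key, value) -> value > 10);
        predicates.add((key, value) -> value > 0);
        return branch(records, predicates);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[a=15, d=20], [b=5]]
    }
}
```

Record `d` satisfies both predicates but lands only in branch 0, and record `c` satisfies neither and is dropped. A catch-all predicate placed last would retain such records.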
-
through
KStream<K,V> through(java.lang.String topic)
Materialize this stream to a topic and create a new KStream from the topic using default serializers and deserializers and the producer's DefaultPartitioner. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
This is equivalent to calling #to(someTopicName) and StreamsBuilder#stream(someTopicName).
- Parameters:
topic - the topic name
- Returns:
a KStream that contains the exact same (and potentially repartitioned) records as this KStream
-
through
@Deprecated KStream<K,V> through(StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
Deprecated. Materialize this stream to a topic and create a new KStream from the topic using default serializers and deserializers and a customizable StreamPartitioner to determine the distribution of records to partitions. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
This is equivalent to calling #to(StreamPartitioner, someTopicName) and StreamsBuilder#stream(someTopicName).
- Parameters:
partitioner - the function used to determine how records are distributed among partitions of the topic; if not specified, the producer's DefaultPartitioner will be used
topic - the topic name
- Returns:
a KStream that contains the exact same (and potentially repartitioned) records as this KStream
-
through
@Deprecated KStream<K,V> through(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String topic)
Deprecated. Materialize this stream to a topic and create a new KStream from the topic. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
If keySerde provides a WindowedSerializer for the key, WindowedStreamPartitioner is used; otherwise the producer's DefaultPartitioner is used.
This is equivalent to calling #to(keySerde, valSerde, someTopicName) and KStreamBuilder#stream(keySerde, valSerde, someTopicName).
- Parameters:
keySerde - key serde used to send key-value pairs; if not specified, the default key serde defined in the configuration will be used
valSerde - value serde used to send key-value pairs; if not specified, the default value serde defined in the configuration will be used
topic - the topic name
- Returns:
a KStream that contains the exact same (and potentially repartitioned) records as this KStream
-
through
@Deprecated KStream<K,V> through(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
Deprecated. Materialize this stream to a topic and create a new KStream from the topic using a customizable StreamPartitioner to determine the distribution of records to partitions. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
This is equivalent to calling #to(keySerde, valSerde, StreamPartitioner, someTopicName) and KStreamBuilder#stream(keySerde, valSerde, someTopicName).
- Parameters:
keySerde - key serde used to send key-value pairs; if not specified, the default key serde defined in the configuration will be used
valSerde - value serde used to send key-value pairs; if not specified, the default value serde defined in the configuration will be used
partitioner - the function used to determine how records are distributed among partitions of the topic; if not specified and keySerde provides a WindowedSerializer for the key, WindowedStreamPartitioner will be used, otherwise DefaultPartitioner will be used
topic - the topic name
- Returns:
a KStream that contains the exact same (and potentially repartitioned) records as this KStream
-
through
KStream<K,V> through(java.lang.String topic, Produced<K,V> produced)
Materialize this stream to a topic and create a new KStream from the topic using the Produced instance for configuration of the key serde, value serde, and StreamPartitioner. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
This is equivalent to calling to(someTopic, Produced.with(keySerde, valueSerde)) and StreamsBuilder#stream(someTopicName, Consumed.with(keySerde, valueSerde)).
- Parameters:
topic - the topic name
produced - the options to use when producing to the topic
- Returns:
a KStream that contains the exact same (and potentially repartitioned) records as this KStream
-
to
void to(java.lang.String topic)
Materialize this stream to a topic using default serializers specified in the config and the producer's DefaultPartitioner. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
- Parameters:
topic - the topic name
-
to
@Deprecated void to(StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
Deprecated. Materialize this stream to a topic using default serializers specified in the config and a customizable StreamPartitioner to determine the distribution of records to partitions. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
- Parameters:
partitioner - the function used to determine how records are distributed among partitions of the topic; if not specified, the producer's DefaultPartitioner will be used
topic - the topic name
-
to
@Deprecated void to(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, java.lang.String topic)
Deprecated. Materialize this stream to a topic. If keySerde provides a WindowedSerializer for the key, WindowedStreamPartitioner is used; otherwise the producer's DefaultPartitioner is used. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
- Parameters:
keySerde - key serde used to send key-value pairs; if not specified, the default serde defined in the configs will be used
valSerde - value serde used to send key-value pairs; if not specified, the default serde defined in the configs will be used
topic - the topic name
-
to
@Deprecated void to(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde, StreamPartitioner<? super K,? super V> partitioner, java.lang.String topic)
Deprecated. Materialize this stream to a topic using a customizable StreamPartitioner to determine the distribution of records to partitions. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
- Parameters:
keySerde - key serde used to send key-value pairs; if not specified, the default serde defined in the configs will be used
valSerde - value serde used to send key-value pairs; if not specified, the default serde defined in the configs will be used
partitioner - the function used to determine how records are distributed among partitions of the topic; if not specified and keySerde provides a WindowedSerializer for the key, WindowedStreamPartitioner will be used, otherwise DefaultPartitioner will be used
topic - the topic name
-
to
void to(java.lang.String topic, Produced<K,V> produced)
Materialize this stream to a topic using the provided Produced instance. The specified topic should be manually created before it is used (i.e., before the Kafka Streams application is started).
- Parameters:
topic - the topic name
produced - the options to use when producing to the topic
-
transform
<K1,V1> KStream<K1,V1> transform(TransformerSupplier<? super K,? super V,KeyValue<K1,V1>> transformerSupplier, java.lang.String... stateStoreNames)
Transform each record of the input stream into zero or more records in the output stream (both key and value type can be altered arbitrarily). A Transformer (provided by the given TransformerSupplier) is applied to each input record and computes zero or more output records. Thus, an input record <K,V> can be transformed into output records <K':V'>, <K'':V''>, .... This is a stateful record-by-record operation (cf. flatMap(KeyValueMapper)). Furthermore, via Punctuator.punctuate(long) the processing progress can be observed and additional periodic actions can be performed.
In order to assign a state, the state must be created and registered beforehand:
    // create store
    StoreBuilder<KeyValueStore<String,String>> keyValueStoreBuilder =
        Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("myTransformState"),
                                    Serdes.String(),
                                    Serdes.String());
    // register store
    builder.addStateStore(keyValueStoreBuilder);

    KStream outputStream = inputStream.transform(new TransformerSupplier() { ... }, "myTransformState");
Within the Transformer, the state is obtained via the ProcessorContext. To trigger periodic actions via punctuate(), a schedule must be registered. The Transformer must return a KeyValue type in transform() and punctuate().
    new TransformerSupplier() {
        Transformer get() {
            return new Transformer() {
                private ProcessorContext context;
                private StateStore state;

                void init(ProcessorContext context) {
                    this.context = context;
                    this.state = context.getStateStore("myTransformState");
                    // punctuate each 1000ms; can access this.state
                    // can emit as many new KeyValue pairs as required via this.context#forward()
                    context.schedule(1000, PunctuationType.WALL_CLOCK_TIME, new Punctuator(..));
                }

                KeyValue transform(K key, V value) {
                    // can access this.state
                    // can emit as many new KeyValue pairs as required via this.context#forward()
                    return new KeyValue(key, value); // can emit a single value via return -- can also be null
                }

                void close() {
                    // can access this.state
                    // can emit as many new KeyValue pairs as required via this.context#forward()
                }
            }
        }
    }
Transforming records might result in an internal data redistribution if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. transformValues(ValueTransformerSupplier, String...))
- Type Parameters:
K1 - the key type of the new stream
V1 - the value type of the new stream
- Parameters:
transformerSupplier - an instance of TransformerSupplier that generates a Transformer
stateStoreNames - the names of the state stores used by the processor
- Returns:
a KStream that contains more or fewer records with new key and value (possibly of different type)
- See Also:
flatMap(KeyValueMapper), transformValues(ValueTransformerSupplier, String...), transformValues(ValueTransformerWithKeySupplier, String...), process(ProcessorSupplier, String...)
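What "stateful record-by-record" means in practice can be sketched without any Kafka classes. In the plain-Java model below, a `HashMap` stands in for the registered state store, and `transform` returns either one output record or `null` for no output. The class and method names are hypothetical, and the running-total logic is only an example of something a `Transformer` might compute.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a stateful record-by-record transformation; it does not use any Kafka classes. */
final class StatefulTransformModel {

    // Stands in for a registered state store such as "myTransformState".
    private final Map<String, Long> state = new HashMap<>();

    /** Emits the running total for the key as a new record, or null (no output) for a zero value. */
    Map.Entry<String, Long> transform(String key, Integer value) {
        if (value == 0) {
            return null; // models a call that emits nothing
        }
        long total = state.merge(key, value.longValue(), Long::sum);
        return new SimpleEntry<String, Long>(key, total);
    }

    /** Feeds three records through one transformer instance and collects the non-null outputs. */
    static List<Map.Entry<String, Long>> demo() {
        StatefulTransformModel transformer = new StatefulTransformModel();
        List<Map.Entry<String, Long>> out = new ArrayList<>();
        String[] keys = {"a", "b", "a"};
        int[] values = {1, 0, 2};
        for (int i = 0; i < keys.length; i++) {
            Map.Entry<String, Long> result = transformer.transform(keys[i], values[i]);
            if (result != null) {
                out.add(result);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a=1, a=3]
    }
}
```

The second output for key `a` depends on the first, which a stateless operator such as flatMap could not express.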
-
transformValues
<VR> KStream<K,VR> transformValues(ValueTransformerSupplier<? super V,? extends VR> valueTransformerSupplier, java.lang.String... stateStoreNames)
Transform the value of each input record into a new value (with possible new type) of the output record. A ValueTransformer (provided by the given ValueTransformerSupplier) is applied to each input record value and computes a new value for it. Thus, an input record <K,V> can be transformed into an output record <K:V'>. This is a stateful record-by-record operation (cf. mapValues(ValueMapper)). Furthermore, via Punctuator.punctuate(long) the processing progress can be observed and additional periodic actions can be performed.
In order to assign a state, the state must be created and registered beforehand:
    // create store
    StoreBuilder<KeyValueStore<String,String>> keyValueStoreBuilder =
        Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("myValueTransformState"),
                                    Serdes.String(),
                                    Serdes.String());
    // register store
    builder.addStateStore(keyValueStoreBuilder);

    KStream outputStream = inputStream.transformValues(new ValueTransformerSupplier() { ... }, "myValueTransformState");
Within the ValueTransformer, the state is obtained via the ProcessorContext. To trigger periodic actions via punctuate(), a schedule must be registered. In contrast to transform(), no additional KeyValue pairs should be emitted via ProcessorContext.forward().
    new ValueTransformerSupplier() {
        ValueTransformer get() {
            return new ValueTransformer() {
                private StateStore state;

                void init(ProcessorContext context) {
                    this.state = context.getStateStore("myValueTransformState");
                    // punctuate each 1000ms, can access this.state
                    context.schedule(1000, PunctuationType.WALL_CLOCK_TIME, new Punctuator(..));
                }

                NewValueType transform(V value) {
                    // can access this.state
                    return new NewValueType(); // or null
                }

                void close() {
                    // can access this.state
                }
            }
        }
    }
Setting a new value preserves data co-location with respect to the key. Thus, no internal data redistribution is required if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. transform(TransformerSupplier, String...))
- Type Parameters:
VR - the value type of the result stream
- Parameters:
valueTransformerSupplier - an instance of ValueTransformerSupplier that generates a ValueTransformer
stateStoreNames - the names of the state stores used by the processor
- Returns:
a KStream that contains records with unmodified key and new values (possibly of different type)
- See Also:
mapValues(ValueMapper), mapValues(ValueMapperWithKey), transform(TransformerSupplier, String...)
-
transformValues
<VR> KStream<K,VR> transformValues(ValueTransformerWithKeySupplier<? super K,? super V,? extends VR> valueTransformerSupplier, java.lang.String... stateStoreNames)
Transform the value of each input record into a new value (with possible new type) of the output record. A ValueTransformerWithKey (provided by the given ValueTransformerWithKeySupplier) is applied to each input record value and computes a new value for it. Thus, an input record <K,V> can be transformed into an output record <K:V'>. This is a stateful record-by-record operation (cf. mapValues(ValueMapperWithKey)). Furthermore, via Punctuator.punctuate(long) the processing progress can be observed and additional periodic actions can be performed.
In order to assign a state, the state must be created and registered beforehand:
    // create store
    StoreBuilder<KeyValueStore<String,String>> keyValueStoreBuilder =
        Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("myValueTransformState"),
                                    Serdes.String(),
                                    Serdes.String());
    // register store
    builder.addStateStore(keyValueStoreBuilder);

    KStream outputStream = inputStream.transformValues(new ValueTransformerWithKeySupplier() { ... }, "myValueTransformState");
Within the ValueTransformerWithKey, the state is obtained via the ProcessorContext. To trigger periodic actions via punctuate(), a schedule must be registered. In contrast to transform(), no additional KeyValue pairs should be emitted via ProcessorContext.forward().
    new ValueTransformerWithKeySupplier() {
        ValueTransformerWithKey get() {
            return new ValueTransformerWithKey() {
                private StateStore state;

                void init(ProcessorContext context) {
                    this.state = context.getStateStore("myValueTransformState");
                    // punctuate each 1000ms, can access this.state
                    context.schedule(1000, PunctuationType.WALL_CLOCK_TIME, new Punctuator(..));
                }

                NewValueType transform(K readOnlyKey, V value) {
                    // can access this.state and use read-only key
                    return new NewValueType(readOnlyKey); // or null
                }

                void close() {
                    // can access this.state
                }
            }
        }
    }
Note that the key is read-only and should not be modified, as this can lead to corrupt partitioning. Because the key is unchanged, setting a new value preserves data co-location with respect to the key. Thus, no internal data redistribution is required if a key based operator (like an aggregation or join) is applied to the result KStream. (cf. transform(TransformerSupplier, String...))
- Type Parameters:
VR - the value type of the result stream
- Parameters:
valueTransformerSupplier - an instance of ValueTransformerWithKeySupplier that generates a ValueTransformerWithKey
stateStoreNames - the names of the state stores used by the processor
- Returns:
a KStream that contains records with unmodified key and new values (possibly of different type)
- See Also:
mapValues(ValueMapper), mapValues(ValueMapperWithKey), transform(TransformerSupplier, String...)
-
process
void process(ProcessorSupplier<? super K,? super V> processorSupplier, java.lang.String... stateStoreNames)
Process all records in this stream, one record at a time, by applying a Processor (provided by the given ProcessorSupplier). This is a stateful record-by-record operation (cf. foreach(ForeachAction)). Furthermore, via Punctuator.punctuate(long) the processing progress can be observed and additional periodic actions can be performed. Note that this is a terminal operation that returns void.
In order to assign a state, the state must be created and registered beforehand:
    // create store
    StoreBuilder<KeyValueStore<String,String>> keyValueStoreBuilder =
        Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("myProcessorState"),
                                    Serdes.String(),
                                    Serdes.String());
    // register store
    builder.addStateStore(keyValueStoreBuilder);

    inputStream.process(new ProcessorSupplier() { ... }, "myProcessorState");
Within the Processor, the state is obtained via the ProcessorContext. To trigger periodic actions via punctuate(), a schedule must be registered.
    new ProcessorSupplier() {
        Processor get() {
            return new Processor() {
                private StateStore state;

                void init(ProcessorContext context) {
                    this.state = context.getStateStore("myProcessorState");
                    // punctuate each 1000ms, can access this.state
                    context.schedule(1000, PunctuationType.WALL_CLOCK_TIME, new Punctuator(..));
                }

                void process(K key, V value) {
                    // can access this.state
                }

                void close() {
                    // can access this.state
                }
            }
        }
    }
- Parameters:
processorSupplier - an instance of ProcessorSupplier that generates a Processor
stateStoreNames - the names of the state stores used by the processor
- See Also:
foreach(ForeachAction), transform(TransformerSupplier, String...)
-
groupByKey
KGroupedStream<K,V> groupByKey()
Group the records by their current key into a KGroupedStream while preserving the original values, using default serializers and deserializers. Grouping a stream on the record key is required before an aggregation operator can be applied to the data (cf. KGroupedStream). If a record key is null, the record will not be included in the resulting KGroupedStream.
If a key changing operator was used before this operation (e.g., selectKey(KeyValueMapper), map(KeyValueMapper), flatMap(KeyValueMapper), or transform(TransformerSupplier, String...)), and no data redistribution happened afterwards (e.g., via through(String)), an internal repartitioning topic will be created in Kafka. This topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
For this case, all data of this stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the resulting KGroupedStream is partitioned correctly on its key. If the last key changing operator changed the key type, it is recommended to use groupByKey(Serialized) instead.
- Returns:
a KGroupedStream that contains the grouped records of the original KStream
- See Also:
groupBy(KeyValueMapper)
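Two details from the description above, the exclusion of null-key records and the naming pattern of the repartition topic, can be illustrated with a plain-Java sketch. It uses no Kafka classes. `GroupByKeyModel` and its methods are hypothetical names, and collecting values into lists is only a stand-in for the grouping that a later aggregation would consume.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of grouping by key; it does not use any Kafka classes. */
final class GroupByKeyModel {

    /** Groups values by key in encounter order and skips records whose key is null. */
    static <K, V> Map<K, List<V>> groupByKey(List<Map.Entry<K, V>> records) {
        Map<K, List<V>> groups = new LinkedHashMap<>();
        for (Map.Entry<K, V> record : records) {
            if (record.getKey() == null) {
                continue; // null-key records are not included in the result
            }
            groups.computeIfAbsent(record.getKey(), k -> new ArrayList<>()).add(record.getValue());
        }
        return groups;
    }

    /** Builds a name following the documented "${applicationId}-XXX-repartition" pattern. */
    static String repartitionTopicName(String applicationId, String generatedName) {
        return applicationId + "-" + generatedName + "-repartition";
    }

    /** Fixed example containing one record with a null key. */
    static Map<String, List<Integer>> demo() {
        List<Map.Entry<String, Integer>> records = new ArrayList<>();
        records.add(new SimpleEntry<String, Integer>("a", 1));
        records.add(new SimpleEntry<String, Integer>(null, 2));
        records.add(new SimpleEntry<String, Integer>("b", 3));
        records.add(new SimpleEntry<String, Integer>("a", 4));
        return groupByKey(records);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {a=[1, 4], b=[3]}
        System.out.println(repartitionTopicName("my-app", "XXX")); // prints my-app-XXX-repartition
    }
}
```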
-
groupByKey
KGroupedStream<K,V> groupByKey(Serialized<K,V> serialized)
Group the records by their current key into aKGroupedStreamwhile preserving the original values and using the serializers as defined bySerialized. Grouping a stream on the record key is required before an aggregation operator can be applied to the data (cf.KGroupedStream). If a record key isnullthe record will not be included in the resultingKGroupedStream.If a key changing operator was used before this operation (e.g.,
selectKey(KeyValueMapper),map(KeyValueMapper),flatMap(KeyValueMapper), ortransform(TransformerSupplier, String...)), and no data redistribution happened afterwards (e.g., viathrough(String)) an internal repartitioning topic will be created in Kafka. This topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().For this case, all data of this stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the resulting
KGroupedStreamis partitioned correctly on its key.- Returns:
- a
KGroupedStreamthat contains the grouped records of the originalKStream - See Also:
groupBy(KeyValueMapper)
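The grouping rule described above (values are preserved, records with a null key are left out) can be modelled in a few lines of plain Java. This is only a sketch of the semantics, not Kafka Streams code: the `GroupByKeySketch` class and its `Rec` type are hypothetical names introduced for illustration, and no serdes or repartitioning are involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the grouping semantics; illustration only, not Kafka Streams code.
class GroupByKeySketch {

    // Hypothetical record type: a key-value pair whose key may be null.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Groups values by their current key, preserving values and skipping null keys.
    static Map<String, List<String>> groupByKey(List<Rec> records) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Rec r : records) {
            if (r.key == null) {
                continue; // a record with a null key is not included in the result
            }
            grouped.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r.value);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Rec> input = new ArrayList<>();
        input.add(new Rec("K1", "A"));
        input.add(new Rec(null, "dropped"));
        input.add(new Rec("K2", "B"));
        input.add(new Rec("K1", "C"));
        System.out.println(groupByKey(input)); // prints {K1=[A, C], K2=[B]}
    }
}
```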
-
groupByKey
@Deprecated KGroupedStream<K,V> groupByKey(org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated.Group the records by their current key into aKGroupedStreamwhile preserving the original values. Grouping a stream on the record key is required before an aggregation operator can be applied to the data (cf.KGroupedStream). If a record key isnullthe record will not be included in the resultingKGroupedStream.If a key changing operator was used before this operation (e.g.,
selectKey(KeyValueMapper),map(KeyValueMapper),flatMap(KeyValueMapper), ortransform(TransformerSupplier, String...)), and no data redistribution happened afterwards (e.g., viathrough(String)) an internal repartitioning topic will be created in Kafka. This topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().For this case, all data of this stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the resulting
KGroupedStreamis partitioned correctly on its key.- Parameters:
keySerde- key serdes for materializing this stream, if not specified the default serdes defined in the configs will be usedvalSerde- value serdes for materializing this stream, if not specified the default serdes defined in the configs will be used- Returns:
- a
KGroupedStreamthat contains the grouped records of the originalKStream
-
groupBy
<KR> KGroupedStream<KR,V> groupBy(KeyValueMapper<? super K,? super V,KR> selector)
Group the records of this KStream on a new key that is selected using the provided KeyValueMapper and default serializers and deserializers. Grouping a stream on the record key is required before an aggregation operator can be applied to the data (cf. KGroupedStream). The KeyValueMapper selects a new key (which should be of the same type) while preserving the original values. If the new record key is null the record will not be included in the resulting KGroupedStream.
Because a new key is selected, an internal repartitioning topic will be created in Kafka. This topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in
StreamsConfigvia parameterAPPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().All data of this stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the resulting
KGroupedStreamis partitioned on the new key.This operation is equivalent to calling
selectKey(KeyValueMapper)followed bygroupByKey(). If the key type is changed, it is recommended to usegroupBy(KeyValueMapper, Serialized)instead.- Type Parameters:
KR- the key type of the resultKGroupedStream- Parameters:
selector- aKeyValueMapperthat computes a new key for grouping- Returns:
- a
KGroupedStreamthat contains the grouped records of the originalKStream
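The equivalence stated above (select a new key, then group by it) and the documented repartition topic name format can be sketched in plain Java. This is a model under assumptions, not Kafka Streams code: `GroupBySketch`, `repartitionTopic`, and the `internalName` argument (standing in for the internally generated "XXX" part) are hypothetical names used only for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of "selectKey followed by groupByKey"; illustration only.
class GroupBySketch {

    // Builds a name in the documented "${applicationId}-XXX-repartition" format.
    // internalName is a placeholder for the internally generated "XXX" part.
    static String repartitionTopic(String applicationId, String internalName) {
        return applicationId + "-" + internalName + "-repartition";
    }

    // Selects a new key per record, drops records whose new key is null,
    // and groups the unchanged values under the new key.
    static <K, V, KR> Map<KR, List<V>> groupBy(List<Map.Entry<K, V>> records,
                                               BiFunction<K, V, KR> selector) {
        Map<KR, List<V>> grouped = new LinkedHashMap<>();
        for (Map.Entry<K, V> r : records) {
            KR newKey = selector.apply(r.getKey(), r.getValue()); // selectKey step
            if (newKey == null) {
                continue; // a null new key excludes the record
            }
            grouped.computeIfAbsent(newKey, k -> new ArrayList<>()).add(r.getValue()); // groupByKey step
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> input = new ArrayList<>();
        input.add(new AbstractMap.SimpleEntry<>("u1", "apple"));
        input.add(new AbstractMap.SimpleEntry<>("u2", "avocado"));
        input.add(new AbstractMap.SimpleEntry<>("u3", "banana"));

        // Group by the first letter of the value.
        Map<String, List<String>> grouped = groupBy(input, (k, v) -> v.substring(0, 1));
        System.out.println(grouped); // prints {a=[apple, avocado], b=[banana]}
        System.out.println(repartitionTopic("my-app", "XXX")); // prints my-app-XXX-repartition
    }
}
```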
-
groupBy
<KR> KGroupedStream<KR,V> groupBy(KeyValueMapper<? super K,? super V,KR> selector, Serialized<KR,V> serialized)
Group the records of this KStream on a new key that is selected using the provided KeyValueMapper and Serdes as specified by Serialized. Grouping a stream on the record key is required before an aggregation operator can be applied to the data (cf. KGroupedStream). The KeyValueMapper selects a new key (which may be of a different type) while preserving the original values. If the new record key is null the record will not be included in the resulting KGroupedStream.
Because a new key is selected, an internal repartitioning topic will be created in Kafka. This topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in
StreamsConfigvia parameterAPPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().All data of this stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the resulting
KGroupedStreamis partitioned on the new key.This operation is equivalent to calling
selectKey(KeyValueMapper)followed bygroupByKey().- Type Parameters:
KR- the key type of the resultKGroupedStream- Parameters:
selector- aKeyValueMapperthat computes a new key for grouping- Returns:
- a
KGroupedStreamthat contains the grouped records of the originalKStream
-
groupBy
@Deprecated <KR> KGroupedStream<KR,V> groupBy(KeyValueMapper<? super K,? super V,KR> selector, org.apache.kafka.common.serialization.Serde<KR> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated.Group the records of thisKStreamon a new key that is selected using the providedKeyValueMapper. Grouping a stream on the record key is required before an aggregation operator can be applied to the data (cf.KGroupedStream). TheKeyValueMapperselects a new key (with potentially different type) while preserving the original values. If the new record key isnullthe record will not be included in the resultingKGroupedStream.Because a new key is selected, an internal repartitioning topic will be created in Kafka. This topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in
StreamsConfigvia parameterAPPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().All data of this stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the resulting
KGroupedStreamis partitioned on the new key.This is equivalent to calling
selectKey(KeyValueMapper)followed bygroupByKey(Serde, Serde).- Type Parameters:
KR- the key type of the resultKGroupedStream- Parameters:
selector- aKeyValueMapperthat computes a new key for groupingkeySerde- key serdes for materializing this stream, if not specified the default serdes defined in the configs will be usedvalSerde- value serdes for materializing this stream, if not specified the default serdes defined in the configs will be used- Returns:
- a
KGroupedStreamthat contains the grouped records of the originalKStream - See Also:
groupByKey()
-
join
<VO,VR> KStream<K,VR> join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows)
Join records of this stream with anotherKStream's records using windowed inner equi join with default serializers and deserializers. The join is computed on the records' key with join attributethisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the givenJoinWindows, i.e., the window defines an additional join predicate on the record timestamps.For each pair of records meeting both join predicates the provided
ValueJoinerwill be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. If an input record key or value isnullthe record will not be included in the join operation and thus no output record will be added to the resultingKStream.Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        |
<K2:B> | <K2:b> | <K2:ValueJoiner(B,b)>
       | <K3:c> |
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream- theKStreamto be joined with this streamjoiner- aValueJoinerthat computes the join result for a pair of matching recordswindows- the specification of theJoinWindows- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key and within the joining window intervals - See Also:
leftJoin(KStream, ValueJoiner, JoinWindows),outerJoin(KStream, ValueJoiner, JoinWindows)
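The two join predicates described above (equal keys, and timestamps close enough per the window) can be illustrated with a small plain-Java model. This is a sketch of the semantics only, not Kafka Streams code: `InnerJoinSketch`, its `Rec` type, and the single symmetric `windowMs` bound are assumptions made for illustration, and the model compares whole lists rather than processing records one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java model of a windowed inner equi join; illustration only.
class InnerJoinSketch {

    // Hypothetical record type: key, value, and timestamp in milliseconds.
    static final class Rec {
        final String key;
        final String value;
        final long timestampMs;

        Rec(String key, String value, long timestampMs) {
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    // Emits "key:joinedValue" for every pair with equal keys whose timestamps
    // differ by at most windowMs. Records with a null key or value are skipped.
    static List<String> join(List<Rec> thisStream, List<Rec> otherStream,
                             long windowMs, BinaryOperator<String> joiner) {
        List<String> result = new ArrayList<>();
        for (Rec l : thisStream) {
            if (l.key == null || l.value == null) {
                continue;
            }
            for (Rec r : otherStream) {
                if (r.key == null || r.value == null) {
                    continue;
                }
                boolean sameKey = l.key.equals(r.key);
                boolean closeInTime = Math.abs(l.timestampMs - r.timestampMs) <= windowMs;
                if (sameKey && closeInTime) {
                    result.add(l.key + ":" + joiner.apply(l.value, r.value));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Rec> thisStream = new ArrayList<>();
        thisStream.add(new Rec("K1", "A", 0L));
        thisStream.add(new Rec("K2", "B", 10L));
        List<Rec> otherStream = new ArrayList<>();
        otherStream.add(new Rec("K2", "b", 12L));
        otherStream.add(new Rec("K3", "c", 20L));

        System.out.println(join(thisStream, otherStream, 5L,
                (v, vo) -> "ValueJoiner(" + v + "," + vo + ")"));
        // prints [K2:ValueJoiner(B,b)]
    }
}
```

Only K2 appears in the output: K1 and K3 have no partner with the same key, which matches the example table for the inner join.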
-
join
<VO,VR> KStream<K,VR> join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, Joined<K,V,VO> joined)
Join records of this stream with another KStream's records using windowed inner equi join with the serializers and deserializers defined by the given Joined instance. The join is computed on the records' key with join attribute thisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the given JoinWindows, i.e., the window defines an additional join predicate on the record timestamps.
For each pair of records meeting both join predicates the provided
ValueJoinerwill be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. If an input record key or value isnullthe record will not be included in the join operation and thus no output record will be added to the resultingKStream.Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        |
<K2:B> | <K2:b> | <K2:ValueJoiner(B,b)>
       | <K3:c> |
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream- theKStreamto be joined with this streamjoiner- aValueJoinerthat computes the join result for a pair of matching recordswindows- the specification of theJoinWindowsjoined- aJoinedinstance that defines the serdes to be used to serialize/deserialize inputs and outputs of the joined streams- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key and within the joining window intervals - See Also:
leftJoin(KStream, ValueJoiner, JoinWindows, Joined),outerJoin(KStream, ValueJoiner, JoinWindows, Joined)
-
join
@Deprecated <VO,VR> KStream<K,VR> join(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> thisValueSerde, org.apache.kafka.common.serialization.Serde<VO> otherValueSerde)
Deprecated.Join records of this stream with anotherKStream's records using windowed inner equi join. The join is computed on the records' key with join attributethisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the givenJoinWindows, i.e., the window defines an additional join predicate on the record timestamps.For each pair of records meeting both join predicates the provided
ValueJoinerwill be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. If an input record key or value isnullthe record will not be included in the join operation and thus no output record will be added to the resultingKStream.Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        |
<K2:B> | <K2:b> | <K2:ValueJoiner(B,b)>
       | <K3:c> |
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream- theKStreamto be joined with this streamjoiner- aValueJoinerthat computes the join result for a pair of matching recordswindows- the specification of theJoinWindowskeySerde- key serdes for materializing both streams, if not specified the default serdes defined in the configs will be usedthisValueSerde- value serdes for materializing this stream, if not specified the default serdes defined in the configs will be usedotherValueSerde- value serdes for materializing the other stream, if not specified the default serdes defined in the configs will be used- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key and within the joining window intervals - See Also:
leftJoin(KStream, ValueJoiner, JoinWindows, Joined),outerJoin(KStream, ValueJoiner, JoinWindows, Joined)
-
leftJoin
<VO,VR> KStream<K,VR> leftJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows)
Join records of this stream with anotherKStream's records using windowed left equi join with default serializers and deserializers. In contrast toinner-join, all records from this stream will produce at least one output record (cf. below). The join is computed on the records' key with join attributethisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the givenJoinWindows, i.e., the window defines an additional join predicate on the record timestamps.For each pair of records meeting both join predicates the provided
ValueJoinerwill be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. Furthermore, for each input record of thisKStreamthat does not satisfy the join predicate the providedValueJoinerwill be called with anullvalue for the other stream. If an input record key or value isnullthe record will not be included in the join operation and thus no output record will be added to the resultingKStream.Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        | <K1:ValueJoiner(A,null)>
<K2:B> | <K2:b> | <K2:ValueJoiner(B,b)>
       | <K3:c> |
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream- theKStreamto be joined with this streamjoiner- aValueJoinerthat computes the join result for a pair of matching recordswindows- the specification of theJoinWindows- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key plus one for each non-matching record of thisKStreamand within the joining window intervals - See Also:
join(KStream, ValueJoiner, JoinWindows),outerJoin(KStream, ValueJoiner, JoinWindows)
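The difference from the inner join (every record of this stream yields at least one output, with null standing in for a missing partner) can be modelled in plain Java as follows. This is a simplified sketch, not Kafka Streams code: `LeftJoinSketch` and `Rec` are hypothetical names, the symmetric `windowMs` bound is an assumption, and the model compares complete lists, so it deliberately ignores the order in which records arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

// Plain-Java model of a windowed left equi join; illustration only.
class LeftJoinSketch {

    // Hypothetical record type: key, value, and timestamp in milliseconds.
    static final class Rec {
        final String key;
        final String value;
        final long timestampMs;

        Rec(String key, String value, long timestampMs) {
            this.key = key;
            this.value = value;
            this.timestampMs = timestampMs;
        }
    }

    // For each record of this stream: emit one result per matching record of the
    // other stream, or a single result with a null other-value if nothing matches.
    static List<String> leftJoin(List<Rec> thisStream, List<Rec> otherStream,
                                 long windowMs, BinaryOperator<String> joiner) {
        List<String> result = new ArrayList<>();
        for (Rec l : thisStream) {
            if (l.key == null || l.value == null) {
                continue; // null key or value: record takes no part in the join
            }
            boolean matched = false;
            for (Rec r : otherStream) {
                if (r.key == null || r.value == null) {
                    continue;
                }
                if (l.key.equals(r.key) && Math.abs(l.timestampMs - r.timestampMs) <= windowMs) {
                    result.add(l.key + ":" + joiner.apply(l.value, r.value));
                    matched = true;
                }
            }
            if (!matched) {
                result.add(l.key + ":" + joiner.apply(l.value, null));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Rec> thisStream = new ArrayList<>();
        thisStream.add(new Rec("K1", "A", 0L));
        thisStream.add(new Rec("K2", "B", 10L));
        List<Rec> otherStream = new ArrayList<>();
        otherStream.add(new Rec("K2", "b", 12L));
        otherStream.add(new Rec("K3", "c", 20L));

        System.out.println(leftJoin(thisStream, otherStream, 5L,
                (v, vo) -> "ValueJoiner(" + v + "," + vo + ")"));
        // prints [K1:ValueJoiner(A,null), K2:ValueJoiner(B,b)]
    }
}
```

The unmatched record of the other stream (K3) produces no output, which is what distinguishes the left join from the outer join.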
-
leftJoin
<VO,VR> KStream<K,VR> leftJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, Joined<K,V,VO> joined)
Join records of this stream with another KStream's records using windowed left equi join with the serializers and deserializers defined by the given Joined instance. In contrast to inner-join, all records from this stream will produce at least one output record (cf. below). The join is computed on the records' key with join attribute thisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the given JoinWindows, i.e., the window defines an additional join predicate on the record timestamps.
For each pair of records meeting both join predicates the provided
ValueJoinerwill be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. Furthermore, for each input record of thisKStreamthat does not satisfy the join predicate the providedValueJoinerwill be called with anullvalue for the other stream. If an input record key or value isnullthe record will not be included in the join operation and thus no output record will be added to the resultingKStream.Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        | <K1:ValueJoiner(A,null)>
<K2:B> | <K2:b> | <K2:ValueJoiner(B,b)>
       | <K3:c> |
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream- theKStreamto be joined with this streamjoiner- aValueJoinerthat computes the join result for a pair of matching recordswindows- the specification of theJoinWindowsjoined- aJoinedinstance that defines the serdes to be used to serialize/deserialize inputs and outputs of the joined streams- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key plus one for each non-matching record of thisKStreamand within the joining window intervals - See Also:
join(KStream, ValueJoiner, JoinWindows, Joined),outerJoin(KStream, ValueJoiner, JoinWindows, Joined)
-
leftJoin
@Deprecated <VO,VR> KStream<K,VR> leftJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> thisValSerde, org.apache.kafka.common.serialization.Serde<VO> otherValueSerde)
Deprecated.Join records of this stream with anotherKStream's records using windowed left equi join. In contrast toinner-join, all records from this stream will produce at least one output record (cf. below). The join is computed on the records' key with join attributethisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the givenJoinWindows, i.e., the window defines an additional join predicate on the record timestamps.For each pair of records meeting both join predicates the provided
ValueJoinerwill be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. Furthermore, for each input record of thisKStreamthat does not satisfy the join predicate the providedValueJoinerwill be called with anullvalue for the other stream. If an input record key or value isnullthe record will not be included in the join operation and thus no output record will be added to the resultingKStream.Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        | <K1:ValueJoiner(A,null)>
<K2:B> | <K2:b> | <K2:ValueJoiner(B,b)>
       | <K3:c> |
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream - the KStream to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
windows - the specification of the JoinWindows
keySerde - key serdes for materializing both streams, if not specified the default serdes defined in the configs will be used
thisValSerde - value serdes for materializing this stream, if not specified the default serdes defined in the configs will be used
otherValueSerde - value serdes for materializing the other stream, if not specified the default serdes defined in the configs will be used
- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key plus one for each non-matching record of thisKStreamand within the joining window intervals - See Also:
join(KStream, ValueJoiner, JoinWindows, Joined),outerJoin(KStream, ValueJoiner, JoinWindows, Joined)
-
outerJoin
<VO,VR> KStream<K,VR> outerJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows)
Join records of this stream with anotherKStream's records using windowed outer equi join with default serializers and deserializers. In contrast toinner-joinorleft-join, all records from both streams will produce at least one output record (cf. below). The join is computed on the records' key with join attributethisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the givenJoinWindows, i.e., the window defines an additional join predicate on the record timestamps.For each pair of records meeting both join predicates the provided
ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. Furthermore, for each input record of either KStream that does not satisfy the join predicate the provided ValueJoiner will be called with a null value in place of the missing record from this stream or the other stream, respectively. If an input record key or value is null the record will not be included in the join operation and thus no output record will be added to the resulting KStream.
Example (assuming all input records belong to the correct windows):
this   | other  | result
<K1:A> |        | <K1:ValueJoiner(A,null)>
<K2:B> | <K2:b> | <K2:ValueJoiner(null,b)>
       |        | <K2:ValueJoiner(B,b)>
       | <K3:c> | <K3:ValueJoiner(null,c)>
Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
Repartitioning can happen for one or both of the joining
KStreams. For this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join inputKStreamis partitioned correctly on its key.Both of the joining
KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified inStreamsConfigvia parameterAPPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names viaKafkaStreams.toString().- Type Parameters:
VO- the value type of the other streamVR- the value type of the result stream- Parameters:
otherStream- theKStreamto be joined with this streamjoiner- aValueJoinerthat computes the join result for a pair of matching recordswindows- the specification of theJoinWindows- Returns:
- a
KStreamthat contains join-records for each key and values computed by the givenValueJoiner, one for each matched record-pair with the same key plus one for each non-matching record of bothKStreamand within the joining window intervals - See Also:
join(KStream, ValueJoiner, JoinWindows),leftJoin(KStream, ValueJoiner, JoinWindows)
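The outer-join example can be reproduced with a small plain-Java simulation. This is only a sketch of the semantics described above, not Kafka Streams code: the class OuterJoinSketch, the Rec type, the symmetric window check, and the arrival order (b before B, inferred from the order of the result column) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class OuterJoinSketch {
    /** One input record: the side it arrived on, its key, value and timestamp. */
    static final class Rec {
        final boolean fromThis;
        final String key;
        final String value;
        final long ts;

        Rec(boolean fromThis, String key, String value, long ts) {
            this.fromThis = fromThis;
            this.key = key;
            this.value = value;
            this.ts = ts;
        }
    }

    /** Processes records in arrival order and returns the emitted results as strings. */
    static List<String> outerJoin(List<Rec> input, long windowMs) {
        List<Rec> seen = new ArrayList<>();   // stand-in for the two local state stores
        List<String> out = new ArrayList<>();
        for (Rec r : input) {
            if (r.key == null || r.value == null) {
                continue;                     // null key or value: record is not joined
            }
            boolean matched = false;
            for (Rec o : seen) {
                boolean otherSide = o.fromThis != r.fromThis;
                boolean sameKey = o.key.equals(r.key);                // key predicate
                boolean inWindow = Math.abs(o.ts - r.ts) <= windowMs; // timestamp predicate
                if (otherSide && sameKey && inWindow) {
                    String left = r.fromThis ? r.value : o.value;
                    String right = r.fromThis ? o.value : r.value;
                    out.add(r.key + ":joiner(" + left + "," + right + ")");
                    matched = true;
                }
            }
            if (!matched) {
                // No partner found: the joiner sees null for the missing side.
                String args = r.fromThis ? r.value + ",null" : "null," + r.value;
                out.add(r.key + ":joiner(" + args + ")");
            }
            seen.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> input = new ArrayList<>();
        input.add(new Rec(true, "K1", "A", 0));
        input.add(new Rec(false, "K2", "b", 1));
        input.add(new Rec(true, "K2", "B", 2));
        input.add(new Rec(false, "K3", "c", 3));
        System.out.println(outerJoin(input, 10));
    }
}
```

With the four records in main, the sketch emits the same four results as the example table, in the same order.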
-
outerJoin
<VO,VR> KStream<K,VR> outerJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, Joined<K,V,VO> joined)
Join records of this stream with another KStream's records using a windowed outer equi join, with serdes taken from the given Joined instance. In contrast to inner-join or left-join, all records from both streams will produce at least one output record (cf. below). The join is computed on the records' key with join attribute thisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the given JoinWindows, i.e., the window defines an additional join predicate on the record timestamps.

For each pair of records meeting both join predicates, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. Furthermore, for each input record of both KStreams that does not satisfy the join predicate, the provided ValueJoiner will be called with a null value for the this/other stream, respectively. If an input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example (assuming all input records belong to the correct windows):

this   | other  | result
<K1:A> |        | <K1:ValueJoiner(A,null)>
<K2:B> | <K2:b> | <K2:ValueJoiner(null,b)>
       |        | <K2:ValueJoiner(B,b)>
       | <K3:c> | <K3:ValueJoiner(null,c)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen for one or both of the joining KStreams. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.

Both of the joining KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery, each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
- Type Parameters:
VO - the value type of the other stream
VR - the value type of the result stream
- Parameters:
otherStream - the KStream to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
windows - the specification of the JoinWindows
joined - a Joined instance that defines the serdes to be used to serialize/deserialize inputs of the joined streams
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one for each matched record-pair with the same key and within the joining window intervals, plus one for each non-matching record of both KStreams
- See Also:
join(KStream, ValueJoiner, JoinWindows, Joined), leftJoin(KStream, ValueJoiner, JoinWindows, Joined)
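The naming patterns for the internal topics can be written out as two small helpers. This is a minimal sketch of the string patterns quoted above and nothing more: the class InternalTopicNames and the sample values "my-app", "XXX" and "storeName" are placeholders, since the real middle part is generated internally by Kafka Streams.

```java
class InternalTopicNames {
    /** Builds "${applicationId}-XXX-repartition" for a given generated name. */
    static String repartitionTopic(String applicationId, String generatedName) {
        return applicationId + "-" + generatedName + "-repartition";
    }

    /** Builds "${applicationId}-storeName-changelog" for a given store name. */
    static String changelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    public static void main(String[] args) {
        // "my-app", "XXX" and "storeName" are placeholder values for illustration.
        System.out.println(repartitionTopic("my-app", "XXX"));
        System.out.println(changelogTopic("my-app", "storeName"));
    }
}
```

Because the application id is the prefix of every internal topic, changing APPLICATION_ID_CONFIG changes the names of all repartition and changelog topics.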
-
outerJoin
@Deprecated <VO,VR> KStream<K,VR> outerJoin(KStream<K,VO> otherStream, ValueJoiner<? super V,? super VO,? extends VR> joiner, JoinWindows windows, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> thisValueSerde, org.apache.kafka.common.serialization.Serde<VO> otherValueSerde)
Deprecated. Join records of this stream with another KStream's records using a windowed outer equi join. In contrast to inner-join or left-join, all records from both streams will produce at least one output record (cf. below). The join is computed on the records' key with join attribute thisKStream.key == otherKStream.key. Furthermore, two records are only joined if their timestamps are close to each other as defined by the given JoinWindows, i.e., the window defines an additional join predicate on the record timestamps.

For each pair of records meeting both join predicates, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. Furthermore, for each input record of both KStreams that does not satisfy the join predicate, the provided ValueJoiner will be called with a null value for the this/other stream, respectively. If an input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example (assuming all input records belong to the correct windows):

this   | other  | result
<K1:A> |        | <K1:ValueJoiner(A,null)>
<K2:B> | <K2:b> | <K2:ValueJoiner(null,b)>
       |        | <K2:ValueJoiner(B,b)>
       | <K3:c> | <K3:ValueJoiner(null,c)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) (for one input stream) before doing the join, using a pre-created topic with the "correct" number of partitions. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen for one or both of the joining KStreams. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.

Both of the joining KStreams will be materialized in local state stores with auto-generated store names. For failure and recovery, each store will be backed by an internal changelog topic that will be created in Kafka. The changelog topic will be named "${applicationId}-storeName-changelog", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "storeName" is an internally generated name, and "-changelog" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().
- Type Parameters:
VO - the value type of the other stream
VR - the value type of the result stream
- Parameters:
otherStream - the KStream to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
windows - the specification of the JoinWindows
keySerde - key serdes for materializing both streams; if not specified, the default serdes defined in the configs will be used
thisValueSerde - value serdes for materializing this stream; if not specified, the default serdes defined in the configs will be used
otherValueSerde - value serdes for materializing the other stream; if not specified, the default serdes defined in the configs will be used
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one for each matched record-pair with the same key and within the joining window intervals, plus one for each non-matching record of both KStreams
- See Also:
join(KStream, ValueJoiner, JoinWindows, Joined), leftJoin(KStream, ValueJoiner, JoinWindows, Joined)
-
join
<VT,VR> KStream<K,VR> join(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner)
Join records of this stream with KTable's records using a non-windowed inner equi join with default serializers and deserializers. The join is a primary key table lookup join with join attribute stream.key == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current (i.e., processing time) internal KTable state. In contrast, processing KTable input records will only update the internal KTable state and will not produce any result records.

For each KStream record that finds a corresponding record in KTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example:

KStream | KTable | state  | result
<K1:A>  |        |        |
        | <K1:b> | <K1:b> |
<K1:C>  |        | <K1:b> | <K1:ValueJoiner(C,b)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) for this KStream before doing the join, using a pre-created topic with the same number of partitions as the given KTable. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner); cf. join(GlobalKTable, KeyValueMapper, ValueJoiner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen only for this KStream but not for the provided KTable. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.
- Type Parameters:
VT - the value type of the table
VR - the value type of the result stream
- Parameters:
table - the KTable to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one for each matched record-pair with the same key
- See Also:
leftJoin(KTable, ValueJoiner), join(GlobalKTable, KeyValueMapper, ValueJoiner)
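The table lookup behaviour in the example can be mimicked in plain Java. The following is a sketch of the described semantics only, not Kafka Streams code: the class TableLookupJoinSketch, the Event type, and the use of a HashMap as the "internal KTable state" are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TableLookupJoinSketch {
    /** One input event: either a KTable update or a KStream record. */
    static final class Event {
        final boolean tableUpdate;
        final String key;
        final String value;

        Event(boolean tableUpdate, String key, String value) {
            this.tableUpdate = tableUpdate;
            this.key = key;
            this.value = value;
        }
    }

    /** Processes events in arrival order and returns the emitted join results. */
    static List<String> innerJoin(List<Event> events) {
        Map<String, String> tableState = new HashMap<>(); // stand-in for the KTable state
        List<String> out = new ArrayList<>();
        for (Event e : events) {
            if (e.tableUpdate) {
                tableState.put(e.key, e.value); // table input only updates state
                continue;
            }
            if (e.key == null || e.value == null) {
                continue;                       // null stream key or value: skipped
            }
            String tableValue = tableState.get(e.key); // lookup at processing time
            if (tableValue != null) {
                out.add(e.key + ":joiner(" + e.value + "," + tableValue + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(false, "K1", "A")); // no table entry yet: no output
        events.add(new Event(true, "K1", "b"));  // updates state, no output
        events.add(new Event(false, "K1", "C")); // finds b: one output
        System.out.println(innerJoin(events));
    }
}
```

The first stream record produces nothing because the lookup happens against the table state at the moment the stream record is processed, and the table entry arrives only afterwards.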
-
join
<VT,VR> KStream<K,VR> join(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, Joined<K,V,VT> joined)
Join records of this stream with KTable's records using a non-windowed inner equi join, with serdes taken from the given Joined instance. The join is a primary key table lookup join with join attribute stream.key == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current (i.e., processing time) internal KTable state. In contrast, processing KTable input records will only update the internal KTable state and will not produce any result records.

For each KStream record that finds a corresponding record in KTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example:

KStream | KTable | state  | result
<K1:A>  |        |        |
        | <K1:b> | <K1:b> |
<K1:C>  |        | <K1:b> | <K1:ValueJoiner(C,b)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) for this KStream before doing the join, using a pre-created topic with the same number of partitions as the given KTable. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner); cf. join(GlobalKTable, KeyValueMapper, ValueJoiner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen only for this KStream but not for the provided KTable. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.
- Type Parameters:
VT - the value type of the table
VR - the value type of the result stream
- Parameters:
table - the KTable to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
joined - a Joined instance that defines the serdes to be used to serialize/deserialize inputs of the joined streams
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one for each matched record-pair with the same key
- See Also:
leftJoin(KTable, ValueJoiner, Joined), join(GlobalKTable, KeyValueMapper, ValueJoiner)
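Why the partition counts must match can be seen with a toy partitioner. The sketch below assumes a simple hashCode-based partitioner as a stand-in for whatever partitioner the producers actually use (Kafka's default hashes the serialized key bytes instead); the class CoPartitioningSketch and the partition counts 4 and 6 are hypothetical.

```java
class CoPartitioningSketch {
    /** Toy partitioner: maps a key to a partition index in [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** True if the same key lands on the same partition index in both topics. */
    static boolean coLocated(String key, int streamPartitions, int tablePartitions) {
        return partitionFor(key, streamPartitions) == partitionFor(key, tablePartitions);
    }

    public static void main(String[] args) {
        // Same partitioner, same partition count: the key is co-located.
        System.out.println(coLocated("K1", 4, 4));
        // Same partitioner, different partition counts: the key may land elsewhere.
        System.out.println(coLocated("K1", 4, 6));
    }
}
```

When the stream record and the table record for a key sit on different partition indexes, the task that processes the stream partition never sees the matching table entry, which is why the stream has to be written through a topic with the table's partition count first.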
-
join
@Deprecated <VT,VR> KStream<K,VR> join(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated. Join records of this stream with KTable's records using a non-windowed inner equi join. The join is a primary key table lookup join with join attribute stream.key == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current (i.e., processing time) internal KTable state. In contrast, processing KTable input records will only update the internal KTable state and will not produce any result records.

For each KStream record that finds a corresponding record in KTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as for both joining input records. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example:

KStream | KTable | state  | result
<K1:A>  |        |        |
        | <K1:b> | <K1:b> |
<K1:C>  |        | <K1:b> | <K1:ValueJoiner(C,b)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) for this KStream before doing the join, using a pre-created topic with the same number of partitions as the given KTable. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner); cf. join(GlobalKTable, KeyValueMapper, ValueJoiner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen only for this KStream but not for the provided KTable. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.
- Type Parameters:
VT - the value type of the table
VR - the value type of the result stream
- Parameters:
table - the KTable to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
keySerde - key serdes for materializing this (KStream input) stream; if not specified, the default serdes defined in the configs will be used
valSerde - value serdes for materializing this (KStream input) stream; if not specified, the default serdes defined in the configs will be used
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one for each matched record-pair with the same key
- See Also:
leftJoin(KTable, ValueJoiner, Joined), join(GlobalKTable, KeyValueMapper, ValueJoiner)
-
leftJoin
<VT,VR> KStream<K,VR> leftJoin(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner)
Join records of this stream with KTable's records using a non-windowed left equi join with default serializers and deserializers. In contrast to inner-join, all records from this stream will produce an output record (cf. below). The join is a primary key table lookup join with join attribute stream.key == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current (i.e., processing time) internal KTable state. In contrast, processing KTable input records will only update the internal KTable state and will not produce any result records.

For each KStream record, whether or not it finds a corresponding record in KTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. If no KTable record was found during lookup, a null value will be provided to ValueJoiner. The key of the result record is the same as for both joining input records. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example:

KStream | KTable | state  | result
<K1:A>  |        |        | <K1:ValueJoiner(A,null)>
        | <K1:b> | <K1:b> |
<K1:C>  |        | <K1:b> | <K1:ValueJoiner(C,b)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) for this KStream before doing the join, using a pre-created topic with the same number of partitions as the given KTable. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner); cf. join(GlobalKTable, KeyValueMapper, ValueJoiner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen only for this KStream but not for the provided KTable. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.
- Type Parameters:
VT - the value type of the table
VR - the value type of the result stream
- Parameters:
table - the KTable to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one output for each input KStream record
- See Also:
join(KTable, ValueJoiner), leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)
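The left-join variant of the table lookup can be simulated the same way. This is a sketch of the described semantics, not Kafka Streams code: the class LeftTableLookupJoinSketch, the Event type, and the HashMap used as table state are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LeftTableLookupJoinSketch {
    /** One input event: either a KTable update or a KStream record. */
    static final class Event {
        final boolean tableUpdate;
        final String key;
        final String value;

        Event(boolean tableUpdate, String key, String value) {
            this.tableUpdate = tableUpdate;
            this.key = key;
            this.value = value;
        }
    }

    /** Processes events in arrival order; every valid stream record yields one result. */
    static List<String> leftJoin(List<Event> events) {
        Map<String, String> tableState = new HashMap<>(); // stand-in for the KTable state
        List<String> out = new ArrayList<>();
        for (Event e : events) {
            if (e.tableUpdate) {
                tableState.put(e.key, e.value); // table input only updates state
                continue;
            }
            if (e.key == null || e.value == null) {
                continue;                       // null stream key or value: skipped
            }
            // A missing table entry is passed to the joiner as null.
            String tableValue = tableState.get(e.key);
            out.add(e.key + ":joiner(" + e.value + "," + tableValue + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(false, "K1", "A")); // no table entry yet: joiner(A,null)
        events.add(new Event(true, "K1", "b"));  // updates state, no output
        events.add(new Event(false, "K1", "C")); // finds b: joiner(C,b)
        System.out.println(leftJoin(events));
    }
}
```

Compared with the inner join, the only difference is that the stream record is emitted even when the lookup returns nothing.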
-
leftJoin
<VT,VR> KStream<K,VR> leftJoin(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, Joined<K,V,VT> joined)
Join records of this stream with KTable's records using a non-windowed left equi join, with serdes taken from the given Joined instance. In contrast to inner-join, all records from this stream will produce an output record (cf. below). The join is a primary key table lookup join with join attribute stream.key == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current (i.e., processing time) internal KTable state. In contrast, processing KTable input records will only update the internal KTable state and will not produce any result records.

For each KStream record, whether or not it finds a corresponding record in KTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. If no KTable record was found during lookup, a null value will be provided to ValueJoiner. The key of the result record is the same as for both joining input records. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example:

KStream | KTable | state  | result
<K1:A>  |        |        | <K1:ValueJoiner(A,null)>
        | <K1:b> | <K1:b> |
<K1:C>  |        | <K1:b> | <K1:ValueJoiner(C,b)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) for this KStream before doing the join, using a pre-created topic with the same number of partitions as the given KTable. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner); cf. join(GlobalKTable, KeyValueMapper, ValueJoiner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen only for this KStream but not for the provided KTable. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.
- Type Parameters:
VT - the value type of the table
VR - the value type of the result stream
- Parameters:
table - the KTable to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
joined - a Joined instance that defines the serdes to be used to serialize/deserialize inputs of the joined streams
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one output for each input KStream record
- See Also:
join(KTable, ValueJoiner, Joined), leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)
-
leftJoin
@Deprecated <VT,VR> KStream<K,VR> leftJoin(KTable<K,VT> table, ValueJoiner<? super V,? super VT,? extends VR> joiner, org.apache.kafka.common.serialization.Serde<K> keySerde, org.apache.kafka.common.serialization.Serde<V> valSerde)
Deprecated. Join records of this stream with KTable's records using a non-windowed left equi join. In contrast to inner-join, all records from this stream will produce an output record (cf. below). The join is a primary key table lookup join with join attribute stream.key == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current (i.e., processing time) internal KTable state. In contrast, processing KTable input records will only update the internal KTable state and will not produce any result records.

For each KStream record, whether or not it finds a corresponding record in KTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. If no KTable record was found during lookup, a null value will be provided to ValueJoiner. The key of the result record is the same as for both joining input records. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream.

Example:

KStream | KTable | state  | result
<K1:A>  |        |        | <K1:ValueJoiner(A,null)>
        | <K1:b> | <K1:b> |
<K1:C>  |        | <K1:b> | <K1:ValueJoiner(C,b)>

Both input streams (or to be more precise, their underlying source topics) need to have the same number of partitions. If this is not the case, you would need to call through(String) for this KStream before doing the join, using a pre-created topic with the same number of partitions as the given KTable. Furthermore, both input streams need to be co-partitioned on the join key (i.e., use the same partitioner); cf. join(GlobalKTable, KeyValueMapper, ValueJoiner). If this requirement is not met, Kafka Streams will automatically repartition the data, i.e., it will create an internal repartitioning topic in Kafka and write and re-read the data via this topic before the actual join. The repartitioning topic will be named "${applicationId}-XXX-repartition", where "applicationId" is user-specified in StreamsConfig via parameter APPLICATION_ID_CONFIG, "XXX" is an internally generated name, and "-repartition" is a fixed suffix. You can retrieve all generated internal topic names via KafkaStreams.toString().

Repartitioning can happen only for this KStream but not for the provided KTable. In this case, all data of the stream will be redistributed through the repartitioning topic by writing all records to it, and rereading all records from it, such that the join input KStream is partitioned correctly on its key.
- Type Parameters:
VT - the value type of the table
VR - the value type of the result stream
- Parameters:
table - the KTable to be joined with this stream
joiner - a ValueJoiner that computes the join result for a pair of matching records
keySerde - key serdes for materializing this (KStream input) stream; if not specified, the default serdes defined in the configs will be used
valSerde - value serdes for materializing this (KStream input) stream; if not specified, the default serdes defined in the configs will be used
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one output for each input KStream record
- See Also:
join(KTable, ValueJoiner, Serde, Serde), leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)
-
join
<GK,GV,RV> KStream<K,RV> join(GlobalKTable<GK,GV> globalKTable, KeyValueMapper<? super K,? super V,? extends GK> keyValueMapper, ValueJoiner<? super V,? super GV,? extends RV> joiner)
Join records of this stream with GlobalKTable's records using a non-windowed inner equi join. The join is a primary key table lookup join with join attribute keyValueMapper.map(stream.keyValue) == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current internal GlobalKTable state. In contrast, processing GlobalKTable input records will only update the internal GlobalKTable state and will not produce any result records.

For each KStream record that finds a corresponding record in GlobalKTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as the key of this KStream. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream. If keyValueMapper returns null, implying no match exists, no output record will be added to the resulting KStream.
- Type Parameters:
GK - the key type of the GlobalKTable
GV - the value type of the GlobalKTable
RV - the value type of the resulting KStream
- Parameters:
globalKTable - the GlobalKTable to be joined with this stream
keyValueMapper - instance of KeyValueMapper used to map from the (key, value) of this stream to the key of the GlobalKTable
joiner - a ValueJoiner that computes the join result for a pair of matching records
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one output for each input KStream record that finds a matching GlobalKTable record
- See Also:
leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)
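The role of the keyValueMapper in the inner join can be illustrated with a plain-Java simulation. This is a sketch of the described semantics, not Kafka Streams code: the class GlobalTableJoinSketch, the order/customer sample data, and the use of a Map and a BiFunction in place of GlobalKTable and KeyValueMapper are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class GlobalTableJoinSketch {
    /** Joins stream records ({key, value} pairs) against a table via a derived lookup key. */
    static List<String> innerJoin(List<String[]> stream,
                                  Map<String, String> globalTable,
                                  BiFunction<String, String, String> keyValueMapper) {
        List<String> out = new ArrayList<>();
        for (String[] kv : stream) {
            String key = kv[0];
            String value = kv[1];
            if (key == null || value == null) {
                continue;                       // null stream key or value: skipped
            }
            String lookupKey = keyValueMapper.apply(key, value);
            if (lookupKey == null) {
                continue;                       // mapper signals "no match": no output
            }
            String tableValue = globalTable.get(lookupKey);
            if (tableValue != null) {
                // The result keeps the stream record's key, not the lookup key.
                out.add(key + ":joiner(" + value + "," + tableValue + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical data: orders keyed by order id, with the customer id as value.
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] {"o1", "c1"});
        orders.add(new String[] {"o2", "c9"});   // c9 is not in the table
        orders.add(new String[] {"o3", "none"}); // mapper returns null for "none"
        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Alice");
        System.out.println(innerJoin(orders, customers,
            (k, v) -> v.equals("none") ? null : v));
    }
}
```

Because the lookup key is derived from the stream record rather than being the stream key itself, the stream does not have to be keyed the same way as the table.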
-
leftJoin
<GK,GV,RV> KStream<K,RV> leftJoin(GlobalKTable<GK,GV> globalKTable, KeyValueMapper<? super K,? super V,? extends GK> keyValueMapper, ValueJoiner<? super V,? super GV,? extends RV> valueJoiner)
Join records of this stream with GlobalKTable's records using a non-windowed left equi join. In contrast to inner-join, all records from this stream will produce an output record (cf. below). The join is a primary key table lookup join with join attribute keyValueMapper.map(stream.keyValue) == table.key. "Table lookup join" means that results are only computed if KStream records are processed. This is done by performing a lookup for matching records in the current internal GlobalKTable state. In contrast, processing GlobalKTable input records will only update the internal GlobalKTable state and will not produce any result records.

For each KStream record, whether or not it finds a corresponding record in GlobalKTable, the provided ValueJoiner will be called to compute a value (with arbitrary type) for the result record. The key of the result record is the same as the key of this KStream. If a KStream input record key or value is null, the record will not be included in the join operation and thus no output record will be added to the resulting KStream. If keyValueMapper returns null, implying no match exists, a null value will be provided to ValueJoiner. If no GlobalKTable record was found during lookup, a null value will be provided to ValueJoiner.
- Type Parameters:
GK - the key type of the GlobalKTable
GV - the value type of the GlobalKTable
RV - the value type of the resulting KStream
- Parameters:
globalKTable - the GlobalKTable to be joined with this stream
keyValueMapper - instance of KeyValueMapper used to map from the (key, value) of this stream to the key of the GlobalKTable
valueJoiner - a ValueJoiner that computes the join result for a pair of matching records
- Returns:
a KStream that contains join-records for each key and values computed by the given ValueJoiner, one output for each input KStream record
- See Also:
join(GlobalKTable, KeyValueMapper, ValueJoiner)
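The two null cases of the left join (mapper returns null, or the lookup finds nothing) can be seen in a plain-Java simulation. This is a sketch of the described semantics, not Kafka Streams code: the class GlobalTableLeftJoinSketch, the order/customer sample data, and the Map and BiFunction stand-ins are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class GlobalTableLeftJoinSketch {
    /** Left-joins stream records ({key, value} pairs) against a table via a derived key. */
    static List<String> leftJoin(List<String[]> stream,
                                 Map<String, String> globalTable,
                                 BiFunction<String, String, String> keyValueMapper) {
        List<String> out = new ArrayList<>();
        for (String[] kv : stream) {
            String key = kv[0];
            String value = kv[1];
            if (key == null || value == null) {
                continue;                       // null stream key or value: skipped
            }
            String lookupKey = keyValueMapper.apply(key, value);
            // A null lookup key and a failed lookup both hand null to the joiner.
            String tableValue = lookupKey == null ? null : globalTable.get(lookupKey);
            out.add(key + ":joiner(" + value + "," + tableValue + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical data: orders keyed by order id, with the customer id as value.
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] {"o1", "c1"});
        orders.add(new String[] {"o2", "c9"});   // c9 is not in the table
        orders.add(new String[] {"o3", "none"}); // mapper returns null for "none"
        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Alice");
        System.out.println(leftJoin(orders, customers,
            (k, v) -> v.equals("none") ? null : v));
    }
}
```

Every valid stream record yields exactly one result here, so the joiner has to be written to tolerate a null second argument.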
-
-