Interface BucketAssigner<IN,BucketID>
- Type Parameters:
IN - The type of input elements.
BucketID - The type of the object returned by getBucketId(Object, BucketAssigner.Context). This type must have correct #hashCode() and #equals(Object) methods. In addition, the Path to the created bucket will be the result of calling #toString() on the returned object, appended to the basePath specified in the file sink.
- All Superinterfaces:
Serializable
- All Known Implementing Classes:
BasePathBucketAssigner, DateTimeBucketAssigner
A BucketAssigner is used with a file sink to determine the bucket each incoming element should be
put into.
The StreamingFileSink can write to many buckets at a time, and it is responsible
for managing the set of active buckets. Whenever a new element arrives, it asks the
BucketAssigner for the bucket the element should fall into. The BucketAssigner can, for
example, determine buckets based on system time.
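As an illustration of the time-based case mentioned above, here is a minimal sketch of an assigner that derives an hourly bucket from a clock reading. To keep it runnable without Flink on the classpath, it uses two hypothetical stand-in interfaces (`SketchBucketAssigner`, `SketchContext`) rather than the real `BucketAssigner` and `BucketAssigner.Context`. The `currentProcessingTime()` accessor, the hourly pattern, and the UTC zone are assumptions made for the example, not something this page specifies.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for the BucketAssigner contract; the real interface
// is provided by Flink and additionally declares getSerializer().
interface SketchBucketAssigner<IN, BucketID> {
    BucketID getBucketId(IN element, SketchContext context);
}

// Hypothetical stand-in for BucketAssigner.Context, reduced to one clock reading
// (assumed to be milliseconds since the epoch).
interface SketchContext {
    long currentProcessingTime();
}

// Assigns every element to an hourly bucket derived from the context's clock.
class HourlyBucketAssigner<IN> implements SketchBucketAssigner<IN, String> {
    // Pattern and time zone are illustrative choices.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    @Override
    public String getBucketId(IN element, SketchContext context) {
        // The element itself is ignored; only the time decides the bucket.
        return FORMAT.format(Instant.ofEpochMilli(context.currentProcessingTime()));
    }
}
```

Because the bucket identifier here is a `String`, it already satisfies the `hashCode()`, `equals(Object)`, and `toString()` requirements placed on `BucketID`.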
-
Nested Class Summary
Nested Classes
Modifier and Type: static interface
Interface: BucketAssigner.Context
Description: Context that the BucketAssigner can use for getting additional data about an input record.
-
Method Summary
Modifier and Type: BucketID
Method: getBucketId(IN element, BucketAssigner.Context context)
Description: Returns the identifier of the bucket the provided element should be put into.

Modifier and Type: org.apache.flink.core.io.SimpleVersionedSerializer<BucketID>
Method: getSerializer()
-
Method Details
-
getBucketId
Returns the identifier of the bucket the provided element should be put into.
- Parameters:
element - The current element being processed.
context - The context used by the current bucket assigner.
- Returns:
- The identifier of the bucket the element should be put into. The actual path to the bucket results from appending the string form of the returned identifier to the base path provided during the initialization of the file sink.
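The relationship between the returned identifier and the final bucket path can be sketched as follows. This is only an illustration of the rule stated above, not Flink's actual path handling: the helper `bucketPath`, the use of plain strings with a `/` separator, and the treatment of an empty identifier as "write directly under the base path" are all assumptions made for the example.

```java
// Hypothetical helper illustrating how a bucket path is derived from the
// base path and the string form of a bucket identifier.
class BucketPathSketch {

    static String bucketPath(String basePath, Object bucketId) {
        // The identifier's toString() determines the sub-path of the bucket.
        String child = bucketId.toString();
        // Assumption: an empty identifier maps to the base path itself.
        if (child.isEmpty()) {
            return basePath;
        }
        return basePath + "/" + child;
    }
}
```

For instance, `BucketPathSketch.bucketPath("/data/out", "2024-01-01--13")` yields `/data/out/2024-01-01--13`, which is why `toString()` on a custom `BucketID` type should produce a stable, path-safe value.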
-
getSerializer
org.apache.flink.core.io.SimpleVersionedSerializer<BucketID> getSerializer()
- Returns:
- A SimpleVersionedSerializer capable of serializing/deserializing the elements of type BucketID, that is, the type of the objects returned by getBucketId(Object, BucketAssigner.Context).
-
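To make the serializer requirement more concrete, the following sketch shows what a versioned serializer for `String` bucket identifiers might look like. It is written against a hypothetical stand-in interface, `SketchVersionedSerializer`, which mirrors the `getVersion`/`serialize`/`deserialize` shape of Flink's `SimpleVersionedSerializer` so the example compiles without Flink. The UTF-8 encoding and the version number are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a versioned serializer contract.
interface SketchVersionedSerializer<E> {
    int getVersion();
    byte[] serialize(E obj) throws IOException;
    E deserialize(int version, byte[] serialized) throws IOException;
}

// Serializes String bucket identifiers as UTF-8 bytes under a single format version.
class StringBucketIdSerializer implements SketchVersionedSerializer<String> {
    private static final int VERSION = 1; // illustrative version number

    @Override
    public int getVersion() {
        return VERSION;
    }

    @Override
    public byte[] serialize(String bucketId) {
        return bucketId.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String deserialize(int version, byte[] serialized) throws IOException {
        // Reject data written in a format this sketch does not know about.
        if (version != VERSION) {
            throw new IOException("Unrecognized version: " + version);
        }
        return new String(serialized, StandardCharsets.UTF_8);
    }
}
```

The property that matters is the round trip: deserializing the bytes produced by `serialize` with the same version must return an identifier equal to the original, so that restored buckets match the ones that were active before.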