Interface BucketAssigner<IN,BucketID>

Type Parameters:
IN - The type of input elements.
BucketID - The type of the object returned by getBucketId(Object, BucketAssigner.Context). It must implement #hashCode() and #equals(Object) correctly. In addition, the path to the created bucket is the result of calling #toString() on this object, appended to the basePath specified in the file sink.
All Superinterfaces:
Serializable
All Known Implementing Classes:
BasePathBucketAssigner, DateTimeBucketAssigner

@PublicEvolving public interface BucketAssigner<IN,BucketID> extends Serializable
A BucketAssigner is used with a file sink to determine the bucket each incoming element should be put into.

The StreamingFileSink can write to many buckets at a time and is responsible for managing the set of active buckets. Whenever a new element arrives, it asks the BucketAssigner which bucket the element belongs to. The BucketAssigner can, for example, determine buckets based on system time.
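
The routing described above can be modelled in a few lines. This is a minimal sketch and not Flink code: SketchAssigner, HourlySketchAssigner and ActiveBucketsSketch are hypothetical names, the assigner's context is reduced to a single processing-time argument, and the hourly date pattern is an assumption chosen for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for BucketAssigner; the context is reduced to one timestamp.
interface SketchAssigner<IN, ID> {
    ID getBucketId(IN element, long processingTimeMillis);
}

// Assigns every element to an hourly bucket derived from the supplied time (UTC).
class HourlySketchAssigner<IN> implements SketchAssigner<IN, String> {
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    @Override
    public String getBucketId(IN element, long processingTimeMillis) {
        return FORMAT.format(Instant.ofEpochMilli(processingTimeMillis));
    }
}

// Simplified model of a sink that tracks its active buckets; not Flink's implementation.
class ActiveBucketsSketch<IN, ID> {
    // Lookups depend on the bucket ID's hashCode() and equals().
    private final Map<ID, List<IN>> activeBuckets = new LinkedHashMap<>();
    private final SketchAssigner<IN, ID> assigner;

    ActiveBucketsSketch(SketchAssigner<IN, ID> assigner) {
        this.assigner = assigner;
    }

    // Asks the assigner for the element's bucket and opens the bucket on first use.
    void write(IN element, long processingTimeMillis) {
        ID id = assigner.getBucketId(element, processingTimeMillis);
        activeBuckets.computeIfAbsent(id, key -> new ArrayList<>()).add(element);
    }

    Map<ID, List<IN>> activeBuckets() {
        return activeBuckets;
    }
}
```

Because the map is keyed by bucket ID, two IDs that are logically equal but disagree on hashCode() or equals() would open two separate buckets, which is why the type parameter carries that requirement.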

  • Nested Class Summary

    Nested Classes
    Modifier and Type | Interface | Description
    static interface | BucketAssigner.Context | Context that the BucketAssigner can use for getting additional data about an input record.
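  • Method Summary

    Modifier and Type | Method | Description
    BucketID | getBucketId(IN element, BucketAssigner.Context context) | Returns the identifier of the bucket the provided element should be put into.
    org.apache.flink.core.io.SimpleVersionedSerializer<BucketID> | getSerializer() | Returns a SimpleVersionedSerializer for the BucketID type.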
  • Method Details

    • getBucketId

      BucketID getBucketId(IN element, BucketAssigner.Context context)
      Returns the identifier of the bucket the provided element should be put into.
      Parameters:
      element - The current element being processed.
      context - The context used by the current bucket assigner.
      Returns:
      The identifier of the bucket the element should be put into. The actual path to the bucket results from appending the identifier's string representation to the base path provided during the initialization of the file sink.
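
To make the relationship between the bucket identifier and the bucket path concrete, here is a small sketch of a custom identifier type. RegionDayBucketId and BucketPathSketch are hypothetical names, and java.nio.file.Path is used only as a stand-in for the sink's own path handling, so treat this as an illustration of the toString() contract rather than the sink's actual logic.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;

// Hypothetical composite bucket ID with value-based equals()/hashCode()
// and a toString() that doubles as the relative bucket path.
final class RegionDayBucketId {
    private final String region;
    private final String day;

    RegionDayBucketId(String region, String day) {
        this.region = Objects.requireNonNull(region);
        this.day = Objects.requireNonNull(day);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof RegionDayBucketId)) {
            return false;
        }
        RegionDayBucketId that = (RegionDayBucketId) other;
        return region.equals(that.region) && day.equals(that.day);
    }

    @Override
    public int hashCode() {
        return Objects.hash(region, day);
    }

    @Override
    public String toString() {
        return region + "/" + day;
    }
}

class BucketPathSketch {
    // Appends the bucket ID's string form to the base path.
    static Path resolve(Path basePath, Object bucketId) {
        return basePath.resolve(bucketId.toString());
    }
}
```

With a base path of out, the identifier for region eu and day 2024-01-01 resolves to out/eu/2024-01-01 in this sketch.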
    • getSerializer

      org.apache.flink.core.io.SimpleVersionedSerializer<BucketID> getSerializer()
      Returns:
      A SimpleVersionedSerializer capable of serializing and deserializing elements of type BucketID, that is, the type of the objects returned by getBucketId(Object, BucketAssigner.Context).
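
As an illustration of what such a serializer might look like for String bucket identifiers, the sketch below defines a local stand-in interface so that it compiles without Flink on the classpath. VersionedSerializerSketch and StringBucketIdSerializerSketch are hypothetical names, and the three-method shape (a version number plus serialize and deserialize) is an assumption about SimpleVersionedSerializer that this page does not spell out.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Local stand-in mirroring the shape assumed for SimpleVersionedSerializer.
interface VersionedSerializerSketch<E> {
    int getVersion();

    byte[] serialize(E value) throws IOException;

    E deserialize(int version, byte[] serialized) throws IOException;
}

// Serializes String bucket IDs as UTF-8 bytes under a single format version.
class StringBucketIdSerializerSketch implements VersionedSerializerSketch<String> {
    private static final int VERSION = 1;

    @Override
    public int getVersion() {
        return VERSION;
    }

    @Override
    public byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public String deserialize(int version, byte[] serialized) throws IOException {
        // Reject bytes written in a format this sketch does not know.
        if (version != VERSION) {
            throw new IOException("Unrecognized version: " + version);
        }
        return new String(serialized, StandardCharsets.UTF_8);
    }
}
```

Passing the version to deserialize lets a later implementation change the byte format while still recognising, and either migrating or rejecting, data written by an older one.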