All Known Implementing Classes:
ProducerMergedPartitionFileReader, SegmentPartitionFileReader

public interface PartitionFileReader
PartitionFileReader defines the read logic for different types of shuffle files.
  • Method Details

    • readBuffer

      @Nullable PartitionFileReader.ReadBufferResult readBuffer(TieredStoragePartitionId partitionId, TieredStorageSubpartitionId subpartitionId, int segmentId, int bufferIndex, org.apache.flink.core.memory.MemorySegment memorySegment, BufferRecycler recycler, @Nullable PartitionFileReader.ReadProgress readProgress, @Nullable CompositeBuffer partialBuffer) throws IOException
      Read a buffer from the partition file.
      Parameters:
      partitionId - the partition id of the buffer
      subpartitionId - the subpartition id of the buffer
      segmentId - the segment id of the buffer
      bufferIndex - the index of the buffer
      memorySegment - the empty buffer to store the read buffer
      recycler - the buffer recycler
      readProgress - the current read progress, taken from the previous ReadBufferResult. The read progress should be implemented and provided by Flink, and it should be tied directly to the file format. It can be null if the current file reader has no read progress.
      partialBuffer - the partial buffer left over from the previous read. It is non-null only when the previous read ended with a partial buffer; in that case, it is completed into a full buffer during this read.
      Returns:
      null if there is no data; otherwise, a read buffer result.
      Throws:
      IOException - if an error happens.
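
      The hand-off of readProgress between calls can be illustrated with a minimal sketch. It does not use the Flink types: Progress, Result, and SimpleReader are hypothetical, heavily reduced stand-ins (the real method also takes partition, subpartition, and segment ids, a memory segment, a recycler, and a partial buffer). The sketch only shows the pattern described above: pass null on the first call, feed the progress from each result into the next call, and stop when null is returned.

```java
import java.util.ArrayList;
import java.util.List;

final class ReadLoopSketch {

    /** Hypothetical stand-in for a read progress; not the Flink type. */
    static final class Progress {
        final int nextOffset;

        Progress(int nextOffset) {
            this.nextOffset = nextOffset;
        }
    }

    /** Hypothetical stand-in for a read result: the data plus the progress to pass back. */
    static final class Result {
        final String data;
        final Progress progress;

        Result(String data, Progress progress) {
            this.data = data;
            this.progress = progress;
        }
    }

    /** Hypothetical reader with a reduced signature; returns null when there is no data. */
    interface SimpleReader {
        Result readBuffer(int bufferIndex, Progress previousProgress);
    }

    /** Reads until the reader reports no data, carrying the progress between calls. */
    static List<String> readAll(SimpleReader reader) {
        List<String> out = new ArrayList<>();
        Progress progress = null; // null on the first call: nothing has been read yet
        int bufferIndex = 0;
        while (true) {
            Result result = reader.readBuffer(bufferIndex, progress);
            if (result == null) {
                break; // no data available
            }
            out.add(result.data);
            progress = result.progress; // reuse the progress from the previous result
            bufferIndex++;
        }
        return out;
    }

    public static void main(String[] args) {
        String[] file = {"a", "b", "c"};
        // Toy reader: the progress records the offset at which the next read starts.
        SimpleReader reader = (index, previous) -> {
            int offset = previous == null ? 0 : previous.nextOffset;
            if (offset >= file.length) {
                return null;
            }
            return new Result(file[offset], new Progress(offset + 1));
        };
        System.out.println(readAll(reader)); // prints [a, b, c]
    }
}
```

      In this sketch the caller treats the progress as opaque and never inspects it, which mirrors the note that the read progress is tied to the file format rather than to the caller.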
    • getPriority

      long getPriority(TieredStoragePartitionId partitionId, TieredStorageSubpartitionId subpartitionId, int segmentId, int bufferIndex, @Nullable PartitionFileReader.ReadProgress readProgress) throws IOException
      Get the priority for reading a particular buffer from the partition file. A smaller value means a higher priority: it is suggested to read buffers with higher priority (smaller value) before buffers with lower priority (larger value).

      Depending on the partition file implementation, following the suggestions should typically result in better performance and efficiency. This can be achieved by e.g. choosing preloaded data over others, optimizing the order of disk access to be more sequential, etc.

      Note: Priorities are suggestions rather than requirements. The caller can still read data in whichever order it wants.

      Parameters:
      partitionId - the partition id of the buffer
      subpartitionId - the subpartition id of the buffer
      segmentId - the segment id of the buffer
      bufferIndex - the index of the buffer
      readProgress - the current read progress, taken from the previous ReadBufferResult. The read progress should be implemented and provided by Flink, and it should be tied directly to the file format. It can be null if the current file reader has no read progress.
      Returns:
      the priority for reading the specified buffer with this PartitionFileReader; a smaller value means a higher priority.
      Throws:
      IOException - if an error happens.
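
      As an illustration of how a caller might act on these priorities, the sketch below orders pending reads so that the smallest priority value comes first. PendingRead and suggestedOrder are hypothetical names introduced here, not part of Flink, and the priority values are made up; in real code each value would come from a getPriority call. Using a priority queue is one possible choice, not necessarily what Flink's scheduler does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class PrioritySketch {

    /** Hypothetical stand-in for a pending read request; not a Flink type. */
    static final class PendingRead {
        final int subpartitionId;
        final int bufferIndex;
        final long priority; // assumed to be the value returned by getPriority(...)

        PendingRead(int subpartitionId, int bufferIndex, long priority) {
            this.subpartitionId = subpartitionId;
            this.bufferIndex = bufferIndex;
            this.priority = priority;
        }
    }

    /** Returns the requests in suggested read order: smallest priority value first. */
    static List<PendingRead> suggestedOrder(List<PendingRead> pending) {
        PriorityQueue<PendingRead> queue =
                new PriorityQueue<>(Comparator.comparingLong((PendingRead r) -> r.priority));
        queue.addAll(pending);
        List<PendingRead> ordered = new ArrayList<>();
        while (!queue.isEmpty()) {
            ordered.add(queue.poll());
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Made-up priorities, e.g. file offsets, so that lower values are read first.
        List<PendingRead> pending = Arrays.asList(
                new PendingRead(0, 0, 4096L),
                new PendingRead(1, 0, 0L),
                new PendingRead(2, 0, 1024L));
        for (PendingRead read : suggestedOrder(pending)) {
            System.out.println("subpartition " + read.subpartitionId + ", priority " + read.priority);
        }
    }
}
```

      Because priorities are only suggestions, a caller remains free to skip this ordering, for example when a downstream consumer is blocked on one specific subpartition.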
    • release

      void release()
      Release the PartitionFileReader.