Class TransientBlobCache

java.lang.Object
org.apache.flink.runtime.blob.AbstractBlobCache
org.apache.flink.runtime.blob.TransientBlobCache
All Implemented Interfaces:
Closeable, AutoCloseable, TransientBlobService

public class TransientBlobCache extends AbstractBlobCache implements TransientBlobService
Provides access to transient BLOB files stored at the BlobServer.

TODO: make this truly transient by returning file streams to a local copy, removing the remote copy upon retrieval and deleting the local copy at the end of the stream.
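
The local half of that TODO (deleting the local copy once its stream ends) could look roughly like the following. This is a minimal JDK-only sketch, not Flink code: the class and method names are hypothetical, and removing the remote copy is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, not part of Flink. */
final class DeleteOnCloseSketch {

    /**
     * Opens the local copy for reading. DELETE_ON_CLOSE asks the JDK to make
     * a best-effort attempt to delete the file when the stream is closed.
     */
    static InputStream openTransient(Path localCopy) throws IOException {
        return Files.newInputStream(localCopy, StandardOpenOption.DELETE_ON_CLOSE);
    }

    public static void main(String[] args) throws IOException {
        Path localCopy = Files.createTempFile("transient-blob", ".bin");
        Files.write(localCopy, "payload".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = openTransient(localCopy)) {
            System.out.println("bytes read: " + in.readAllBytes().length);
        }
        System.out.println("still on disk: " + Files.exists(localCopy));
    }
}
```

With this approach a caller gets one pass over the data; anyone who needs the file again would have to fetch it anew.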

  • Constructor Details

    • TransientBlobCache

      @VisibleForTesting public TransientBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, File storageDir, @Nullable InetSocketAddress serverAddress) throws IOException
      Throws:
      IOException
    • TransientBlobCache

      public TransientBlobCache(org.apache.flink.configuration.Configuration blobClientConfig, org.apache.flink.util.Reference<File> storageDir, @Nullable InetSocketAddress serverAddress) throws IOException
      Instantiates a new BLOB cache.
      Parameters:
      blobClientConfig - global configuration
      storageDir - storage directory for the cached blobs
      serverAddress - address of the BlobServer to fetch files from, or null if none is known yet
      Throws:
      IOException - thrown if the (local or distributed) file storage cannot be created or is not usable
  • Method Details

    • getFile

      public File getFile(TransientBlobKey key) throws IOException
      Description copied from interface: TransientBlobService
      Returns the path to a local copy of the (job-unrelated) file associated with the provided blob key.
      Specified by:
      getFile in interface TransientBlobService
      Parameters:
      key - blob key associated with the requested file
      Returns:
      The path to the file.
      Throws:
      FileNotFoundException - if the path does not exist
      IOException - if any other error occurs when retrieving the file
    • getFile

      public File getFile(org.apache.flink.api.common.JobID jobId, TransientBlobKey key) throws IOException
      Description copied from interface: TransientBlobService
      Returns the path to a local copy of the file associated with the provided job ID and blob key.
      Specified by:
      getFile in interface TransientBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      key - blob key associated with the requested file
      Returns:
      The path to the file.
      Throws:
      FileNotFoundException - if the path does not exist
      IOException - if any other error occurs when retrieving the file
    • getFileInternal

      protected File getFileInternal(@Nullable org.apache.flink.api.common.JobID jobId, BlobKey blobKey) throws IOException
      Description copied from class: AbstractBlobCache
      Returns a local copy of the file for the BLOB with the given key.

      The method first attempts to serve the BLOB from its local cache. If the BLOB is not in the cache, the method tries to download it from this cache's BLOB server, either via a distributed BLOB store (if available) or by direct end-to-end download.

      Overrides:
      getFileInternal in class AbstractBlobCache
      Parameters:
      jobId - ID of the job this blob belongs to (or null if job-unrelated)
      blobKey - The key of the desired BLOB.
      Returns:
      file referring to the local storage location of the BLOB.
      Throws:
      IOException - Thrown if an I/O error occurs while downloading the BLOBs from the BLOB server.
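
The cache-first lookup described for getFileInternal can be sketched without any Flink dependency. Everything below is an assumption made for illustration: the RemoteBlobSource interface, the "blob_" file prefix, and the string keys are hypothetical stand-ins, and the sketch folds the distributed store and the direct download into a single remote source. It also ignores concurrency.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative cache-first lookup; not Flink's implementation. */
final class CacheFirstLookupSketch {

    /** Hypothetical stand-in for whatever serves the BLOB remotely. */
    interface RemoteBlobSource {
        InputStream open(String key) throws IOException;
    }

    private final Path storageDir;
    private final RemoteBlobSource remote;

    CacheFirstLookupSketch(Path storageDir, RemoteBlobSource remote) {
        this.storageDir = storageDir;
        this.remote = remote;
    }

    Path getFile(String key) throws IOException {
        Path local = storageDir.resolve("blob_" + key);
        // Step 1: serve from the local cache if the file is already there.
        if (Files.exists(local)) {
            return local;
        }
        // Step 2: download into a temporary file first, so a failed transfer
        // never leaves a partial file under the final name.
        Path incoming = Files.createTempFile(storageDir, "incoming-", ".tmp");
        try (InputStream in = remote.open(key)) {
            Files.copy(in, incoming, StandardCopyOption.REPLACE_EXISTING);
            Files.move(incoming, local, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up after a failure.
            Files.deleteIfExists(incoming);
        }
        return local;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("blob-sketch");
        CacheFirstLookupSketch cache = new CacheFirstLookupSketch(
                dir, key -> new ByteArrayInputStream(key.getBytes(StandardCharsets.UTF_8)));
        System.out.println(cache.getFile("abc")); // downloaded on first access
        System.out.println(cache.getFile("abc")); // served from the local cache
    }
}
```

Writing to a temporary file and moving it into place is one common way to keep readers from observing half-written files; whether the real cache does the same is not stated on this page.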
    • putTransient

      public TransientBlobKey putTransient(byte[] value) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the (job-unrelated) data of the given byte array to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      value - the buffer to upload
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while uploading the data to the BLOB server
    • putTransient

      public TransientBlobKey putTransient(org.apache.flink.api.common.JobID jobId, byte[] value) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the data of the given byte array for the given job to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      jobId - the ID of the job the BLOB belongs to
      value - the buffer to upload
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while uploading the data to the BLOB server
    • putTransient

      public TransientBlobKey putTransient(InputStream inputStream) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the (job-unrelated) data from the given input stream to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      inputStream - the input stream to read the data from
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while reading the data from the input stream or uploading the data to the BLOB server
    • putTransient

      public TransientBlobKey putTransient(org.apache.flink.api.common.JobID jobId, InputStream inputStream) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the data from the given input stream for the given job to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      inputStream - the input stream to read the data from
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while reading the data from the input stream or uploading the data to the BLOB server
    • deleteFromCache

      public boolean deleteFromCache(TransientBlobKey key)
      Description copied from interface: TransientBlobService
      Deletes the (job-unrelated) file associated with the provided blob key from the local cache.
      Specified by:
      deleteFromCache in interface TransientBlobService
      Parameters:
      key - blob key associated with the file to be deleted
      Returns:
      true if the given blob was successfully deleted or did not exist; false otherwise
    • deleteFromCache

      public boolean deleteFromCache(org.apache.flink.api.common.JobID jobId, TransientBlobKey key)
      Description copied from interface: TransientBlobService
      Deletes the file associated with the provided job ID and blob key from the local cache.
      Specified by:
      deleteFromCache in interface TransientBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      key - blob key associated with the file to be deleted
      Returns:
      true if the given blob was successfully deleted or did not exist; false otherwise
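
Both deleteFromCache variants declare no checked exception and report the outcome through the return value. A minimal JDK-only sketch of that contract, using a hypothetical helper that is not part of Flink, might look like this:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper illustrating the "deleted or non-existing" contract. */
final class DeleteFromCacheSketch {

    /**
     * Returns true if the file is gone afterwards, whether it was deleted
     * now or never existed; returns false if the deletion failed.
     */
    static boolean deleteQuietly(Path file) {
        try {
            Files.deleteIfExists(file);
            return true;
        } catch (IOException e) {
            // For example a non-empty directory or insufficient permissions.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("cached-blob", ".bin");
        System.out.println(deleteQuietly(file)); // file existed and is removed
        System.out.println(deleteQuietly(file)); // file is already absent
    }
}
```

Treating a missing file as success makes the call idempotent, so callers can retry a cleanup without special-casing the second attempt.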
    • getStorageLocation

      @VisibleForTesting public File getStorageLocation(@Nullable org.apache.flink.api.common.JobID jobId, BlobKey key) throws IOException
      Returns a file handle to the file associated with the given blob key on the blob server.
      Parameters:
      jobId - ID of the job this blob belongs to (or null if job-unrelated)
      key - blob key identifying the file
      Returns:
      file handle to the file
      Throws:
      IOException - if creating the directory fails
    • cancelCleanupTask

      protected void cancelCleanupTask()
      Description copied from class: AbstractBlobCache
      Cancels any cleanup task that subclasses may be executing.

      This is called during AbstractBlobCache.close().

      Specified by:
      cancelCleanupTask in class AbstractBlobCache