Class BlobServer

java.lang.Object
java.lang.Thread
org.apache.flink.runtime.blob.BlobServer
All Implemented Interfaces:
Closeable, AutoCloseable, Runnable, BlobService, BlobWriter, PermanentBlobService, TransientBlobService, GloballyCleanableResource, LocallyCleanableResource

This class implements the BLOB server. The BLOB server is responsible for listening for incoming requests and spawning threads to handle these requests. Furthermore, it takes care of creating the directory structure to store the BLOBs or temporarily cache them.
  • Constructor Details

    • BlobServer

      @VisibleForTesting public BlobServer(org.apache.flink.configuration.Configuration config, File storageDir, BlobStore blobStore) throws IOException
      Throws:
      IOException
    • BlobServer

      public BlobServer(org.apache.flink.configuration.Configuration config, org.apache.flink.util.Reference<File> storageDir, BlobStore blobStore) throws IOException
      Instantiates a new BLOB server and binds it to a free network port.
      Parameters:
      config - Configuration to be used to instantiate the BlobServer
      storageDir - storage directory for the blobs
      blobStore - BlobStore to store blobs persistently
      Throws:
      IOException - thrown if the BLOB server cannot bind to a free network port or if the (local or distributed) file storage cannot be created or is not usable
  • Method Details

    • getStorageDir

      public File getStorageDir()
    • getStorageLocation

      @VisibleForTesting public File getStorageLocation(@Nullable org.apache.flink.api.common.JobID jobId, BlobKey key) throws IOException
      Returns a file handle to the file associated with the given blob key on the blob server.

      This is only called from BlobServerConnection or unit tests.

      Parameters:
      jobId - ID of the job this blob belongs to (or null if job-unrelated)
      key - identifying the file
      Returns:
      file handle to the file
      Throws:
      IOException - if creating the directory fails
    • run

      public void run()
      Specified by:
      run in interface Runnable
      Overrides:
      run in class Thread
    • close

      public void close() throws IOException
      Shuts down the BLOB server.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
    • createClient

      protected BlobClient createClient() throws IOException
      Throws:
      IOException
    • getFile

      public File getFile(TransientBlobKey key) throws IOException
      Retrieves the local path of a (job-unrelated) file associated with a job and a blob key.

      The blob server looks the blob key up in its local storage. If the file exists, it is returned. If the file does not exist, it is retrieved from the HA blob store (if available) or a FileNotFoundException is thrown.

      Specified by:
      getFile in interface TransientBlobService
      Parameters:
      key - blob key associated with the requested file
      Returns:
      file referring to the local storage location of the BLOB
      Throws:
      IOException - Thrown if the file retrieval failed.
    • getFile

      public File getFile(org.apache.flink.api.common.JobID jobId, TransientBlobKey key) throws IOException
      Retrieves the local path of a file associated with a job and a blob key.

      The blob server looks the blob key up in its local storage. If the file exists, it is returned. If the file does not exist, it is retrieved from the HA blob store (if available) or a FileNotFoundException is thrown.

      Specified by:
      getFile in interface TransientBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      key - blob key associated with the requested file
      Returns:
      file referring to the local storage location of the BLOB
      Throws:
      IOException - Thrown if the file retrieval failed.
    • getFile

      public File getFile(org.apache.flink.api.common.JobID jobId, PermanentBlobKey key) throws IOException
      Returns the path to a local copy of the file associated with the provided job ID and blob key.

      We will first attempt to serve the BLOB from the local storage. If the BLOB is not in there, we will try to download it from the HA store.

      Specified by:
      getFile in interface PermanentBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      key - blob key associated with the requested file
      Returns:
      The path to the file.
      Throws:
      FileNotFoundException - if the BLOB does not exist;
      IOException - if any other error occurs when retrieving the file
    • putTransient

      public TransientBlobKey putTransient(byte[] value) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the (job-unrelated) data of the given byte array to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      value - the buffer to upload
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while uploading the data to the BLOB server
    • putTransient

      public TransientBlobKey putTransient(org.apache.flink.api.common.JobID jobId, byte[] value) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the data of the given byte array for the given job to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      jobId - the ID of the job the BLOB belongs to
      value - the buffer to upload
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while uploading the data to the BLOB server
    • putTransient

      public TransientBlobKey putTransient(InputStream inputStream) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the (job-unrelated) data from the given input stream to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      inputStream - the input stream to read the data from
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while reading the data from the input stream or uploading the data to the BLOB server
    • putTransient

      public TransientBlobKey putTransient(org.apache.flink.api.common.JobID jobId, InputStream inputStream) throws IOException
      Description copied from interface: TransientBlobService
      Uploads the data from the given input stream for the given job to the BLOB server.
      Specified by:
      putTransient in interface TransientBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      inputStream - the input stream to read the data from
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while reading the data from the input stream or uploading the data to the BLOB server
    • putPermanent

      public PermanentBlobKey putPermanent(org.apache.flink.api.common.JobID jobId, byte[] value) throws IOException
      Description copied from interface: BlobWriter
      Uploads the data of the given byte array for the given job to the BLOB server and makes it a permanent BLOB.
      Specified by:
      putPermanent in interface BlobWriter
      Parameters:
      jobId - the ID of the job the BLOB belongs to
      value - the buffer to upload
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while writing it to a local file, or uploading it to the HA store
    • putPermanent

      public PermanentBlobKey putPermanent(org.apache.flink.api.common.JobID jobId, InputStream inputStream) throws IOException
      Description copied from interface: BlobWriter
      Uploads the data from the given input stream for the given job to the BLOB server and makes it a permanent BLOB.
      Specified by:
      putPermanent in interface BlobWriter
      Parameters:
      jobId - ID of the job this blob belongs to
      inputStream - the input stream to read the data from
      Returns:
      the computed BLOB key identifying the BLOB on the server
      Throws:
      IOException - thrown if an I/O error occurs while reading the data from the input stream, writing it to a local file, or uploading it to the HA store
    • deleteFromCache

      public boolean deleteFromCache(TransientBlobKey key)
      Deletes the (job-unrelated) file associated with the blob key in the local storage of the blob server.
      Specified by:
      deleteFromCache in interface TransientBlobService
      Parameters:
      key - blob key associated with the file to be deleted
      Returns:
      true if the given blob is successfully deleted or non-existing; false otherwise
    • deleteFromCache

      public boolean deleteFromCache(org.apache.flink.api.common.JobID jobId, TransientBlobKey key)
      Deletes the file associated with the blob key in the local storage of the blob server.
      Specified by:
      deleteFromCache in interface TransientBlobService
      Parameters:
      jobId - ID of the job this blob belongs to
      key - blob key associated with the file to be deleted
      Returns:
      true if the given blob is successfully deleted or non-existing; false otherwise
    • deletePermanent

      public boolean deletePermanent(org.apache.flink.api.common.JobID jobId, PermanentBlobKey key)
      Delete the uploaded data with the given JobID and PermanentBlobKey.
      Specified by:
      deletePermanent in interface BlobWriter
      Parameters:
      jobId - ID of the job this blob belongs to
      key - the key of this blob
    • localCleanupAsync

      public CompletableFuture<Void> localCleanupAsync(org.apache.flink.api.common.JobID jobId, Executor cleanupExecutor)
      Deletes locally stored artifacts for the job represented by the given JobID. This doesn't touch the job's entry in the BlobStore to enable recovering.
      Specified by:
      localCleanupAsync in interface LocallyCleanableResource
      Parameters:
      jobId - The JobID of the job that is subject to cleanup.
      cleanupExecutor - The fallback executor for IO-heavy operations.
      Returns:
      The cleanup result future.
    • globalCleanupAsync

      public CompletableFuture<Void> globalCleanupAsync(org.apache.flink.api.common.JobID jobId, Executor executor)
      Removes all BLOBs from local and HA store belonging to the given JobID.
      Specified by:
      globalCleanupAsync in interface GloballyCleanableResource
      Parameters:
      jobId - ID of the job this blob belongs to
      executor - The fallback executor for IO-heavy operations.
      Returns:
      The cleanup result future.
    • retainJobs

      public void retainJobs(Collection<org.apache.flink.api.common.JobID> jobsToRetain, Executor ioExecutor) throws IOException
      Throws:
      IOException
    • getPermanentBlobService

      public PermanentBlobService getPermanentBlobService()
      Description copied from interface: BlobService
      Returns a BLOB service for accessing permanent BLOBs.
      Specified by:
      getPermanentBlobService in interface BlobService
      Returns:
      BLOB service
    • getTransientBlobService

      public TransientBlobService getTransientBlobService()
      Description copied from interface: BlobService
      Returns a BLOB service for accessing transient BLOBs.
      Specified by:
      getTransientBlobService in interface BlobService
      Returns:
      BLOB service
    • getMinOffloadingSize

      public final int getMinOffloadingSize()
      Returns the configuration used by the BLOB server.
      Specified by:
      getMinOffloadingSize in interface BlobWriter
      Returns:
      configuration
    • getPort

      public int getPort()
      Returns the port on which the server is listening.
      Specified by:
      getPort in interface BlobService
      Returns:
      port on which the server is listening
    • isShutdown

      public boolean isShutdown()
      Tests whether the BLOB server has been requested to shut down.
      Returns:
      True, if the server has been requested to shut down, false otherwise.