Interface StateBackend

All Superinterfaces:
Serializable
All Known Subinterfaces:
ConfigurableStateBackend, DelegatingStateBackend
All Known Implementing Classes:
AbstractFileStateBackend, AbstractManagedMemoryStateBackend, AbstractStateBackend, BatchExecutionStateBackend, HashMapStateBackend

@PublicEvolving public interface StateBackend extends Serializable
A State Backend defines how the state of a streaming application is stored locally within the cluster. Different State Backends store their state in different fashions, and use different data structures to hold the state of a running application.

For example, the hashmap state backend keeps working state in the memory of the TaskManager. The backend is lightweight and has no additional dependencies.

The EmbeddedRocksDBStateBackend stores working state in an embedded RocksDB and is able to scale working state to many terabytes in size, only limited by available disk space across all task managers.

Raw Bytes Storage and Backends

The StateBackend creates services for keyed state and operator state.

The CheckpointableKeyedStateBackend and OperatorStateBackend created by this state backend define how to hold the working state for keys and operators. They also define how to checkpoint that state, frequently using the raw bytes storage (via the CheckpointStreamFactory). However, a keyed state backend may, for example, simply implement the bridge to a key/value store and not need to store anything in the raw bytes storage upon a checkpoint.

Serializability

State Backends need to be serializable, because they are distributed across parallel processes (for distributed execution) together with the streaming application code.

Because of that, StateBackend implementations (typically subclasses of AbstractStateBackend) are meant to act as factories that create the proper state stores, which provide access to the persistent storage and hold the keyed and operator state data structures. That way, the State Backend can be very lightweight (containing only configuration), which makes it easier to serialize.
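The factory pattern described above can be sketched without any Flink dependency. The classes below (SketchBackendFactory, SketchKeyedStore) are hypothetical stand-ins, not Flink types: the factory carries only a configuration value and is serializable, while the store it creates holds the actual data and is never shipped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a keyed state store; NOT a Flink type.
// It holds the working data and is deliberately not Serializable.
final class SketchKeyedStore<K> {
    final int initialCapacity;
    private final Map<K, Object> values;

    SketchKeyedStore(int initialCapacity) {
        this.initialCapacity = initialCapacity;
        this.values = new HashMap<>(initialCapacity);
    }

    void put(K key, Object value) {
        values.put(key, value);
    }

    Object get(K key) {
        return values.get(key);
    }
}

// Hypothetical lightweight factory: it contains configuration only,
// so serializing it is cheap and straightforward.
final class SketchBackendFactory implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int initialCapacity;

    SketchBackendFactory(int initialCapacity) {
        this.initialCapacity = initialCapacity;
    }

    // The store is created on the receiving side, after deserialization.
    <K> SketchKeyedStore<K> createKeyedStore() {
        return new SketchKeyedStore<>(initialCapacity);
    }

    // Serializes and deserializes the factory, mimicking shipment to another process.
    static SketchBackendFactory roundTrip(SketchBackendFactory factory)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(factory);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchBackendFactory) in.readObject();
        }
    }
}
```

In this sketch only the configuration crosses the process boundary; the data structures come into existence where the state is actually used.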

Thread Safety

State backend implementations have to be thread-safe. Multiple threads may be creating keyed-/operator state backends concurrently.
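One common way to satisfy this requirement is to keep the factory immutable, so that concurrent calls share no mutable state. The following is a minimal sketch under that assumption; SketchThreadSafeFactory and SketchStore are hypothetical names and do not correspond to Flink classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a created backend; NOT a Flink type.
final class SketchStore {
    final String location;

    SketchStore(String location) {
        this.location = location;
    }
}

// Hypothetical factory with only final fields: creating stores touches
// no shared mutable state, so concurrent calls need no locking.
final class SketchThreadSafeFactory {
    private final String base;

    SketchThreadSafeFactory(String base) {
        this.base = base;
    }

    SketchStore createStore(String operatorId) {
        return new SketchStore(base + "/" + operatorId);
    }

    // Calls createStore from several threads and returns how many
    // distinct store instances were produced.
    static int createConcurrently(SketchThreadSafeFactory factory, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<SketchStore>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                final String operatorId = "op-" + i;
                Callable<SketchStore> task = () -> factory.createStore(operatorId);
                futures.add(pool.submit(task));
            }
            // SketchStore does not override equals, so the set compares by identity.
            Set<SketchStore> distinct = new HashSet<>();
            for (Future<SketchStore> future : futures) {
                distinct.add(future.get());
            }
            return distinct.size();
        } finally {
            pool.shutdown();
        }
    }
}
```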

  • Method Details

    • getName

      default String getName()
      Returns the name of this backend; the default is the simple class name. A DelegatingStateBackend may return the simple class name of the delegated backend.
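The naming behavior described for getName can be illustrated with plain Java default methods. The types below are hypothetical stand-ins (including the getDelegate accessor) and only mirror the documented behavior; they are not the Flink interfaces themselves.

```java
// Hypothetical stand-in for a backend interface; NOT a Flink type.
interface SketchBackend {
    // Default: the simple class name of the implementing class.
    default String getName() {
        return getClass().getSimpleName();
    }
}

// Hypothetical delegating variant: it reports the name of the wrapped backend.
interface SketchDelegatingBackend extends SketchBackend {
    SketchBackend getDelegate(); // assumed accessor name, for illustration only

    @Override
    default String getName() {
        return getDelegate().getName();
    }
}

final class SketchHeapBackend implements SketchBackend {
}

final class SketchLoggingBackend implements SketchDelegatingBackend {
    private final SketchBackend delegate;

    SketchLoggingBackend(SketchBackend delegate) {
        this.delegate = delegate;
    }

    @Override
    public SketchBackend getDelegate() {
        return delegate;
    }
}
```

With these definitions, both a SketchHeapBackend and a SketchLoggingBackend wrapping it report the same name, which keeps logs and metrics keyed by the backend that actually holds the state.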
    • createKeyedStateBackend

      <K> CheckpointableKeyedStateBackend<K> createKeyedStateBackend(StateBackend.KeyedStateBackendParameters<K> parameters) throws Exception
      Creates a new CheckpointableKeyedStateBackend that is responsible for holding keyed state and checkpointing it.

      Keyed State is state where each value is bound to a key.

      Type Parameters:
      K - The type of the keys by which the state is organized.
      Parameters:
      parameters - The arguments bundle for creating CheckpointableKeyedStateBackend.
      Returns:
      The Keyed State Backend for the given job, operator, and key group range.
      Throws:
      Exception - This method may forward all exceptions that occur while instantiating the backend.
    • createAsyncKeyedStateBackend

      @Experimental default <K> AsyncKeyedStateBackend<K> createAsyncKeyedStateBackend(StateBackend.KeyedStateBackendParameters<K> parameters) throws Exception
      Creates a new AsyncKeyedStateBackend that supports asynchronous access to keyed state.

      Keyed State is state where each value is bound to a key.

      Type Parameters:
      K - The type of the keys by which the state is organized.
      Parameters:
      parameters - The arguments bundle for creating AsyncKeyedStateBackend.
      Returns:
      The Async Keyed State Backend for the given job and operator.
      Throws:
      Exception - This method may forward all exceptions that occur while instantiating the backend.
    • supportsAsyncKeyedStateBackend

      @Experimental default boolean supportsAsyncKeyedStateBackend()
      Tells if a state backend supports the AsyncKeyedStateBackend.

      If a state backend supports AsyncKeyedStateBackend, createAsyncKeyedStateBackend(KeyedStateBackendParameters) can be used to create an async keyed state backend that accesses keyed state asynchronously.

      Returns:
      Whether the state backend supports AsyncKeyedStateBackend.
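A caller would typically check this capability flag before requesting an async backend. The sketch below shows that pattern with hypothetical stand-in types; the String return values only label which path was taken, and the choice to throw UnsupportedOperationException from the default method is an assumption made for illustration.

```java
// Hypothetical stand-in; NOT a Flink type. Method names mirror the documented
// ones, but parameters and return types are simplified to Strings.
interface SketchAsyncAwareBackend {
    String createKeyedStateBackend(String operatorId) throws Exception;

    default boolean supportsAsyncKeyedStateBackend() {
        return false;
    }

    // Assumption for this sketch: the default implementation rejects the call.
    default String createAsyncKeyedStateBackend(String operatorId) throws Exception {
        throw new UnsupportedOperationException(
                getClass().getSimpleName() + " has no async keyed state backend");
    }
}

final class SketchSyncOnlyBackend implements SketchAsyncAwareBackend {
    @Override
    public String createKeyedStateBackend(String operatorId) {
        return "sync:" + operatorId;
    }
}

final class SketchAsyncBackend implements SketchAsyncAwareBackend {
    @Override
    public String createKeyedStateBackend(String operatorId) {
        return "sync:" + operatorId;
    }

    @Override
    public boolean supportsAsyncKeyedStateBackend() {
        return true;
    }

    @Override
    public String createAsyncKeyedStateBackend(String operatorId) {
        return "async:" + operatorId;
    }
}

final class SketchBackendSelector {
    // Uses the async path only when the backend advertises support for it.
    static String open(SketchAsyncAwareBackend backend, String operatorId) throws Exception {
        return backend.supportsAsyncKeyedStateBackend()
                ? backend.createAsyncKeyedStateBackend(operatorId)
                : backend.createKeyedStateBackend(operatorId);
    }
}
```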
    • createOperatorStateBackend

      OperatorStateBackend createOperatorStateBackend(StateBackend.OperatorStateBackendParameters parameters) throws Exception
      Creates a new OperatorStateBackend that can be used for storing operator state.

      Operator state is state that is associated with parallel operator (or function) instances, rather than with keys.

      Parameters:
      parameters - The arguments bundle for creating OperatorStateBackend.
      Returns:
      The OperatorStateBackend for the operator identified by the job and operator identifier.
      Throws:
      Exception - This method may forward all exceptions that occur while instantiating the backend.
    • useManagedMemory

      default boolean useManagedMemory()
      Whether the state backend uses Flink's managed memory.
    • supportsNoClaimRestoreMode

      default boolean supportsNoClaimRestoreMode()
      Tells if a state backend supports the RecoveryClaimMode.NO_CLAIM mode.

      If a state backend supports NO_CLAIM mode, it should create an independent snapshot when it receives CheckpointType.FULL_CHECKPOINT in Snapshotable.snapshot(long, long, CheckpointStreamFactory, CheckpointOptions).

      Returns:
      Whether the state backend supports RecoveryClaimMode.NO_CLAIM mode.
    • supportsSavepointFormat

      default boolean supportsSavepointFormat(org.apache.flink.core.execution.SavepointFormatType formatType)