Enum SubtaskStateMapper
- All Implemented Interfaces:
Serializable, Comparable<SubtaskStateMapper>
SubtaskStateMapper narrows down the subtasks that need to be read during rescaling to
recover from a particular subtask when in-flight data has been stored in the checkpoint.
Mappings of old subtasks to new subtasks may be unique or non-unique. A unique assignment
means that a particular old subtask is assigned to exactly one new subtask. Non-unique
assignments require filtering downstream: for each deserialized record, the receiver side has to
verify whether the record truly belongs to the new subtask. Most
SubtaskStateMapper constants produce only unique assignments and are thus optimal. Some rescalers,
such as RANGE, create a mixture of unique and non-unique mappings, where downstream
tasks need to filter on some mapped subtasks.
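The distinction between unique and non-unique assignments can be illustrated with a small standalone sketch. It does not use any Flink classes: the class `MappingUniqueness`, the method `nonUniqueOldSubtasks`, and the plain `int[][]` representation of a new-to-old mapping are assumptions made for illustration only. The input is taken from the RANGE example documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: finds old subtasks that are read by several new subtasks.
final class MappingUniqueness {

    /**
     * newToOld[newIndex] lists the old subtask indexes read by that new subtask.
     * Returns the old subtask indexes that appear for more than one new subtask,
     * i.e. the non-unique assignments whose records would need downstream filtering.
     */
    static List<Integer> nonUniqueOldSubtasks(int[][] newToOld, int oldParallelism) {
        int[] readers = new int[oldParallelism];
        for (int[] oldIndexes : newToOld) {
            for (int oldIndex : oldIndexes) {
                readers[oldIndex]++;
            }
        }
        List<Integer> shared = new ArrayList<>();
        for (int oldIndex = 0; oldIndex < oldParallelism; oldIndex++) {
            if (readers[oldIndex] > 1) {
                shared.add(oldIndex);
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        // RANGE example: new subtask 0 reads old 0 + 1, new subtask 1 reads old 1 + 2.
        int[][] rangeMapping = {{0, 1}, {1, 2}};
        System.out.println(nonUniqueOldSubtasks(rangeMapping, 3)); // prints [1]
    }
}
```

In this example only old subtask 1 is shared, so only data recovered from it would have to be filtered.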
-
Enum Constant Summary
Enum Constants
Enum Constant - Description
ARBITRARY - Extra state is redistributed to other subtasks without any specific guarantee (only that up- and downstream are matched).
FIRST - Restores extra subtasks to the first subtask.
FULL - Replicates the state to all subtasks.
RANGE - Remaps old ranges to new ranges.
ROUND_ROBIN - Redistributes subtask state in a round robin fashion.
UNSUPPORTED
-
Method Summary
Modifier and Type, Method - Description
getNewToOldSubtasksMapping(int oldParallelism, int newParallelism) - Returns a mapping from new subtask index to all old subtask indexes.
abstract int[] getOldSubtasks(int newSubtaskIndex, int oldNumberOfSubtasks, int newNumberOfSubtasks) - Returns all old subtask indexes that need to be read to restore all buffers for the given new subtask index on rescale.
boolean isAmbiguous() - Returns true iff this mapper can potentially lead to ambiguous mappings where different new subtasks map to the same old subtask.
static SubtaskStateMapper valueOf(String name) - Returns the enum constant of this type with the specified name.
static SubtaskStateMapper[] values() - Returns an array containing the constants of this enum type, in the order they are declared.
-
Enum Constant Details
-
ARBITRARY
Extra state is redistributed to other subtasks without any specific guarantee (only that up- and downstream are matched). -
FIRST
Restores extra subtasks to the first subtask. -
FULL
Replicates the state to all subtasks. This rescaling causes a huge overhead and completely relies on filtering the data downstream. This strategy should only be used as a fallback.
-
RANGE
Remaps old ranges to new ranges. For minor rescaling that means that new subtasks are mostly assigned 2 old subtasks.
Example:
old assignment: 0 -> [0;43); 1 -> [43;87); 2 -> [87;128)
new assignment: 0 -> [0;64); 1 -> [64;128)
Subtask 0 recovers data from old subtasks 0 + 1, and subtask 1 recovers data from old subtasks 1 + 2.
For all downscales from n to [n-1 .. n/2], each new subtask gets exactly two old subtasks assigned.
For all upscales from n to [n+1 .. 2*n-1], most subtasks get two old subtasks assigned, except the two outermost.
Larger scale factors (<n/2, >2*n) will increase the number of old subtasks accordingly. However, they will also create more unique assignments, where an old subtask is exclusively assigned to a new subtask. Thus, the number of non-unique mappings is upper bounded by 2*n.
-
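For the RANGE example above, the old subtasks that a new subtask has to read are those whose range overlaps the new range. The following is a minimal sketch of that overlap test, not Flink's implementation: the class `RangeOverlapSketch`, the method `oldSubtasksFor`, and the `{startInclusive, endExclusive}` array layout are hypothetical, and the range boundaries are copied from the example rather than computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of Flink: intersects half-open ranges [start; end).
final class RangeOverlapSketch {

    /**
     * Each range is {startInclusive, endExclusive}; oldRanges[i] belongs to old subtask i.
     * Returns the indexes of all old subtasks whose range overlaps newRange.
     */
    static int[] oldSubtasksFor(int[] newRange, int[][] oldRanges) {
        List<Integer> result = new ArrayList<>();
        for (int old = 0; old < oldRanges.length; old++) {
            // Two half-open ranges overlap iff each one starts before the other ends.
            boolean overlaps = oldRanges[old][0] < newRange[1] && newRange[0] < oldRanges[old][1];
            if (overlaps) {
                result.add(old);
            }
        }
        return result.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        int[][] oldRanges = {{0, 43}, {43, 87}, {87, 128}}; // old assignment from the example
        int[][] newRanges = {{0, 64}, {64, 128}};           // new assignment from the example
        for (int newIndex = 0; newIndex < newRanges.length; newIndex++) {
            System.out.println(newIndex + " -> "
                    + Arrays.toString(oldSubtasksFor(newRanges[newIndex], oldRanges)));
        }
        // prints:
        // 0 -> [0, 1]
        // 1 -> [1, 2]
    }
}
```

Old subtask 1 appears for both new subtasks, which is the non-unique part of the mapping that downstream tasks have to filter.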
ROUND_ROBIN
Redistributes subtask state in a round robin fashion. Returns a mapping of newIndex -> oldIndexes. The mapping is accessed by using Bitset oldIndexes = mapping.get(newIndex).
For oldParallelism < newParallelism, that mapping is trivial. For example, if oldParallelism = 6 and newParallelism = 10:
New index    Old indexes
0            0
1            1
...          ...
5            5
6
...
9
For oldParallelism > newParallelism, new indexes get multiple assignments by wrapping around assignments in a round-robin fashion. For example, if oldParallelism = 10 and newParallelism = 4:
New index    Old indexes
0            0, 4, 8
1            1, 5, 9
2            2, 6
3            3, 7
-
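The two ROUND_ROBIN tables above are consistent with assigning old subtask i to new subtask i % newParallelism. The sketch below reproduces them under that assumption. The class `RoundRobinSketch` and its method are hypothetical and only mirror the parameter order of getOldSubtasks; this is not Flink's implementation.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical sketch, not part of Flink: round-robin assignment of old to new subtasks.
final class RoundRobinSketch {

    /** Returns all old subtask indexes i with i % newParallelism == newIndex, in ascending order. */
    static int[] oldSubtasksFor(int newIndex, int oldParallelism, int newParallelism) {
        return IntStream.range(0, oldParallelism)
                .filter(oldIndex -> oldIndex % newParallelism == newIndex)
                .toArray();
    }

    public static void main(String[] args) {
        // Downscale 10 -> 4: assignments wrap around.
        System.out.println(Arrays.toString(oldSubtasksFor(0, 10, 4))); // prints [0, 4, 8]
        System.out.println(Arrays.toString(oldSubtasksFor(3, 10, 4))); // prints [3, 7]
        // Upscale 6 -> 10: new subtasks 6..9 have nothing to read.
        System.out.println(Arrays.toString(oldSubtasksFor(5, 6, 10))); // prints [5]
        System.out.println(Arrays.toString(oldSubtasksFor(6, 6, 10))); // prints []
    }
}
```

Because every old index lands in exactly one residue class, each old subtask is read by a single new subtask in this scheme, so the assignment is unique.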
UNSUPPORTED
-
-
Method Details
-
values
Returns an array containing the constants of this enum type, in the order they are declared.
- Returns:
- an array containing the constants of this enum type, in the order they are declared
-
valueOf
Returns the enum constant of this type with the specified name. The string must match exactly an identifier used to declare an enum constant in this type. (Extraneous whitespace characters are not permitted.)
- Parameters:
name - the name of the enum constant to be returned.
- Returns:
- the enum constant with the specified name
- Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
-
getOldSubtasks
public abstract int[] getOldSubtasks(int newSubtaskIndex, int oldNumberOfSubtasks, int newNumberOfSubtasks)
Returns all old subtask indexes that need to be read to restore all buffers for the given new subtask index on rescale.
-
getNewToOldSubtasksMapping
Returns a mapping from new subtask index to all old subtask indexes. -
isAmbiguous
public boolean isAmbiguous()
Returns true iff this mapper can potentially lead to ambiguous mappings where different new subtasks map to the same old subtask. The assumption is that such replicated data needs to be filtered.
-