All Implemented Interfaces: DevicePlugin, DevicePluginScheduler

public class NvidiaGPUPluginForRuntimeV2 extends java.lang.Object implements DevicePlugin, DevicePluginScheduler
| Modifier and Type | Class | Description |
|---|---|---|
| static class | NvidiaGPUPluginForRuntimeV2.DeviceLinkType | Different types of link. |
| class | NvidiaGPUPluginForRuntimeV2.NvidiaCommandExecutor | A shell wrapper class that is easy to test. |
| Modifier and Type | Field | Description |
|---|---|---|
| static org.slf4j.Logger | LOG | |
| static java.lang.String | NV_RESOURCE_NAME | |
| static java.lang.String | TOPOLOGY_POLICY_ENV_KEY | The container can set this environment variable. |
| static java.lang.String | TOPOLOGY_POLICY_PACK | Scheduling policy that prefers faster GPU-GPU communication. |
| static java.lang.String | TOPOLOGY_POLICY_SPREAD | Scheduling policy that prefers faster CPU-GPU communication. |
| Constructor | Description |
|---|---|
| NvidiaGPUPluginForRuntimeV2() | |
| Modifier and Type | Method | Description |
|---|---|---|
| java.util.Set<Device> | allocateDevices(java.util.Set<Device> availableDevices, int count, java.util.Map<java.lang.String,java.lang.String> envs) | Called when allocating devices. |
| void | basicSchedule(java.util.Set<Device> allocation, int count, java.util.Set<Device> availableDevices) | |
| int | computeCostOfDevices(Device[] devices) | The cost function used to calculate the cost of a subset of devices. |
| java.util.Map<java.lang.Integer,java.util.List<java.util.Map.Entry<java.util.Set<Device>,java.lang.Integer>>> | getCostTable() | |
| java.util.Map<java.lang.String,java.lang.Integer> | getDevicePairToWeight() | |
| java.util.Set<Device> | getDevices() | Called when updating node resources. |
| DeviceRegisterRequest | getRegisterRequestInfo() | Called first when the device plugin framework wants to register. |
| void | initCostTable() | |
| boolean | isTopoInitialized() | |
| DeviceRuntimeSpec | onDevicesAllocated(java.util.Set<Device> allocatedDevices, YarnRuntimeType yarnRuntime) | Asks how these devices should be prepared/used before/when the container launches. |
| void | onDevicesReleased(java.util.Set<Device> releasedDevices) | Called after devices are released. |
| void | parseTopo(java.lang.String topo, java.util.Map<java.lang.String,java.lang.Integer> deviceLinkToWeight) | A typical sample topo output:<br>`     GPU0 GPU1 GPU2 GPU3 CPU Affinity`<br>`GPU0 X    PHB  SOC  SOC  0-31`<br>`GPU1 PHB  X    SOC  SOC  0-31`<br>`GPU2 SOC  SOC  X    PHB  0-31`<br>`GPU3 SOC  SOC  PHB  X    0-31`<br>Legend:<br>`X = Self`<br>`SOC = Connection traversing PCIe as well as the SMP link between CPU sockets(e.g.` |
| void | setPathOfGpuBinary(java.lang.String pOfGpuBinary) | |
| void | setShellExecutor(NvidiaGPUPluginForRuntimeV2.NvidiaCommandExecutor shellExecutor) | |
| void | topologyAwareSchedule(java.util.Set<Device> allocation, int count, java.util.Map<java.lang.String,java.lang.String> envs, java.util.Set<Device> availableDevices, java.util.Map<java.lang.Integer,java.util.List<java.util.Map.Entry<java.util.Set<Device>,java.lang.Integer>>> cTable) | Topology-aware scheduling algorithm. |
public static final org.slf4j.Logger LOG
public static final java.lang.String NV_RESOURCE_NAME
public static final java.lang.String TOPOLOGY_POLICY_ENV_KEY
public static final java.lang.String TOPOLOGY_POLICY_PACK
public static final java.lang.String TOPOLOGY_POLICY_SPREAD
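
The three `TOPOLOGY_POLICY_*` fields describe an environment variable that a container can set to choose between the pack and spread policies. The following is a minimal sketch of how such a lookup might work. The class `TopologyPolicyEnvSketch`, the method `resolvePolicy`, the literal string values, and the fallback to the pack policy are all assumptions made for illustration; this page does not list the constant values or the default.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only. The literal values below are assumptions that
// stand in for the plugin's constants, whose values this page does not list.
final class TopologyPolicyEnvSketch {
  static final String TOPOLOGY_POLICY_ENV_KEY = "NVIDIA_TOPO_POLICY";
  static final String TOPOLOGY_POLICY_PACK = "PACK";
  static final String TOPOLOGY_POLICY_SPREAD = "SPREAD";

  /**
   * Returns the policy requested through the container environment.
   * Falling back to PACK when the variable is unset or unrecognized
   * is an assumption of this sketch.
   */
  static String resolvePolicy(Map<String, String> envs) {
    String requested = envs.get(TOPOLOGY_POLICY_ENV_KEY);
    if (TOPOLOGY_POLICY_SPREAD.equalsIgnoreCase(requested)) {
      return TOPOLOGY_POLICY_SPREAD;
    }
    return TOPOLOGY_POLICY_PACK;
  }

  public static void main(String[] args) {
    Map<String, String> envs = new HashMap<>();
    envs.put(TOPOLOGY_POLICY_ENV_KEY, TOPOLOGY_POLICY_SPREAD);
    System.out.println(resolvePolicy(envs)); // prints SPREAD
  }
}
```

In the real plugin the environment map would be the `envs` argument passed to `allocateDevices`.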
public DeviceRegisterRequest getRegisterRequestInfo() throws java.lang.Exception
Description copied from interface: DevicePlugin
Called first when the device plugin framework wants to register.
Specified by: getRegisterRequestInfo in interface DevicePlugin
Returns: DeviceRegisterRequest
Throws: java.lang.Exception

public java.util.Set<Device> getDevices() throws java.lang.Exception
Description copied from interface: DevicePlugin
Called when updating node resources.
Specified by: getDevices in interface DevicePlugin
Returns: A set of Device, TreeSet recommended
Throws: java.lang.Exception

public DeviceRuntimeSpec onDevicesAllocated(java.util.Set<Device> allocatedDevices, YarnRuntimeType yarnRuntime) throws java.lang.Exception
Description copied from interface: DevicePlugin
Asks how these devices should be prepared/used before/when the container launches. For instance, a plugin can define a VolumeSpec to let the framework create the volume before running the container.
Specified by: onDevicesAllocated in interface DevicePlugin
Parameters:
allocatedDevices - A set of allocated Device.
yarnRuntime - Indicates which runtime YARN will use. Could be RUNTIME_DEFAULT or RUNTIME_DOCKER in DeviceRuntimeSpec constants. The default means YARN's non-docker container runtime is used. The docker means YARN's docker container runtime is used.
Returns: A DeviceRuntimeSpec description about environment, VolumeSpec, MountVolumeSpec, etc.
Throws: java.lang.Exception

public void onDevicesReleased(java.util.Set<Device> releasedDevices) throws java.lang.Exception
Description copied from interface: DevicePlugin
Called after devices are released.
Specified by: onDevicesReleased in interface DevicePlugin
Parameters:
releasedDevices - A set of released devices
Throws: java.lang.Exception

public java.util.Set<Device> allocateDevices(java.util.Set<Device> availableDevices, int count, java.util.Map<java.lang.String,java.lang.String> envs)
Description copied from interface: DevicePluginScheduler
Called when allocating devices.
Specified by: allocateDevices in interface DevicePluginScheduler
Parameters:
availableDevices - Devices allowed to be chosen from.
count - Number of devices to be allocated.
envs - Environment variables of the container.
Returns: A set of Device allocated

@VisibleForTesting public void initCostTable() throws java.io.IOException
Throws: java.io.IOException

@VisibleForTesting public int computeCostOfDevices(Device[] devices)
@VisibleForTesting public void topologyAwareSchedule(java.util.Set<Device> allocation, int count, java.util.Map<java.lang.String,java.lang.String> envs, java.util.Set<Device> availableDevices, java.util.Map<java.lang.Integer,java.util.List<java.util.Map.Entry<java.util.Set<Device>,java.lang.Integer>>> cTable)
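
The parameters of `topologyAwareSchedule` suggest a lookup in a cost table keyed by the requested device count, where each entry pairs a candidate device set with its cost. The sketch below shows one way such a selection could work. It is not the plugin's implementation: device IDs are plain strings instead of `Device` objects, the class and method names are hypothetical, and it assumes each candidate list is sorted by ascending cost, with the pack policy scanning from the cheapest candidate and the spread policy scanning from the most expensive.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; names and behavior are assumptions.
final class TopologyAwareScheduleSketch {

  /**
   * Picks a candidate set of the requested size whose members are all available.
   * Assumes each list in costTable is sorted by ascending cost.
   * Returns an empty set when no candidate fits.
   */
  static Set<String> schedule(int count, boolean packPolicy,
      Set<String> availableDevices,
      Map<Integer, List<Map.Entry<Set<String>, Integer>>> costTable) {
    List<Map.Entry<Set<String>, Integer>> candidates = costTable.get(count);
    if (candidates == null) {
      return new TreeSet<>();
    }
    int size = candidates.size();
    for (int i = 0; i < size; i++) {
      // Pack scans cheapest first; spread scans most expensive first.
      Map.Entry<Set<String>, Integer> entry =
          candidates.get(packPolicy ? i : size - 1 - i);
      if (availableDevices.containsAll(entry.getKey())) {
        return new TreeSet<>(entry.getKey());
      }
    }
    return new TreeSet<>();
  }

  /** Builds one cost table entry from a cost and a list of device IDs. */
  static Map.Entry<Set<String>, Integer> candidate(int cost, String... ids) {
    Set<String> key = new TreeSet<>(Arrays.asList(ids));
    return new AbstractMap.SimpleEntry<>(key, cost);
  }

  public static void main(String[] args) {
    Map<Integer, List<Map.Entry<Set<String>, Integer>>> costTable = new HashMap<>();
    costTable.put(2, Arrays.asList(
        candidate(1, "GPU0", "GPU1"),
        candidate(1, "GPU2", "GPU3"),
        candidate(4, "GPU0", "GPU2")));
    Set<String> available = new TreeSet<>(Arrays.asList("GPU0", "GPU2", "GPU3"));
    System.out.println(schedule(2, true, available, costTable));  // prints [GPU2, GPU3]
    System.out.println(schedule(2, false, available, costTable)); // prints [GPU0, GPU2]
  }
}
```

In the example, GPU1 is unavailable, so the pack policy skips the first cheap pair and takes the second one.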
@VisibleForTesting public void basicSchedule(java.util.Set<Device> allocation, int count, java.util.Set<Device> availableDevices)
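
`basicSchedule` carries no description on this page. Judging only from its signature, a topology-unaware fallback could look like the sketch below, which fills the `allocation` set from `availableDevices` in iteration order until `count` devices are held. This behavior is an assumption, the class name is hypothetical, and device IDs are plain strings rather than `Device` objects.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; the selection order is an assumption.
final class BasicScheduleSketch {

  /** Adds available devices to allocation until it holds count devices. */
  static void basicSchedule(Set<String> allocation, int count,
      Set<String> availableDevices) {
    for (String device : availableDevices) {
      if (allocation.size() >= count) {
        break;
      }
      allocation.add(device);
    }
  }

  public static void main(String[] args) {
    Set<String> available =
        new TreeSet<>(Arrays.asList("GPU0", "GPU1", "GPU2", "GPU3"));
    Set<String> allocation = new TreeSet<>();
    basicSchedule(allocation, 2, available);
    System.out.println(allocation); // prints [GPU0, GPU1]
  }
}
```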
public void parseTopo(java.lang.String topo,
java.util.Map<java.lang.String,java.lang.Integer> deviceLinkToWeight)
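
The sample topo output in the method summary is a square matrix of link types between GPUs. As a rough illustration of how such a matrix can be read, the sketch below extracts the link type for every GPU pair above the diagonal. It is not the plugin's parser: the class name, the `"row-column"` key format, and the choice to store link labels rather than weights are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; key format and return type are assumptions.
final class ParseTopoSketch {

  /** Maps "i-j" (with i < j) to the link label found in row i, column j. */
  static Map<String, String> parse(String topo) {
    Map<String, String> pairToLink = new LinkedHashMap<>();
    int gpuCount = 0;
    boolean headerSeen = false;
    for (String line : topo.split("\n")) {
      String[] tokens = line.trim().split("\\s+");
      if (!tokens[0].startsWith("GPU")) {
        if (headerSeen) {
          break; // the matrix ended, e.g. at the "Legend:" line
        }
        continue;
      }
      if (!headerSeen) {
        // Header row: count the GPU columns, ignoring "CPU Affinity".
        for (String token : tokens) {
          if (token.startsWith("GPU")) {
            gpuCount++;
          }
        }
        headerSeen = true;
        continue;
      }
      int row = Integer.parseInt(tokens[0].substring(3));
      // tokens[0] is the row label, so column j lives at tokens[j + 1].
      for (int col = row + 1; col < gpuCount && col + 1 < tokens.length; col++) {
        pairToLink.put(row + "-" + col, tokens[col + 1]);
      }
    }
    return pairToLink;
  }

  public static void main(String[] args) {
    String topo = String.join("\n",
        "     GPU0 GPU1 GPU2 GPU3 CPU Affinity",
        "GPU0 X    PHB  SOC  SOC  0-31",
        "GPU1 PHB  X    SOC  SOC  0-31",
        "GPU2 SOC  SOC  X    PHB  0-31",
        "GPU3 SOC  SOC  PHB  X    0-31",
        "Legend:",
        "  X = Self");
    System.out.println(parse(topo));
    // prints {0-1=PHB, 0-2=SOC, 0-3=SOC, 1-2=SOC, 1-3=SOC, 2-3=PHB}
  }
}
```

Only the upper triangle is stored because the matrix is symmetric in the sample.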
@VisibleForTesting public void setPathOfGpuBinary(java.lang.String pOfGpuBinary)
@VisibleForTesting public void setShellExecutor(NvidiaGPUPluginForRuntimeV2.NvidiaCommandExecutor shellExecutor)
@VisibleForTesting public boolean isTopoInitialized()
@VisibleForTesting public java.util.Map<java.lang.Integer,java.util.List<java.util.Map.Entry<java.util.Set<Device>,java.lang.Integer>>> getCostTable()
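
The return type of `getCostTable` maps a device count to a list of (device set, cost) entries. One plausible way to fill such a table is to enumerate every device subset, price it with a cost function, and sort each list by cost. The sketch below does that under stated assumptions: the class name, the restriction to subsets of two or more devices, the bitmask enumeration, and the demo cost function `demoCost` are all hypothetical and not taken from the plugin.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.function.ToIntFunction;

// Illustrative sketch only; works for fewer than 31 devices.
final class CostTableSketch {

  /** Groups all subsets of size >= 2 by size, each list sorted by ascending cost. */
  static Map<Integer, List<Map.Entry<Set<String>, Integer>>> build(
      List<String> devices, ToIntFunction<Set<String>> costOf) {
    Map<Integer, List<Map.Entry<Set<String>, Integer>>> table = new TreeMap<>();
    int n = devices.size();
    for (int mask = 1; mask < (1 << n); mask++) {
      if (Integer.bitCount(mask) < 2) {
        continue;
      }
      Set<String> subset = new TreeSet<>();
      for (int i = 0; i < n; i++) {
        if ((mask & (1 << i)) != 0) {
          subset.add(devices.get(i));
        }
      }
      int cost = costOf.applyAsInt(subset);
      table.computeIfAbsent(subset.size(), k -> new ArrayList<>())
          .add(new AbstractMap.SimpleEntry<>(subset, cost));
    }
    for (List<Map.Entry<Set<String>, Integer>> entries : table.values()) {
      entries.sort(Map.Entry.comparingByValue());
    }
    return table;
  }

  /** Hypothetical cost: 1 per pair sharing a bridge (GPU0/1, GPU2/3), else 4. */
  static int demoCost(Set<String> subset) {
    List<String> ids = new ArrayList<>(subset);
    int cost = 0;
    for (int i = 0; i < ids.size(); i++) {
      for (int j = i + 1; j < ids.size(); j++) {
        int a = ids.get(i).charAt(3) - '0';
        int b = ids.get(j).charAt(3) - '0';
        cost += (a / 2 == b / 2) ? 1 : 4;
      }
    }
    return cost;
  }

  public static void main(String[] args) {
    List<String> devices = Arrays.asList("GPU0", "GPU1", "GPU2", "GPU3");
    System.out.println(build(devices, CostTableSketch::demoCost).get(2));
  }
}
```

Enumerating all subsets grows exponentially with the device count, which is acceptable for the handful of GPUs on a typical node but would need rethinking for larger sets.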
@VisibleForTesting public java.util.Map<java.lang.String,java.lang.Integer> getDevicePairToWeight()
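
`getDevicePairToWeight` returns a map from a device pair to an integer weight, and `computeCostOfDevices` is described as the cost function for a subset of devices. A natural reading is that the cost of a subset is the sum of the weights of all its pairs. The sketch below illustrates that idea only. The `"low-high"` key format, the weights 1 and 4, and the class and method names are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; key format and weights are assumptions.
final class DeviceCostSketch {

  /** Sums the weight of every unordered pair of the given device IDs. */
  static int computeCost(int[] deviceIds, Map<String, Integer> pairToWeight) {
    int cost = 0;
    for (int i = 0; i < deviceIds.length; i++) {
      for (int j = i + 1; j < deviceIds.length; j++) {
        int low = Math.min(deviceIds[i], deviceIds[j]);
        int high = Math.max(deviceIds[i], deviceIds[j]);
        Integer weight = pairToWeight.get(low + "-" + high);
        if (weight == null) {
          throw new IllegalArgumentException(
              "No weight for pair " + low + "-" + high);
        }
        cost += weight;
      }
    }
    return cost;
  }

  /** Hypothetical weights for the sample topology: PHB pairs 1, SOC pairs 4. */
  static Map<String, Integer> sampleWeights() {
    Map<String, Integer> weights = new HashMap<>();
    weights.put("0-1", 1);
    weights.put("2-3", 1);
    weights.put("0-2", 4);
    weights.put("0-3", 4);
    weights.put("1-2", 4);
    weights.put("1-3", 4);
    return weights;
  }

  public static void main(String[] args) {
    Map<String, Integer> weights = sampleWeights();
    System.out.println(computeCost(new int[] {0, 1}, weights));    // prints 1
    System.out.println(computeCost(new int[] {0, 1, 2}, weights)); // prints 9
  }
}
```

With weights like these, a lower total cost corresponds to faster GPU-GPU links, which matches the stated intent of the pack policy.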
Copyright © 2008–2025 Apache Software Foundation. All rights reserved.