Docker GPU Discovery and Mounting

This page explains how GPUs are discovered from within containerized environments, and how these devices are exposed to the grading pipeline.

GPU Discovery

As with other host resources, GPU discovery is performed in DockerDaemonDispatcher when each daemon host is created. However, since Docker does not expose information about the GPUs installed on the host, we need other means to obtain the list of GPUs on a system.

lspci

First, we need to obtain a list of all GPUs present on the host system. Since /sys is exposed in Docker containers, we can use lspci to list the PCI devices on the host system. lspci also has a machine-parsable output format, which helps the Grader identify GPUs among all PCI devices.

We use lspci -Dvmm, where the flags have the following effects (see lspci(8)):

  • -D: Always shows PCI domain numbers, so each device's full domain:bus:device.function address is printed (used later to locate the card/render devices of each GPU)

  • -v: Enables verbose output

  • -mm: Outputs the PCI device data in a machine-readable format

After obtaining the output of lspci, we filter for devices whose class is VGA compatible controller, which yields the list of all GPUs on the system.
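
To make this concrete, here is a minimal sketch of the discovery step in Go. The PCIDevice type and DiscoverGPUs function are hypothetical names for illustration, not the Grader's actual API; only the lspci invocation and the -mm record format (blank-line-separated records of tab-separated "Key: Value" lines) come from the description above.

  // Hypothetical discovery helper; not the Grader's actual API.
  package discovery

  import (
      "os/exec"
      "strings"
  )

  // PCIDevice holds the fields of one lspci -Dvmm record we care about.
  type PCIDevice struct {
      Slot   string // full PCI address, e.g. "0000:01:00.0"
      Class  string
      Vendor string
      Device string
  }

  // DiscoverGPUs runs lspci -Dvmm and keeps records whose class is
  // "VGA compatible controller".
  func DiscoverGPUs() ([]PCIDevice, error) {
      out, err := exec.Command("lspci", "-Dvmm").Output()
      if err != nil {
          return nil, err
      }
      var gpus []PCIDevice
      // -mm output is a series of records separated by blank lines;
      // each line is "Key:<TAB>Value".
      for _, record := range strings.Split(strings.TrimSpace(string(out)), "\n\n") {
          var dev PCIDevice
          for _, line := range strings.Split(record, "\n") {
              key, value, ok := strings.Cut(line, ":\t")
              if !ok {
                  continue
              }
              switch key {
              case "Slot":
                  dev.Slot = value
              case "Class":
                  dev.Class = value
              case "Vendor":
                  dev.Vendor = value
              case "Device":
                  dev.Device = value
              }
          }
          if dev.Class == "VGA compatible controller" {
              gpus = append(gpus, dev)
          }
      }
      return gpus, nil
  }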

DRI Card and Render Devices

In addition to the PCI bus number, the card and render device numbers are required for AMD/Intel GPUs to be used in containers. These are also obtained during the discovery phase, by listing all symlinks under /dev/dri/by-path. Each device is listed in the form pci-${pciBus}-{card,render}, so each GPU can be mapped to its DRI devices by substituting its full PCI bus address for ${pciBus}.

We also resolve each symlink to the actual device node under /dev/dri (e.g. /dev/dri/card0 or /dev/dri/renderD128).
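
As a sketch, resolving the by-path symlinks might look like the following; the DRIDevices helper is a hypothetical name, with filepath.EvalSymlinks doing the resolution to the real nodes under /dev/dri.

  // Hypothetical helper mapping a full PCI address to its DRI nodes.
  package discovery

  import (
      "fmt"
      "path/filepath"
  )

  // DRIDevices resolves the /dev/dri/by-path symlinks for the GPU at
  // pciBus (e.g. "0000:01:00.0") into the real card and render nodes
  // (e.g. /dev/dri/card0 and /dev/dri/renderD128).
  func DRIDevices(pciBus string) (string, string, error) {
      card, err := filepath.EvalSymlinks(
          fmt.Sprintf("/dev/dri/by-path/pci-%s-card", pciBus))
      if err != nil {
          return "", "", err
      }
      render, err := filepath.EvalSymlinks(
          fmt.Sprintf("/dev/dri/by-path/pci-%s-render", pciBus))
      if err != nil {
          return "", "", err
      }
      return card, render, nil
  }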

NVIDIA GPUs

For NVIDIA GPUs, we additionally need to know the device ID of each GPU as seen by the NVIDIA kernel module, since the NVIDIA Container Toolkit uses these internal GPU IDs to determine which GPU to mount into the container.

From some sources, it appears that NVIDIA GPU IDs are enumerated in PCI bus order, so the order of devices output by lspci is sufficient to map each GPU to the device index understood by NVIDIA tools.
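
Under that assumption, the mapping is just a counter over the lspci output, as in this sketch (reusing the hypothetical PCIDevice type from above; matching on the lspci vendor string is likewise an assumption).

  // Sketch only: assumes NVIDIA device IDs follow lspci (PCI) order.
  package discovery

  import "strings"

  // nvidiaIndices maps each NVIDIA GPU's PCI slot to the device index
  // assumed to be used by the NVIDIA tools, in lspci output order.
  func nvidiaIndices(gpus []PCIDevice) map[string]int {
      indices := make(map[string]int)
      next := 0
      for _, gpu := range gpus {
          // Identifying NVIDIA GPUs by vendor string is an assumption.
          if strings.Contains(gpu.Vendor, "NVIDIA") {
              indices[gpu.Slot] = next
              next++
          }
      }
      return indices
  }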

GPU Allocation

GPUs are allocated to runners using the existing resource-allocation logic in DockerDaemonDispatcher.

Mounting to a Container

Once a GPU has been allocated to a runner, DockerPipelineStage is responsible for mounting its devices into the container.

NVIDIA

NVIDIA GPUs use Docker's device request functionality to mount the GPU into the container, passing the device ID of the NVIDIA GPU obtained during discovery.
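
With the Docker Engine API (shown here via the Go SDK as a sketch; the helper name and surrounding wiring are illustrative, not the Grader's code), this corresponds to a DeviceRequest with the nvidia driver and the discovered device ID, roughly the HostConfig equivalent of docker run --gpus "device=<id>".

  // Sketch of an NVIDIA device request using the Docker Go SDK.
  package pipeline

  import "github.com/docker/docker/api/types/container"

  // nvidiaHostConfig asks the NVIDIA Container Toolkit to expose the
  // GPU with the given device ID inside the container.
  func nvidiaHostConfig(deviceID string) *container.HostConfig {
      return &container.HostConfig{
          Resources: container.Resources{
              DeviceRequests: []container.DeviceRequest{{
                  Driver:       "nvidia",
                  DeviceIDs:    []string{deviceID},
                  Capabilities: [][]string{{"gpu"}},
              }},
          },
      }
  }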

AMD/Intel

Other GPUs use Docker's device mapping functionality to mount the GPU into the container. The /dev/dri paths of the GPU (its card and render nodes) are used here, mounted to the same paths in the container. The container is given read/write access to the GPU (it is unknown whether the mknod permission is also required).
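
A sketch of the corresponding device mapping with the Docker Go SDK follows (helper name illustrative). Note the cgroup permission string: "rw" grants read/write without mknod, while "rwm" would also grant mknod.

  // Sketch of mounting DRI nodes via Docker device mappings.
  package pipeline

  import "github.com/docker/docker/api/types/container"

  // driHostConfig mounts the GPU's card and render nodes at the same
  // paths inside the container with read/write access.
  func driHostConfig(cardPath, renderPath string) *container.HostConfig {
      var devices []container.DeviceMapping
      for _, path := range []string{cardPath, renderPath} {
          devices = append(devices, container.DeviceMapping{
              PathOnHost:      path,
              PathInContainer: path,
              // "rw" omits mknod; switch to "rwm" if mknod turns out
              // to be required.
              CgroupPermissions: "rw",
          })
      }
      return &container.HostConfig{
          Resources: container.Resources{Devices: devices},
      }
  }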