Install Docker with NVIDIA GPU support

Mon, 2025-04-28

docker

official doc

# Copied from the official doc
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

Add the user to the docker group

sudo usermod -aG docker <username>

Log out and back in so the new group membership takes effect.
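To confirm that the group change is active after logging back in, a check along these lines may help. It is only a sketch: `has_group` is a hypothetical helper written for this post, not something Docker ships.

```shell
#!/bin/sh
# has_group is a hypothetical helper: it succeeds if GROUP appears in a
# space-separated list of group names.
has_group() {
    list=$1
    group=$2
    for g in $list; do
        [ "$g" = "$group" ] && return 0
    done
    return 1
}

# id -nG prints the group names of the current session, separated by spaces.
if has_group "$(id -nG)" docker; then
    echo "docker group is active in this session"
else
    echo "docker group not active yet; log out and back in"
fi
```

The check compares whole names, so a group such as `dockerx` would not be mistaken for `docker`.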

add proxy

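sudo systemctl edit docker

In the editor that opens, add:

[Service]
Environment="HTTP_PROXY=http://localhost:8118"
Environment="HTTPS_PROXY=http://localhost:8118"

Then restart the daemon so it picks up the proxy:

sudo systemctl restart docker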
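The same settings can also be written non-interactively as a systemd drop-in file. The sketch below assumes the conventional drop-in directory for docker.service; `write_proxy_dropin` and the file name `http-proxy.conf` are illustrative choices, and the proxy URL is the one used above.

```shell
#!/bin/sh
# write_proxy_dropin is a hypothetical helper: it writes a systemd drop-in
# with the proxy settings into DIR (default: the docker.service drop-in dir).
write_proxy_dropin() {
    proxy=$1
    dir=${2:-/etc/systemd/system/docker.service.d}
    mkdir -p "$dir" || return 1
    cat > "$dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy"
Environment="HTTPS_PROXY=$proxy"
EOF
}

# Example (run as root), followed by a reload so the daemon sees the change:
#   write_proxy_dropin http://localhost:8118
#   systemctl daemon-reload && systemctl restart docker
#   systemctl show --property=Environment docker
```

The last commented command prints the environment systemd will pass to the daemon, which is a quick way to see whether the proxy variables were picked up.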

NVIDIA

This has two parts: the driver and the NVIDIA Container Toolkit.

A good starting point is the Container Toolkit's official doc.

driver

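The first section of the Container Toolkit doc links to the CUDA installation doc. We don't need CUDA on the host, so click 4. Driver Installation in the sidebar on the right, which leads straight to the Tesla driver installation doc.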

Download the keyring first

# e.g. <distro>/<arch> = ubuntu2204/x86_64
wget https://developer.download.nvidia.com/compute/cuda/repos/<distro>/<arch>/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
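The `<distro>/<arch>` placeholders can be filled in by a small script. This is a sketch that assumes Ubuntu on x86_64, where the repo name is the os-release ID followed by VERSION_ID without the dot; `cuda_repo_distro` and `cuda_keyring_url` are hypothetical helpers, and other distros or architectures may use different names.

```shell
#!/bin/sh
# cuda_repo_distro is a hypothetical helper: it turns the os-release ID and
# VERSION_ID (e.g. "ubuntu" and "22.04") into the repo name "ubuntu2204".
cuda_repo_distro() {
    printf '%s%s\n' "$1" "$(printf '%s' "$2" | tr -d '.')"
}

# cuda_keyring_url fills the <distro>/<arch> placeholders in the download URL.
cuda_keyring_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/%s/cuda-keyring_1.1-1_all.deb\n' "$1" "$2"
}

# Print the URL for the ubuntu2204/x86_64 case mentioned above.
cuda_keyring_url "$(cuda_repo_distro ubuntu 22.04)" x86_64
```

On a real host the two arguments could come from `. /etc/os-release` (`$ID`, `$VERSION_ID`) and `uname -m`, under the same assumption about naming.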

To choose between the open-source and the proprietary driver, see 5. Kernel Modules

The newer option is the open-source driver:

sudo apt-get update
sudo apt-get install nvidia-open

sudo reboot
# after the reboot, check that the driver is loaded
nvidia-smi
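If `nvidia-smi` reports that it cannot talk to the driver, it can help to check whether the kernel module is loaded at all. A minimal sketch, assuming the module is named `nvidia` and that `/proc/modules` is readable; `module_loaded` is a hypothetical helper.

```shell
#!/bin/sh
# module_loaded is a hypothetical helper: it succeeds if a line in FILE
# (default /proc/modules) starts with the exact module name NAME.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded nvidia; then
    echo "nvidia kernel module is loaded"
else
    echo "nvidia kernel module is not loaded; a reboot may be needed"
fi
```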

NVIDIA Container Toolkit

official doc

# Copied from the official doc
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Optional: enable the experimental packages
sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

nvidia-container-runtime

# Copied from the official doc
sudo nvidia-ctk runtime configure --runtime=docker
# This updates /etc/docker/daemon.json; restart Docker to apply it
sudo systemctl restart docker

btrfs

/etc/docker/daemon.json

{
    "storage-driver": "btrfs"
}

results

/etc/docker/daemon.json

{
    "storage-driver": "btrfs",
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
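A quick way to confirm that the runtime entry made it into the file is a text check like the one below. It is a rough sketch, not a JSON parser: `has_nvidia_runtime` is a hypothetical helper that only looks for the `path` entry shown above.

```shell
#!/bin/sh
# has_nvidia_runtime is a hypothetical helper: a rough text check (not a JSON
# parser) that FILE registers nvidia-container-runtime as a runtime path.
has_nvidia_runtime() {
    grep -Eq '"path"[[:space:]]*:[[:space:]]*"nvidia-container-runtime"' "$1"
}

file=/etc/docker/daemon.json
if [ -r "$file" ] && has_nvidia_runtime "$file"; then
    echo "nvidia runtime is configured in $file"
else
    echo "nvidia runtime not found in $file"
fi
```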

# Copied from the official doc
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

--runtime=nvidia selects the NVIDIA runtime, which is what passes the GPUs through to the container.

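Unless otherwise noted, all posts on this blog are original.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.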