The purpose of the previous post, Feeding an NVIDIA GPU to k3s on Proxmox, was to make this post possible. This post is about how I got hardware-accelerated transcoding working in a k3s cluster using an NVIDIA GPU.
Environment
- GPU: NVIDIA RTX 5060
- Kubernetes: v1.32.6+k3s1
- NVIDIA driver version: 570.158.01
- NVIDIA GPU Operator version: v25.3.2
Initial attempt
With NVIDIA drivers and GPU Operator in place, I thought I could just run ffmpeg in a container to utilize hardware-accelerated transcoding. I was wrong.
This is the container I used to test it out:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nvidia-test
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nvidia-test
  template:
    metadata:
      labels:
        app: nvidia-test
    spec:
      runtimeClassName: nvidia
      containers:
        - name: nvidia-test
          image: nvidia/cuda:12.0.0-base-ubuntu22.04
          imagePullPolicy: IfNotPresent
          command:
            - sleep
            - "3600"
          resources:
            limits:
              nvidia.com/gpu: "1"
      restartPolicy: Always
```
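To deploy and sanity-check it, something like the following should work. This is a sketch assuming the `test` namespace already exists and the manifest above is saved as `nvidia-test.yaml` (both names are just my choices for this walkthrough):

```shell
# Apply the manifest and wait for the rollout to finish
kubectl apply -f nvidia-test.yaml
kubectl -n test rollout status deployment/nvidia-test

# Confirm the pod actually sees the GPU before testing ffmpeg
kubectl -n test exec deploy/nvidia-test -- nvidia-smi
```

If `nvidia-smi` fails at this point, the problem is with the GPU Operator or runtime class setup, not with ffmpeg.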
Once the pod was running, I copied a test file into it and tried to transcode:
```shell
kubectl cp input.mp4 nvidia-test-774694874d-p4226:/root/input.mp4
kubectl exec -it nvidia-test-774694874d-p4226 -- bash
apt update && apt install -y ffmpeg
cd
ffmpeg -y -hwaccel cuda -i input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
```
And that got ffmpeg yelling at me:
```
[h264 @ 0x5b2e39aad740] Cannot load libnvcuvid.so.1
[h264 @ 0x5b2e39aad740] Failed loading nvcuvid.
[h264 @ 0x5b2e39aad740] Failed setup for format cuda: hwaccel initialisation returned error.
[h264_nvenc @ 0x5b2e386c3080] Cannot load libnvidia-encode.so.1
[h264_nvenc @ 0x5b2e386c3080] The minimum required Nvidia driver for nvenc is (unknown) or newer
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!
```
It seemed the container was missing two libraries, libnvcuvid.so.1 and libnvidia-encode.so.1, which handle video decoding and encoding respectively.
With some prompt engineering, I figured out what I needed: the libnvidia-encode-570-server package had to be installed for hardware-accelerated encoding to work, and since I had installed all the drivers on the worker node VM, that's where it needed to go.
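The exact package name depends on the driver series and Ubuntu flavor in use, so it's worth searching for it on the node first. A sketch of how I'd look it up (the `570` filter matches my 570-series server driver and is an assumption for other setups):

```shell
# List NVENC-related packages, filtered to the installed driver series;
# pick the one matching the driver version reported by nvidia-smi
apt-cache search libnvidia-encode | grep 570
```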
Making the node happy
Installing the library was easy enough:
```shell
apt install -y libnvidia-encode-570-server
```
But when I restarted the pod and ran ffmpeg again, it yelled at me as if nothing had changed:
```
[h264 @ 0x5b2e39aad740] Cannot load libnvcuvid.so.1
...
```
Since restarting the GPU Operator didn't help either, I decided to run the same test directly on the worker node.
And there it worked, blazingly fast:
```shell
$ ffmpeg -y -hwaccel cuda -i input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
...
frame= 1259 fps=1249 q=19.0 Lsize= 9985kB time=00:00:41.83 bitrate=1955.3kbits/s speed=41.5x
```
So, it works on the node, but not in the pod. Time for some debugging.
Making the pod happy
Now with only the container yelling at me, I started to compare the happy and unhappy environments.
Node:
```shell
jy@gwork01:~$ ldconfig -p | grep nvidia
	libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
	libnvidia-ptxjitcompiler.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so
	libnvidia-pkcs11.so.570.158.01 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-pkcs11.so.570.158.01
	libnvidia-pkcs11-openssl3.so.570.158.01 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.570.158.01
	libnvidia-opticalflow.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1
	libnvidia-opticalflow.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-opticalflow.so
	libnvidia-opencl.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-opencl.so.1
	libnvidia-nvvm.so.4 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
	libnvidia-nvvm.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-nvvm.so
	libnvidia-ml.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-ml.so.1
	libnvidia-ml.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-ml.so
	libnvidia-encode.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-encode.so.1
	libnvidia-encode.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-encode.so
	libnvidia-cfg.so.1 (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-cfg.so.1
	libnvidia-cfg.so (libc6,x86-64) => /lib/x86_64-linux-gnu/libnvidia-cfg.so
```
Container:
```shell
root@nvidia-test-8575cb6df5-zvvrb:/# ldconfig -p | grep nvidia
	libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
	libnvidia-pkcs11.so.570.158.01 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11.so.570.158.01
	libnvidia-pkcs11-openssl3.so.570.158.01 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.570.158.01
	libnvidia-opencl.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
	libnvidia-nvvm.so.4 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
	libnvidia-ml.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
	libnvidia-cfg.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1
```
Clearly, the libraries related to encoding were missing in the container, but I didn’t know why.
So I started some vigorous prompt engineering, Google searches, and documentation reading, looking for the thing that would make it happy. But like that cat at my friend's house, no matter what I tried, it just wouldn't give a damn.
That was until I stumbled across this magical catnip: running ffmpeg with nvenc inside nvidia docker.
Apparently, there is an environment variable called NVIDIA_DRIVER_CAPABILITIES that controls which driver libraries are made available inside the container, and it has to be set on the container running ffmpeg. According to the official NVIDIA Container Toolkit documentation, Specialized Configurations with Docker, the default value is compute,utility, and video needs to be added to it for the video encoding libraries to be available inside the container.
I double-checked that the environment variable NVIDIA_DRIVER_CAPABILITIES was indeed set to the default value in the container:
```shell
$ echo $NVIDIA_DRIVER_CAPABILITIES
compute,utility
```
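Outside Kubernetes, the same behavior can be reproduced with plain Docker and the NVIDIA Container Toolkit, which is a quick way to confirm that it's the capability flag, not the cluster, that gates the libraries. A sketch, assuming the toolkit's nvidia runtime is configured on the host:

```shell
# Default capabilities: the decode/encode libraries are absent,
# so the grep finds nothing (|| true keeps the exit code clean)
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  nvidia/cuda:12.0.0-base-ubuntu22.04 \
  sh -c 'ldconfig -p | grep -E "nvcuvid|nvidia-encode" || true'

# With the video capability added, both libraries appear
docker run --rm --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
  nvidia/cuda:12.0.0-base-ubuntu22.04 \
  sh -c 'ldconfig -p | grep -E "nvcuvid|nvidia-encode"'
```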
and then updated my deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nvidia-test
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nvidia-test
  template:
    metadata:
      labels:
        app: nvidia-test
    spec:
      runtimeClassName: nvidia
      containers:
        - name: nvidia-test
          image: nvidia/cuda:12.0.0-base-ubuntu22.04
          imagePullPolicy: IfNotPresent
          command:
            - sleep
            - "3600"
          env:
            - name: NVIDIA_DRIVER_CAPABILITIES
              value: "compute,utility,video"
          resources:
            limits:
              nvidia.com/gpu: "1"
      restartPolicy: Always
```
After applying the changes, I confirmed that the encoding libraries now existed in the container:
```shell
$ kubectl exec -it nvidia-test-774694874d-p4226 -- bash
# ldconfig -p | grep nvidia
...
	libnvidia-encode.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1
...
```
This was the moment of truth:
```shell
ffmpeg -y -hwaccel cuda -i input.mp4 -vf format=yuv420p -c:v h264_nvenc output.mp4
```
And it worked!
ffmpeg happily told me how it went:
```
frame= 1415 fps=1247 q=34.0 Lsize= 5969kB time=00:00:23.59 bitrate=2072.3kbits/s speed=20.8x
video:5929kB audio:7kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.550812%
[aac @ 0x5ba88a060a00] Qavg: 65536.000
```
And I could even see the GPU activity from the worker node:
```shell
$ nvidia-smi
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.158.01        Driver Version: 570.158.01        CUDA Version: 12.8       |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name             Persistence-M     | Bus-Id        Disp.A   | Volatile Uncorr. ECC |
| Fan  Temp  Perf       Pwr:Usage/Cap     | Memory-Usage           | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5060        Off | 00000000:01:00.0  Off  |                  N/A |
|  0%   37C    P1             34W / 145W  | 168MiB / 8151MiB       |      8%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     68572      C   ffmpeg                                        159MiB |
+-----------------------------------------------------------------------------------------+
```
I'm not going to post all the details here, but I tried some container images other than NVIDIA's, and they all worked as long as:
- the NVIDIA_DRIVER_CAPABILITIES environment variable was set to compute,utility,video
- the container was run with the nvidia runtime class
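Put together, a quick smoke test for any image boils down to checking those two conditions from inside the pod. A sketch against my test deployment:

```shell
# 1. The capability flag should include "video"
kubectl -n test exec deploy/nvidia-test -- sh -c 'echo $NVIDIA_DRIVER_CAPABILITIES'

# 2. The encode library should be resolvable inside the container
kubectl -n test exec deploy/nvidia-test -- sh -c 'ldconfig -p | grep nvidia-encode'
```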
Conclusion
The fact that the NVIDIA_DRIVER_CAPABILITIES environment variable has to be set on the container running ffmpeg was a bit counter-intuitive to me, and it isn't well documented, at least in the NVIDIA GPU Operator documentation. Maybe it's a niche use case, maybe I overlooked some things, but it was a pain to figure out.
Once I learned how these pieces play together, though, running hardware-accelerated transcoding in k3s with an NVIDIA GPU is pretty straightforward.