Proxmox GPU passthrough sounds like magic. Take a physical graphics card, hand it directly to a virtual machine, and the VM runs with near-native performance: gaming VMs that feel real, Plex transcoding at hardware speed, AI workloads on consumer GPUs without dual-boot. The marketing version is clean.
The operational reality is the most hardware-dependent feature in the entire Proxmox stack. IOMMU groups that share a GPU with an unrelated PCIe device. Vendor reset bugs that brick the card until host reboot. NVIDIA Code 43 from older drivers. BIOS settings that don’t exist on the consumer board you bought. AMD Granite Ridge iGPUs that hang at “Loading Initial ramdisk.” Single-GPU systems where giving away the card means losing console access too.
This is a decision guide. It will not walk through the GUI clicks to enable passthrough. Proxmox documentation and dedicated tutorials handle that well. It covers the operational reality: when GPU passthrough earns its complexity budget, when the hardware lottery isn’t on your side, and which failure modes consistently appear in forum reports through 2026.
For transcoding-only workloads, an LXC container with /dev/dri is usually the lower-friction choice that delivers the same hardware acceleration without VFIO complexity.
Best for most homelabs: Only attempt GPU passthrough if you have a dedicated secondary GPU for the host (or onboard graphics that work), modern hardware with verified IOMMU isolation, and tolerance for hardware-specific debugging.
GPU passthrough is worth it for:
- Plex/Jellyfin hardware transcoding at scale (5+ simultaneous transcodes)
- AI/ML model training or inference on a dedicated GPU
- Gaming VM with a discrete GPU separate from the host’s display
- CAD workstation virtualization on Quadro/Radeon Pro hardware
Should you attempt GPU passthrough? Decision matrix
| Scenario | Attempt passthrough? | Better alternative |
|---|---|---|
| Single-GPU homelab (no host backup video) | No | Use the GPU on the host, share via VirGL or LXC |
| Single-NIC mini PC with iGPU only | Risky | Pass through iGPU; keep SSH-only host management |
| Mini PC with iGPU + discrete GPU via OcuLink | Yes | Cleanest mini PC passthrough pattern |
| Tower with 2 GPUs (primary + passthrough) | Yes | Standard homelab pattern |
| Plex/Jellyfin transcoding only | Usually no | LXC with /dev/dri device passthrough is simpler |
| Multiple VMs sharing one GPU | No (use mediated) | NVIDIA vGPU on professional cards; not for consumer GPUs |
| AMD GPU on Granite Ridge or older Vega | Caution | Reset bug; budget setup time for workarounds |
| NVIDIA consumer 30/40-series to Windows VM | Yes | Drivers 465+ resolved the old Code 43 issue |
| NVIDIA RTX 5090 / Blackwell consumer | Caution | Active reset bug in 2025-2026; not yet resolved |
| HA cluster member with GPU-bound VM | Usually no | Mediated devices (vGPU) are the exception, not the rule |
This is the fast-scan version of the rest of this article. The reasoning behind each row follows below.
- GPU passthrough requires CPU + motherboard IOMMU support (Intel VT-d or AMD-Vi) and clean IOMMU group isolation; the latter is hardware lottery
- VFIO must claim the GPU at boot before host drivers (nvidia, nouveau, amdgpu) load
- AMD reset bug remains real in 2026; some generations need vendor-reset module, some need hookscripts, some Granite Ridge iGPUs still hang
- NVIDIA Code 43 was fixed in drivers 465+, but recent Blackwell consumer GPUs (RTX 5090, RTX PRO 6000) have introduced a new reset bug as of late 2025
- Single-GPU passthrough means giving up the host’s only display output; plan SSH or IPMI access first
- Full GPU passthrough and HA are mostly incompatible; mediated devices (NVIDIA vGPU) are the narrow exception, supported by Proxmox VE 8.4 live migration
- For Plex/Jellyfin transcoding alone, LXC with /dev/dri is simpler than full VFIO and delivers the same hardware acceleration
What Proxmox GPU passthrough actually does (and doesn’t)
GPU passthrough uses VFIO (Virtual Function I/O) to detach a physical PCIe device from the host kernel and bind it to a special driver that hands control to a virtual machine. The VM sees the GPU as if it were installed in a bare-metal system. There’s no emulation layer, no driver translation, no software bottleneck.
Three layers must align for it to work:
- Hardware layer: CPU with Intel VT-d or AMD-Vi support, motherboard with IOMMU enabled in BIOS, GPU in its own IOMMU group (or grouped only with its own audio function and bridges)
- Kernel layer: IOMMU enabled at boot, VFIO modules loaded, GPU drivers blacklisted on the host
- VM layer: q35 machine type, OVMF (UEFI) firmware preferred, PCIe passthrough configuration in /etc/pve/qemu-server/<VMID>.conf
The official Proxmox VE documentation on PCI(e) passthrough is the authoritative reference for the mechanics.
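The kernel-layer items usually come down to a few lines of modprobe configuration. The following is a minimal sketch, not a definitive setup: the function name render_vfio_conf is hypothetical, the vendor:device IDs in the example call are placeholders rather than values from this article, and the function only prints a candidate file for review instead of writing to /etc/modprobe.d/.

```shell
#!/bin/sh
# Sketch: print a candidate /etc/modprobe.d/vfio.conf for review.
# Argument: comma-separated vendor:device IDs (placeholders in the example call).
render_vfio_conf() {
    ids="$1"
    # Tell vfio-pci which devices to claim
    printf 'options vfio-pci ids=%s\n' "$ids"
    # Ask modprobe to load vfio-pci before each host GPU driver
    for drv in nvidia nouveau amdgpu radeon; do
        printf 'softdep %s pre: vfio-pci\n' "$drv"
    done
}

# Example call with placeholder IDs (GPU function and its audio function)
render_vfio_conf "10de:aaaa,10de:bbbb"
```

After such a file is actually written, the initramfs has to be regenerated and the host rebooted before the binding takes effect.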
What Proxmox GPU passthrough does not do:
- It does not share a GPU across multiple VMs. PCIe passthrough is exclusive. One GPU, one VM at a time. Sharing requires mediated device support (NVIDIA vGPU on professional cards), which is a different operational model covered briefly below.
- It does not work with Hyper-V or VMware Workstation on the same host. Once VFIO claims the GPU, it’s gone from the host’s perspective until reboot or explicit unbind.
- It does not survive most HA failover scenarios. A VM pinned to a host because of GPU hardware cannot fail over to a node without that exact hardware. Mediated devices are the narrow exception (see below).
- It does not eliminate the need for guest drivers. The VM still needs the correct NVIDIA/AMD/Intel drivers installed inside the guest OS, just like a bare-metal install would.
The mental model worth keeping: Proxmox GPU passthrough is hardware reassignment, not hardware sharing. Whatever VM owns the GPU owns it exclusively until released.
The complexity budget of Proxmox GPU passthrough
Of all the features in Proxmox, GPU passthrough has the largest hidden complexity budget. Setup costs vary wildly. Ongoing costs include:
- Hardware-specific debugging. Generic guides only get you so far. Specific GPU model, motherboard firmware, and CPU generation combinations have unique quirks documented across forum threads.
- BIOS firmware fragility. A motherboard firmware update can change IOMMU group layouts. Forum reports describe working passthrough setups breaking after routine BIOS updates.
- Kernel update fragility. Major kernel updates occasionally change VFIO behavior. The vendor-reset module needs DKMS to rebuild against new kernels; if it fails silently, the AMD reset bug returns without warning.
- Reset state management. AMD GPUs frequently can’t be cleanly released back to the host after VM shutdown. The workarounds (vendor-reset module, hookscripts that rebind drivers) themselves become moving parts in the system.
- No GUI console after passthrough. The Proxmox web interface cannot display the framebuffer of a passed-through GPU via NoVNC or SPICE. You need a real monitor connected to the GPU, or remote desktop into the guest VM.
This is the complexity budget framing applied to GPU passthrough: every benefit comes with ongoing operational cost. Where HA’s complexity is networking-driven and benefits from careful network design, passthrough’s complexity is hardware-driven, and hardware doesn’t get patched the way software does.
The honest question to ask before attempting passthrough: is the workload genuinely incapable of running on shared host resources, or am I attempting passthrough because it’s a homelab challenge?
Expected setup time
Complexity budget is a concept; setup time is its measurable manifestation. Community reports converge on these rough timeframes for different passthrough patterns:
| Scenario | Expected setup time |
|---|---|
| Intel QuickSync in LXC container with /dev/dri | 15-30 minutes |
| Clean dual-GPU passthrough on supported hardware | 2-6 hours |
| AMD reset bug workaround setup (vendor-reset, hookscripts) | 1-3 days |
| Single-GPU gaming passthrough (host driver unbind/rebind cycles) | Recurring weekend project |
| Mediated devices (NVIDIA vGPU on professional cards) | Days, plus licensing process |
These ranges assume the operator has Linux administration experience and reads through Proxmox documentation properly. Time spent debugging hardware-specific quirks (motherboard BIOS oddities, IOMMU group exceptions) is not included; it can multiply these numbers significantly.
The pattern worth noticing: the cheapest path operationally (LXC /dev/dri) is also the cheapest in time. The most ambitious path (single-GPU gaming) never really completes; it becomes ongoing maintenance.
Proxmox GPU passthrough vs LXC /dev/dri
For many homelab workloads marketed as “GPU passthrough scenarios,” full VFIO passthrough is operational overkill. An LXC container with /dev/dri device passthrough delivers the same hardware acceleration with significantly less complexity.
The comparison that matters:
| Aspect | Full VFIO passthrough | LXC with /dev/dri |
|---|---|---|
| Setup complexity | High (IOMMU, VFIO, blacklists, kernel params) | Low (add device line to container config) |
| GPU exclusivity | One VM, exclusive | Shared across host and multiple containers |
| Reset bug exposure | High (AMD, recent Blackwell) | None (host driver handles state) |
| Console access trade-off | Lose host display | Host keeps display |
| Guest OS flexibility | Any (Linux, Windows, BSD) | Linux only (Plex, Jellyfin, Frigate in containers) |
| HA compatibility | Largely incompatible | Compatible (container migration possible) |
| Hardware acceleration use cases | Gaming, AI/ML, CAD, exclusive workloads | Transcoding, AI inference, video pipelines |
| Setup time | Hours to weeks | Minutes |
The decision rule of thumb: if your workload doesn’t need the entire GPU to itself, doesn’t need a Windows guest, and doesn’t need DirectX-class API access, LXC /dev/dri is almost always the better path.
Typical LXC /dev/dri use cases:
- Plex Media Server transcoding (Intel QuickSync or NVIDIA NVENC)
- Jellyfin transcoding
- Frigate AI camera detection
- Local LLM inference with smaller models (when not needing dedicated VRAM)
- Hardware-accelerated video processing pipelines
When LXC is not enough:
- Gaming VMs (need Windows + DirectX + exclusive GPU)
- Heavy CUDA workloads requiring full VRAM allocation
- CAD workstations with professional graphics drivers
- Anti-cheat-protected games detecting container environments
- Workloads needing PCIe-level GPU features (specific NVIDIA enterprise features)
For most homelab Plex/Jellyfin setups, the LXC route saves days of configuration and eliminates entire categories of failure. Operators who default to full VFIO passthrough for transcoding workloads often discover after weeks of debugging that LXC would have worked the first time.
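To give a sense of how small the LXC route is, here is a hedged sketch that prints (but does not run) the pct commands for exposing the host's render nodes to a container. It assumes a Proxmox VE release recent enough to support dev[n] device entries on containers. The function name, the container ID 101, and the gid 104 are placeholders; the gid has to match whichever group owns the render node inside your container.

```shell
#!/bin/sh
# Sketch: print pct commands that would expose DRM render nodes to a container.
# Arguments: container ID, directory holding the nodes, gid to assign (all placeholders).
print_dri_commands() {
    ctid="$1"; dri_dir="$2"; gid="$3"
    n=0
    for node in "$dri_dir"/renderD*; do
        [ -e "$node" ] || continue          # glob matched nothing
        printf 'pct set %s --dev%s %s,gid=%s\n' "$ctid" "$n" "$node" "$gid"
        n=$((n + 1))
    done
}

# Example call; prints nothing on a machine without /dev/dri render nodes
print_dri_commands 101 /dev/dri 104
```

Printing the commands first makes it easy to review the device paths and gid before changing a container's configuration.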
IOMMU groups: the hardware lottery
IOMMU (Input-Output Memory Management Unit) is the hardware feature that isolates PCIe devices from each other for memory access. It groups devices that share underlying buses, root complexes, or PCIe switches. The Proxmox VE wiki explicitly states the constraint: a device can only be passed through if it’s in its own IOMMU group, or if all devices in the group are passed through together.
This is where most passthrough attempts fail before they even start.
Why IOMMU grouping matters
Each PCIe device sits on a chain: device → bridge → root port → CPU. When ACS (Access Control Services) is supported and enabled, the IOMMU can isolate each device individually. When ACS is missing or disabled, the IOMMU treats the whole branch as one inseparable group.
Consumer motherboards frequently bundle multiple devices into one IOMMU group:
- A GPU grouped with the motherboard’s primary network controller
- A GPU grouped with the chipset’s USB hub
- Multiple PCIe slots sharing one root port and therefore one group
- A discrete GPU’s audio function grouped separately from the GPU itself
If your GPU shares a group with your network card, passing through the GPU means passing through the network card too, taking down host networking. Selective passthrough becomes impossible.
How to check IOMMU groups
The canonical command is:
find /sys/kernel/iommu_groups/ -type l | sort -V
This lists every device by IOMMU group. A clean passthrough candidate looks like this:
/sys/kernel/iommu_groups/14/devices/0000:01:00.0 GPU
/sys/kernel/iommu_groups/14/devices/0000:01:00.1 GPU audio
GPU and audio in one group is normal and acceptable; they’re functions of the same physical card and pass through together as expected.
A blocked passthrough candidate looks like this:
/sys/kernel/iommu_groups/14/devices/0000:01:00.0 GPU
/sys/kernel/iommu_groups/14/devices/0000:01:00.1 GPU audio
/sys/kernel/iommu_groups/14/devices/0000:02:00.0 Ethernet controller
/sys/kernel/iommu_groups/14/devices/0000:03:00.0 SATA controller
If your GPU shares an IOMMU group with non-GPU devices, you have three options, none of them good:
- Move the GPU to a different PCIe slot. Sometimes a different physical slot lands in a different group. Hit or miss.
- Check for a BIOS update. Some vendors fix IOMMU grouping in firmware updates.
- Use the ACS override patch. This is a kernel patch that fakes ACS support and forces device-level isolation. It works but weakens IOMMU security guarantees, acceptable for a homelab, not acceptable for any setup where guest VMs are not fully trusted.
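The group check described above can be scripted so that it answers one question directly: which other devices share a group with my GPU? This is a rough sketch under stated assumptions. The function names are hypothetical, "same card" is approximated as "same domain:bus:device", and bridges that legitimately sit in the group are reported too and need manual judgment.

```shell
#!/bin/sh
# Sketch: report devices that share an IOMMU group with a given PCI address.
# IOMMU_ROOT is overridable so the logic can be tried against a test directory.
IOMMU_ROOT="${IOMMU_ROOT:-/sys/kernel/iommu_groups}"

# Print "group address" for every device, one per line.
list_iommu_devices() {
    for dev in "$IOMMU_ROOT"/*/devices/*; do
        [ -e "$dev" ] || continue           # glob matched nothing
        group="${dev%/devices/*}"           # strip "/devices/<addr>"
        printf '%s %s\n' "${group##*/}" "${dev##*/}"
    done
}

# Print group members that are not functions of the same card as $1.
foreign_group_members() {
    target="$1"
    slot="${target%.*}"                     # address without ".function"
    tgroup="$(list_iommu_devices | awk -v a="$target" '$2 == a { print $1 }')"
    [ -n "$tgroup" ] || return 1            # address not found in any group
    list_iommu_devices | awk -v g="$tgroup" -v s="$slot" \
        '$1 == g && index($2, s ".") != 1 { print $2 }'
}

# On a host with IOMMU enabled, show the raw group listing
if [ -d "$IOMMU_ROOT" ]; then
    list_iommu_devices
fi
```

Calling foreign_group_members 0000:01:00.0 (a placeholder address) and getting no output corresponds to the clean case; any address it prints is a device that would have to be passed through along with the GPU.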
Reported pattern: AMD Ryzen mini PCs vs Intel N-series
A pattern that appears consistently across community reports through 2026: AMD Ryzen Zen 4 mini PCs (Beelink SER series, Minisforum UM790, GMKtec K-series) tend to have cleaner IOMMU group separation than Intel N-series (N100/N150/N305) mini PCs. The Radeon 780M iGPU on Ryzen typically lands in its own group or shares only with closely related devices. The Intel iGPU on N-series often shares groups with chipset devices, making passthrough partial or impossible without ACS override.
This is hardware-specific behavior, not a vendor preference. Always verify IOMMU groups on your specific motherboard before assuming passthrough will work. Treat IOMMU group layout as a hardware purchase criterion, not a software configuration problem.
Hardware and BIOS prerequisites
Before touching Proxmox configuration, verify the hardware foundation:
CPU and chipset
- Intel: CPU with VT-d support (most Core i5/i7/i9 from 6th gen onwards, all Xeon, recent Pentium/Celeron). Enable VT-d (sometimes called “Intel Virtualization for Directed I/O”) in BIOS.
- AMD: CPU with AMD-Vi (all Ryzen, Threadripper, EPYC). Enable IOMMU in BIOS (sometimes auto-enabled, sometimes hidden under “Advanced” or “NB Configuration”).
BIOS settings checklist
These settings vary by manufacturer, but the following items consistently matter:
- IOMMU / VT-d / AMD-Vi: Enabled
- SR-IOV: Enabled (if available)
- Above 4G Decoding: Enabled (required for modern GPUs)
- Resizable BAR: Enabled (newer GPUs)
- Secure Boot: Disabled, at least during setup (can re-enable later with extra configuration)
- CSM (Legacy boot): Disabled, UEFI-only
Verify IOMMU is actually active
After BIOS configuration and Proxmox boot, check kernel messages:
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
Intel should show DMAR: IOMMU enabled. AMD should show AMD-Vi: Supported feature. No output means IOMMU is not active. Return to BIOS, do not proceed.
Also verify interrupt remapping:
dmesg | grep -i remapping
This should show DMAR-IR: Enabled IRQ remapping (Intel) or AMD-Vi: Interrupt remapping enabled (AMD). Without interrupt remapping, passthrough fails with “Operation not permitted” or “Interrupt Remapping hardware not found.” Hardware without interrupt remapping can be forced with allow_unsafe_interrupts=1 (an option of the vfio_iommu_type1 module), but this is a security compromise documented as such by Proxmox.
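Both checks can be folded into one summary line. The sketch below is illustrative only. The function name iommu_summary is hypothetical, it matches just the two interrupt-remapping messages quoted above, and the presence of DMAR or AMD-Vi lines is only a hint that the IOMMU is active, not proof.

```shell
#!/bin/sh
# Sketch: summarize IOMMU-related kernel messages read from stdin.
# Typical use on a host, as root: dmesg | iommu_summary
iommu_summary() {
    log="$(cat)"
    iommu=no
    remap=no
    # Any DMAR / IOMMU / AMD-Vi line counts as "IOMMU messages present"
    if printf '%s\n' "$log" | grep -q -e DMAR -e IOMMU -e AMD-Vi; then
        iommu=yes
    fi
    # Match only the two vendor messages that report remapping as enabled
    if printf '%s\n' "$log" | grep -Eqi 'enabled irq remapping|interrupt remapping enabled'; then
        remap=yes
    fi
    printf 'iommu_messages=%s interrupt_remapping=%s\n' "$iommu" "$remap"
}
```

A result of iommu_messages=no is the "return to BIOS" case. interrupt_remapping=no on otherwise working hardware is where the allow_unsafe_interrupts trade-off comes up.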
The vendor-specific reality
NVIDIA: old Code 43 solved, new Blackwell reset bug emerged
The notorious Code 43 problem, where NVIDIA consumer drivers detected virtualization and refused to initialize, was effectively resolved in NVIDIA driver version 465 and later. Guides that still show hypervisor-hiding tricks (kvm=off, hv_vendor_id=proxmox) are leftovers from 2020-2022-era documentation. If you’re on a current NVIDIA driver with RTX 30-series or 40-series consumer hardware, you likely don’t need any of them.
What still matters for NVIDIA passthrough:
- Use OVMF (UEFI) firmware on the VM when possible. OVMF gives best compatibility for modern GPUs, but if your GPU ROM is not UEFI-capable, SeaBIOS may still be required.
- Use q35 machine type
- Use PCIe (not legacy PCI) in the hostpci configuration
- Pass the GPU with pcie=1 and the audio function on a separate hostpci entry
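The list above maps onto a handful of qm set options. As a hedged sketch, the function below prints the commands instead of running them. The function name, the VM ID 100, and the PCI addresses are placeholders. It also assumes the VM already has an EFI disk, which OVMF needs.

```shell
#!/bin/sh
# Sketch: print (not run) qm commands for the VM settings listed above.
# Arguments: VM ID, GPU PCI address, audio-function PCI address (all placeholders).
print_gpu_vm_commands() {
    vmid="$1"; gpu="$2"; audio="$3"
    # q35 machine type with OVMF (UEFI) firmware
    printf 'qm set %s --machine q35 --bios ovmf\n' "$vmid"
    # GPU as a PCIe device
    printf 'qm set %s --hostpci0 %s,pcie=1\n' "$vmid" "$gpu"
    # Audio function on its own hostpci entry
    printf 'qm set %s --hostpci1 %s\n' "$vmid" "$audio"
}

print_gpu_vm_commands 100 0000:01:00.0 0000:01:00.1
```

The same settings end up as machine, bios, and hostpci lines in the VM's configuration file, so reviewing the printed commands is equivalent to reviewing the resulting config.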
In September 2025, Tom’s Hardware reported a reproducible virtualization reset bug affecting NVIDIA RTX 5090 and RTX PRO 6000 cards (Blackwell architecture). CloudRift, a GPU cloud provider that encountered the issue in production, issued a $1,000 public bug bounty for a fix. Multiple Proxmox forum threads from late 2025 documented the pattern: after a guest VM shutdown, the card enters an unresponsive state where it does not respond to PCI reset, requiring a full host reboot.
A February 2026 forum thread documented the same issue on RTX 5080 (Blackwell GB203) under Proxmox VE 9.1 / kernel 6.17, with QEMU exiting immediately on VM start due to “Inappropriate ioctl for device” errors on the PCI reset attempt. No issues were reported on older cards like RTX 4090, suggesting the bug is generation-specific to Blackwell consumer GPUs.
The practical takeaway: if you’re buying new NVIDIA hardware for passthrough in 2026, RTX 40-series remains the safer choice. Blackwell consumer hardware is in the same operational position as AMD GPUs around 2020: it works sometimes, with workarounds, with notable failure modes documented in active forum threads.
AMD: the reset bug is still a real concern
The AMD reset bug is the most-discussed problem in GPU passthrough across years of forum activity. The mechanism: AMD GPUs require complex vendor-specific reset sequences involving firmware communication and power state transitions. Standard PCI reset methods (FLR, bus reset) are insufficient. When a VM shuts down or reboots, the GPU often fails to reset, leaving it in an unusable state until host reboot.
The state of workarounds in 2026:
- vendor-reset kernel module (lowell80/vendor-reset, originally gnif/vendor-reset) provides vendor-specific reset capabilities for affected AMD GPUs. Installation requires DKMS so the module rebuilds against kernel updates.
- Hookscripts that explicitly unbind and rebind the GPU around VM start/stop cycles. Forum threads document evolving scripts that handle edge cases (guest-initiated shutdown vs host-initiated stop) differently.
- Newer AMD generations (RX 9000 series Navi 48) appear less affected, but reports vary by exact card and motherboard.
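For orientation, here is what the skeleton of such a hookscript can look like. This is a dry-run sketch, not a tested workaround. Proxmox passes the VM ID and the phase (pre-start, post-start, pre-stop, post-stop) as the two arguments. The function name, the default PCI address, and the choice of amdgpu as host driver are assumptions, and the script only prints the sysfs writes it would perform.

```shell
#!/bin/sh
# Sketch of a hookscript body (dry run: it prints its plan, it changes nothing).
# Proxmox invokes hookscripts as: <script> <vmid> <phase>
# Arguments 3 and 4 (GPU address, host driver) are placeholders for illustration.
plan_for_phase() {
    phase="$2"; addr="$3"; host_driver="$4"
    dev="/sys/bus/pci/devices/$addr"
    case "$phase" in
        pre-start)
            # Release the card from the host driver and steer it to vfio-pci
            echo "write $addr to $dev/driver/unbind"
            echo "write vfio-pci to $dev/driver_override"
            echo "write $addr to /sys/bus/pci/drivers_probe"
            ;;
        post-stop)
            # Give the card back to the host driver after the VM is gone
            echo "write $addr to $dev/driver/unbind"
            echo "write $host_driver to $dev/driver_override"
            echo "write $addr to /sys/bus/pci/drivers_probe"
            ;;
        *)
            # post-start and pre-stop need no action in this sketch
            ;;
    esac
}

plan_for_phase "${1:-0}" "${2:-none}" "0000:03:00.0" "amdgpu"
```

On a real host the printed steps would become writes performed as root, and the script would be attached to the VM through its hookscript option. Whether the post-stop branch actually recovers the card is exactly what the reset bug puts in question.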
A September 2025 forum thread documented Granite Ridge iGPU (Ryzen 9000 series) passthrough hanging at “Loading Initial ramdisk” even with vendor-reset loaded. The operator noted: “the AMD reset bug is never fully eradicated. I still encounter a situation roughly every 3-5 VM reboots where a full host reboot is required to get the GPU to reinitialize correctly.”
The honest assessment: AMD GPU passthrough works, but expect to invest setup time in workarounds and to occasionally reboot the host when reset failures stack up. For a learning lab, acceptable. For a workload that must be always-available, the picture is more nuanced than it used to be: NVIDIA RTX 40-series is still the safer choice, but recent Blackwell hardware has shifted the calculus.
Intel: iGPU passthrough for transcoding
Intel iGPU passthrough is the simplest case for one specific use: hardware transcoding for Plex/Jellyfin. The pattern:
- Pass through the Intel iGPU (typically 0000:00:02.0)
- Run the transcoding workload in a VM with QuickSync access
- Keep host management via SSH (no console display needed)
This works on most modern Intel platforms with no significant reset bug issues. The catch: if the Intel iGPU is your only video output, you lose console access when it’s passed through. Plan SSH or IPMI access before attempting.
For Plex/Jellyfin specifically, the LXC /dev/dri route covered above is usually the cleaner alternative to full VFIO passthrough.
What changed in Proxmox VE 8.4 and 9.x
The cluster GPU story shifted meaningfully in 2025-2026. Proxmox VE 8.4 (released April 2025) introduced live migration support for VMs using mediated devices (NVIDIA vGPU), provided both source and destination nodes have identical hardware and driver support. A helper tool, pve-nvidia-vgpu-helper, was added to simplify NVIDIA vGPU driver setup.
What this means practically:
- Full PCIe passthrough is still mostly incompatible with HA. A VM with full GPU passthrough is still pinned to its host. This hasn’t changed.
- Mediated devices (NVIDIA vGPU) are the exception. With Proxmox VE 8.4+, a VM using vGPU can live-migrate between cluster nodes if the destination has the same NVIDIA GPU hardware, the same vGPU profile, and the same driver version. The Proxmox release notes state: “Currently, only NVIDIA GPUs are known to support live migration” in this context.
- Cluster device mapping (added in earlier 8.x releases) provides cluster-wide resource definitions for PCI devices, simplifying the configuration of mediated devices across multiple nodes.
- Practical limitation: mediated device support requires NVIDIA professional cards (V100, A100, T4, L40, etc.) and a vGPU subscription license from NVIDIA. Consumer cards (RTX 4090, RTX 5090) do not support vGPU.
For homelab operators, this changes very little: consumer GPU passthrough remains incompatible with HA. For small business deployments considering enterprise GPU virtualization, Proxmox 8.4+ has closed a significant gap with VMware vSphere capabilities in the vGPU space.
When Proxmox GPU passthrough is the wrong choice
Proxmox GPU passthrough is enthusiastically over-recommended. Be honest about whether your situation fits:
- Single-GPU systems with no fallback video. Passing through your only GPU means losing host console. If anything breaks the VM, you may need to plug in a USB-to-serial adapter or use IPMI to recover. Recovery friction is real.
- Workloads that share well. Plex transcoding, Jellyfin transcoding, Frigate AI detection often run perfectly well in LXC containers with /dev/dri device passthrough. Full VFIO is overkill for hardware acceleration that doesn’t need exclusive GPU access.
- HA cluster members (with full passthrough). GPU passthrough binds a VM to specific hardware. That VM cannot fail over to a node without identical GPU hardware in the same IOMMU layout. The exception is mediated devices (vGPU), which require professional-grade NVIDIA hardware and licensing.
- Plans to share a GPU across multiple VMs (with consumer hardware). PCIe passthrough is exclusive. Sharing requires vGPU on professional cards with subscription licensing, not a homelab afternoon project.
- Hardware without verified IOMMU isolation. Some consumer motherboards bundle the GPU with critical chipset devices in one IOMMU group. ACS override can work around it, but knowingly weakening IOMMU isolation isn’t free.
- Gaming on a host without two physical GPUs. Single-GPU passthrough for gaming (where the same card is used by host on boot, then handed to VM) is possible but requires script-driven driver unbind/rebind cycles. It works for hobbyists. It does not work for operators who want a system that “just runs.”
- Recent Blackwell consumer GPUs (RTX 5090, 5080, RTX PRO 6000). Active reset bug as of early 2026; not yet resolved upstream.
The honest rule of thumb: Proxmox GPU passthrough earns its complexity when the workload requires exclusive GPU access AND your hardware verifiably supports clean isolation. Most homelab GPU use cases don’t actually need passthrough; they need hardware acceleration, which often has simpler paths.
Patterns experienced operators avoid
Across years of forum threads and community discussions, certain cautions appear repeatedly from operators who have already learned what fails in production. These are not absolute prohibitions; they are the patterns that consistently end badly enough to be worth repeating.
- Buying hardware before checking IOMMU group reports. Verifying IOMMU isolation on the exact motherboard you intend to use, ideally before purchase, prevents weeks of frustration on a card that physically can’t be isolated.
- Building HA expectations around consumer GPUs. Consumer cards do not support live migration even with mediated devices. Any HA design that assumes “we’ll just migrate the GPU VM” on consumer hardware is built on a false premise.
- Running single-GPU gaming hosts as if they were production. The host driver unbind/rebind cycle works for hobbyists but produces edge cases that no operator wants in a production-grade setup. Console recovery scenarios surface at the worst moments.
- Mixing experimental passthrough with critical workloads on the same host. A VM that owns the GPU can crash the host through a reset bug; if that host also runs your NAS or backup target, one bad VM shutdown takes down the whole stack.
- Assuming forum success equals reproducible stability. A working configuration posted on Reddit or the Proxmox forums is a snapshot from one operator with one hardware combination on one kernel version. Reproducing it on different hardware often fails in non-obvious ways. Treat forum success as encouragement, not as proof.
- Skipping the IPMI or backup console plan. Single-GPU passthrough without alternative console access is the configuration that bites every operator at least once. Plan recovery before you need it.
The common thread: passthrough setups age poorly without explicit anticipation of failure modes. Operators who survive long-term tend to design for the bad day, not the demo.
Common Proxmox GPU passthrough failure modes
These patterns consistently break passthrough setups in community reports.
GPU not bound to vfio-pci
- Check lspci -nnk and confirm the GPU’s “Kernel driver in use” field reads vfio-pci
- Verify vendor/device IDs in /etc/modprobe.d/vfio.conf match your GPU’s actual IDs
- Confirm host GPU drivers (nvidia, nouveau, amdgpu, radeon) are blacklisted in /etc/modprobe.d/blacklist.conf
- Verify VFIO modules load early via softdep configuration before host GPU drivers
- Update initramfs (update-initramfs -u -k all) and reboot
VM fails to start or shows no display output
- Confirm OVMF (UEFI) firmware is selected when the GPU ROM supports UEFI; otherwise try SeaBIOS
- Confirm vga: none in VM config when GPU is primary
- For NVIDIA, ensure ROM file is correct or omit ROM specification entirely (modern drivers don’t need it)
- Check IOMMU group; all devices in the group must be passed through together
- Review /var/log/syslog and journalctl -k for VFIO binding errors
AMD GPU does not reset after VM shutdown
- Confirm vendor-reset module is loaded: lsmod | grep vendor
- Check reset method active: cat /sys/bus/pci/devices/0000:XX:00.0/reset_method
- Verify hookscript (if used) executes on both VM start and stop
- Check kernel logs during VM stop for “AMD_VEGA10” or similar vendor-reset entries
- As a last resort, reboot the host to restore GPU state
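The first check in this section, reading the “Kernel driver in use” field, is easy to script when you need to repeat it after every kernel or firmware change. A minimal sketch, assuming the default lspci -nnk layout (a header line starting with the bus address, followed by indented detail lines); the function name driver_in_use and the address in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch: extract "Kernel driver in use" for one device from `lspci -nnk` text on stdin.
# Typical use on a host: lspci -nnk | driver_in_use 01:00.0
driver_in_use() {
    awk -v addr="$1" '
        # Header lines start with the bus address; detail lines are indented
        /^[0-9a-f]/ { current = ($1 == addr) }
        # Print the driver name only while inside the requested device block
        current && /Kernel driver in use:/ { print $NF }
    '
}
```

For a GPU that is ready for passthrough the function prints vfio-pci. Printing nvidia, nouveau, amdgpu, or radeon means the host driver won the race at boot, and empty output means no driver is bound at all.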
Coverage scope
This article reflects Proxmox VE 8.x and 9.x behavior, kernel 6.8 / 6.11 / 6.14 passthrough patterns, and community discussions through Q2 2026. Specific operational claims (NVIDIA driver behavior, AMD reset bug status, Blackwell reset reports, IOMMU group patterns) are based on Proxmox VE official documentation and forum discussions cited above. GPU passthrough is the most hardware-specific topic in Proxmox; outcomes for your specific GPU + motherboard + kernel combination may differ from generalized patterns described here.
Detailed VFIO troubleshooting for specific GPU models, advanced hookscript engineering, and vGPU partitioning workflows are out of scope for this article.
FAQ
Do I need a dedicated GPU for the Proxmox host?
Strongly recommended, not strictly required.
For single-GPU passthrough scenarios, if you pass through your only GPU, the host has no console output. You can manage Proxmox via SSH or the web interface, but if anything breaks the VM or VFIO binding, recovery requires alternative console access (IPMI, USB-to-serial, or a temporary GPU swap). A low-end secondary GPU or working iGPU fallback usually pays for itself on the first bad recovery day.
Can I share one GPU across multiple VMs?
Not with full passthrough.
PCIe passthrough is exclusive: one GPU per VM at a time. Sharing requires mediated device support, which on NVIDIA means professional cards (V100, A100, T4, L40, etc.) plus a vGPU subscription license. As of Proxmox VE 8.4 (April 2025), mediated devices also support live migration between cluster nodes. On consumer GPUs (RTX 4090, RTX 5090), sharing is not available. The practical homelab answer: buy more GPUs, use LXC containers with /dev/dri for shared hardware acceleration, or pick workloads that don’t need exclusive GPU access.
Does GPU passthrough work with Proxmox HA?
Not in the way most homelabs imagine.
A VM with full GPU passthrough is pinned to its host because the target hardware doesn’t exist on other cluster nodes. The narrow exception is mediated devices: as of Proxmox VE 8.4, VMs using NVIDIA vGPU can live-migrate within a cluster when destination nodes have identical hardware and driver support. This requires professional NVIDIA cards and vGPU licensing, not a typical homelab setup, but a real option for small business deployments where this matters.
Is the AMD reset bug fixed in 2026?
Partially, but the picture is more nuanced than it used to be.
Newer AMD generations have better reset behavior than older Vega/Polaris cards, and the vendor-reset module covers many affected models. However, Granite Ridge iGPUs (Ryzen 9000 series) and some specific cards still exhibit reset failures requiring host reboot. Community consensus: AMD GPU passthrough is workable but expect to invest setup time in workarounds. The newer twist is that recent NVIDIA Blackwell consumer cards (RTX 5090, 5080, RTX PRO 6000) have introduced their own reset bug, so the historical “NVIDIA is safer” narrative now needs hardware-specific qualification.
Is NVIDIA Code 43 still a problem?
For drivers version 465 and later on RTX 30/40-series consumer cards, no. The hypervisor-detection workarounds (kvm=off, hv_vendor_id tricks) documented in 2020-2022 era guides are largely unnecessary on modern drivers. Note that this fix is about Code 43 specifically; it does not address the separate Blackwell reset bug emerging on RTX 5090 / 5080 / RTX PRO 6000 in 2025-2026.
Can I do single-GPU passthrough for gaming?
Possible, but operationally fragile.
It requires hookscripts that unbind the GPU from host drivers before VM start and rebind it after VM stop. The setup is well-documented, but driver state issues, console recovery scenarios, and reset bug interactions all become routine concerns. For a hobbyist setup focused on learning, that is fine. For a system that needs to “just work,” consider a second physical GPU or stick to LXC for workloads that fit there.
Is LXC with /dev/dri easier than full GPU passthrough?
Yes, by a wide margin.
For Plex, Jellyfin, Frigate, and other workloads that need hardware acceleration but not exclusive GPU access, an LXC container with /dev/dri device passthrough requires no IOMMU configuration, no VFIO binding, no kernel parameters, and no reset bug concerns. The container shares the host’s GPU drivers and accesses hardware acceleration through the standard /dev/dri device interface. Many homelab use cases that get marketed as “GPU passthrough scenarios” are actually better served by LXC.
What’s the minimum hardware for reliable GPU passthrough?
Practical minimums: a CPU with documented IOMMU support (Ryzen 5000+ or Intel 10th gen+), a motherboard with verified clean IOMMU group separation for the PCIe slot you’ll use, a discrete GPU with current vendor drivers (preferably RTX 30/40-series for NVIDIA at this time), and either a secondary GPU for the host or accepted SSH-only management. Below this baseline, expect to spend more time fighting configuration than using the GPU.
Can I use GPU passthrough for AI/ML workloads?
One of the strongest use cases.
A consumer GPU (RTX 4070/4080/4090) passed through to a Linux VM running PyTorch, TensorFlow, or local LLM inference works well and delivers near-native performance. For AI workloads, storage choice matters too; see Proxmox Storage: ZFS, LVM, Ceph Decisions for how dataset locality affects training throughput. RTX 5090 is technically capable but currently affected by the Blackwell reset bug, so it’s a less reliable choice in early 2026. Be aware that the VM holding the GPU cannot be migrated under full passthrough, and Plex/transcoding workloads needing GPU access at the same time require a second card.
Final thoughts
Proxmox GPU passthrough is the most hardware-dependent feature in the entire stack. Done right, it transforms a homelab into a flexible compute platform: gaming, AI workloads, professional graphics, and dense Plex transcoding all running on one box. Done wrong, it consumes weekends debugging IOMMU groups, fighting reset bugs, and recovering from console-loss scenarios.
For the operator deciding whether to attempt passthrough, the questions worth answering honestly:
- Does this workload genuinely need exclusive GPU access, or does it just need hardware acceleration LXC can deliver?
- Do I have verified IOMMU group isolation for the GPU on my specific motherboard?
- Do I have a host video fallback (secondary GPU, onboard graphics, IPMI) in case the passthrough breaks?
- Is my GPU on a generation with active known issues (recent Blackwell, older AMD Vega)?
- Am I willing to invest setup time in vendor-specific workarounds (vendor-reset for AMD, OVMF tuning for NVIDIA)?
If those answers align favorably, GPU passthrough is one of the most rewarding capabilities Proxmox offers. If they don’t, simpler patterns (LXC with /dev/dri, dedicated workload boxes, or workloads that don’t need GPU acceleration) produce better real-world reliability than partial passthrough implementations.
The reliable Proxmox setups are the ones where operators chose features that matched their hardware. Proxmox GPU passthrough rewards homework done before the hardware was purchased.