Unprivileged root via a use-after-free in DRM GEM change_handle (CVE-2026-46215)

Reported to security@kernel.org, 12 April 2026 · Fixed in mainline (5e28b7b), May 2026, Cc: stable · Fixes 53096728 · Affected: v6.18-rc1 through the fix · CVE-2026-46215 (reported independently; credited to another researcher, see the disclosure section)

Summary

A use-after-free in the DRM GEM core ioctl DRM_IOCTL_GEM_CHANGE_HANDLE lets any local user with access to a render node escalate to root. drm_gem_change_handle_ioctl() moves a GEM object from one handle to another, but it never adjusts the object’s handle_count. For a short window the object has two IDR entries while its handle count still reads 1, and a concurrent DRM_IOCTL_GEM_CLOSE on the old handle drives that count to 0 and frees the object while the new handle is still pointing at it. The dangling handle is the use-after-free.

Both ioctls are flagged DRM_RENDER_ALLOW, so anyone who can open /dev/dri/renderD* can reach the bug. On every major desktop distribution systemd-logind grants the active session read/write access to that node by default, so this is reachable by an ordinary logged-in user with no special privileges. The chain I wrote takes the freed object, reclaims its slab slot with a sprayed pipe_buffer array, leaks a kernel pointer to defeat KASLR, sets PIPE_BUF_FLAG_CAN_MERGE to bypass the DirtyPipe fix, and overwrites a read-only /etc/passwd through the page cache. The result is passwordless root, around 99% of the time, from an unprivileged process.

Puttimet Thammasaeng reported this same bug and got there first, so the upstream Reported-by credit and CVE-2026-46215 are theirs. I found and reported it independently, and what follows is my own analysis and exploit. More on that in the disclosure section.

An unprivileged local attack surface

Most interesting kernel bugs need some privilege to reach. This one does not. DRM render nodes exist precisely so that unprivileged clients (your compositor, your browser’s GPU process, anything doing GPU work) can submit commands without going through the privileged primary node. systemd-logind hands the active console user an ACL on /dev/dri/renderD128 at login, and both of the ioctls in this race carry the DRM_RENDER_ALLOW flag, meaning they are explicitly allowed on the render node.

So the threat model is the strongest one a local bug can have: a normal user session on a stock desktop, no capabilities, no setuid helper, no container escape required. If you are logged in, you can open the node, and if you can open the node, you can run the race.
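A quick way to see whether a given session is exposed is to check if it can open the render node at all. The sketch below does only that; the function name render_node_reachable is my own for illustration, and the path you pass in (for example /dev/dri/renderD128) depends on the machine.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Returns 1 if the calling user can open `path` read/write, 0 otherwise.
 * Hypothetical helper for illustration: it opens and immediately closes
 * the node and issues no ioctls.
 */
int render_node_reachable(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return 0;   /* missing node, or no ACL for this user */

    close(fd);
    return 1;
}
```

A result of 1 means only that the attack surface is reachable from this session, not that the kernel is vulnerable; that depends on the kernel version.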

The change_handle ioctl

GEM (Graphics Execution Manager) objects are the buffer objects the DRM subsystem hands out to userspace. A process refers to them by integer handles stored in a per-file IDR (file_priv->object_idr). Each handle holds a reference to the object, tracked separately from the object’s kernel refcount by a field called handle_count. When the last handle goes away, handle_count hits 0 and the handle-count reference on the object is dropped, which is what eventually frees it.

The handle lifecycle is normally driven through a small set of helpers that keep handle_count and the IDR in step: drm_gem_handle_create_tail() to publish a handle (which calls drm_gem_object_handle_get() to bump the count), and drm_gem_handle_delete() to remove one (which calls drm_gem_object_release_handle() and drm_gem_object_handle_put_unlocked() to drop it). These helpers exist because getting the ordering wrong is easy and the consequences are exactly the kind of refcount bug you would expect.

DRM_IOCTL_GEM_CHANGE_HANDLE (ioctl nr 0xD2) is newer. It was added in v6.18-rc1 by commit 53096728b891 for AMD’s CRIU work, so that a checkpoint/restore can reassign a GEM object to a specific handle number. It does something none of the existing helpers do: it moves an object from one handle to another in place. And in doing so, it open-codes the handle manipulation instead of going through the lifecycle helpers.

The bug: a refcount that never moves

Stripped down, the ioctl looks up the object, inserts a new IDR entry, removes the old one, and drops its lookup reference:

/* drm_gem_change_handle_ioctl(), simplified, pre-fix */

obj = drm_gem_object_lookup(file_priv, args->handle);   /* +1 lookup ref */

spin_lock(&file_priv->table_lock);
idr_alloc(&file_priv->object_idr, obj, new_handle, ...); /* new entry */
spin_unlock(&file_priv->table_lock);

/* ... prime bookkeeping under prime.lock ... */

spin_lock(&file_priv->table_lock);
idr_remove(&file_priv->object_idr, args->handle);       /* old entry */
spin_unlock(&file_priv->table_lock);

drm_gem_object_put(obj);                                 /* -1 lookup ref */

It never calls drm_gem_object_handle_get() for the new handle, and never calls drm_gem_object_handle_put_unlocked() for the old one. handle_count stays at 1 across the whole operation. That is fine in isolation, since one handle goes away as another appears, so the count is genuinely still 1 at the end. The problem is the middle.

Between the idr_alloc and the idr_remove, the object has two live IDR entries, the old handle and the new one, while handle_count reads 1. The table_lock spinlock is dropped between the individual IDR operations, and nothing serializes this compound sequence against the compound sequence in drm_gem_handle_delete(). So a second thread can call GEM_CLOSE on the old handle inside that window. The close path removes its IDR entry and runs drm_gem_object_release_handle(), which decrements handle_count from 1 to 0, drops the handle-count reference, and frees the object. The new IDR entry is left pointing at freed memory.

It is worth being specific about the locks, because the gap between them is the whole bug. change_handle holds prime.lock across the operation but takes table_lock only around each individual IDR call, dropping it in between. drm_gem_handle_delete() takes table_lock for its idr_replace, then object_name_lock for the handle_count decrement. No single lock is held across both compound sequences, so they interleave freely, and the close can slot its free right in between change_handle’s two IDR operations.
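The interleaving is easier to follow as a single-threaded toy model that replays the steps in the losing order. This is a sketch in plain userspace C, not kernel code: the toy_ names are invented for illustration, the "free" is a flag so the model can inspect the object afterwards, and locking is left out because the point is only the order of the steps.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8

struct toy_obj {
    int handle_count;   /* models drm_gem_object.handle_count */
    int freed;          /* set instead of really freeing the object */
};

struct toy_file {
    struct toy_obj *idr[TOY_SLOTS];   /* models file_priv->object_idr */
};

/* Pre-fix move, step 1: publish the new entry, count untouched. */
void toy_move_insert(struct toy_file *f, int old_h, int new_h)
{
    assert(old_h >= 0 && old_h < TOY_SLOTS);
    assert(new_h >= 0 && new_h < TOY_SLOTS);
    f->idr[new_h] = f->idr[old_h];
}

/* Pre-fix move, step 2: drop the old entry, count untouched. */
void toy_move_remove(struct toy_file *f, int old_h)
{
    assert(old_h >= 0 && old_h < TOY_SLOTS);
    f->idr[old_h] = NULL;
}

/* Close path: remove the entry, drop the handle count, "free" at 0. */
void toy_close(struct toy_file *f, int h)
{
    assert(h >= 0 && h < TOY_SLOTS);
    struct toy_obj *o = f->idr[h];
    if (o == NULL)
        return;
    f->idr[h] = NULL;
    if (--o->handle_count == 0)
        o->freed = 1;
}

/* Replays the racy order; returns 1 if a handle is left on a freed object. */
int toy_interleaving_dangles(void)
{
    struct toy_obj obj = { .handle_count = 1, .freed = 0 };
    struct toy_file f = { { NULL } };
    f.idr[1] = &obj;

    toy_move_insert(&f, 1, 2);   /* thread A: two entries, count still 1 */
    toy_close(&f, 1);            /* thread B: count 1 -> 0, object freed */
    toy_move_remove(&f, 1);      /* thread A: old entry is already gone  */

    return f.idr[2] == &obj && obj.freed;
}
```

If toy_close runs before toy_move_insert or after toy_move_remove instead, the model never ends up with a live entry on a freed object, which is why the window between the two IDR operations is the whole bug.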

Figure: the move leaves two IDR entries while handle_count is 1; a racing close frees the object out from under the new handle.

The freed object is a kmalloc-512 allocation (the GEM object backing virtio-gpu and nouveau buffers lands in that cache). On a 2-core VM the race wins consistently within about 100 iterations, and the calibration logic in the PoC measures the two ioctls’ solo latencies and staggers the threads so they collide on table_lock, which pushes the win rate up further.

Reclaiming the object

Once the object is freed but still reachable through the dangling handle, the next step is to put something useful in its place. The target is struct pipe_buffer, whose array also lands in kmalloc-512 when a pipe is resized to eight slots: an eight-element pipe_buffer array at 40 bytes per element is 320 bytes, which falls in the 512-byte cache.
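The size-class arithmetic can be checked with a few lines of C. This is a sketch under assumptions: the table lists the general-purpose kmalloc size classes I would expect on a typical x86-64 config, the 40-byte element size is the pahole figure used in this post, and both function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* General-purpose kmalloc size classes on a typical x86-64 config (assumed). */
static const size_t kmalloc_classes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Returns the smallest size class that fits `size`, or 0 if none does. */
size_t kmalloc_class_for(size_t size)
{
    assert(size > 0);
    size_t n = sizeof(kmalloc_classes) / sizeof(kmalloc_classes[0]);
    for (size_t i = 0; i < n; i++)
        if (size <= kmalloc_classes[i])
            return kmalloc_classes[i];
    return 0;
}

/* Bytes needed for a pipe_buffer array with `slots` 40-byte elements. */
size_t pipe_array_bytes(size_t slots)
{
    return slots * 40;
}
```

With these numbers an eight-slot array (320 bytes) maps to the 512-byte class, while a sixteen-slot array (640 bytes) maps to the 1024-byte class, so the slot count has to be chosen to match the freed object's cache.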

To make the reclaim reliable I first condition the cache with a msg_msg spray (allocate a few hundred 512-byte System V messages, then free a handful, LIFO, so the freed object’s slot is near the top of the freelist), then spray pipes whose buffer arrays drop into the hole. After that, the dangling GEM handle and a pipe_buffer array alias the same 512 bytes.

Leaking a pointer: the struct overlap

With the two structures laid over each other, the fields line up in a way that hands over both halves of the exploit. The numbers come from pahole and are stable across 6.18 to 7.0:

#define GEM_SIZE_OFF        216   /* drm_gem_object.size            */
#define GEM_NAME_OFF        224   /* drm_gem_object.name            */
#define PIPEBUF_SIZE_ACTUAL 40    /* sizeof(struct pipe_buffer)     */
#define OVERLAP_IDX         5
#define PIPEBUF_OPS_OFF     16    /* pipe_buffer.ops   -> 5*40+16 = 216 */
#define PIPEBUF_FLAGS_OFF   24    /* pipe_buffer.flags -> 5*40+24 = 224 */

The object’s size field at offset 216 sits exactly on top of pipe_buf[5].ops, and the object’s name field at offset 224 sits exactly on top of pipe_buf[5].flags. The ops field points to anon_pipe_buf_ops, a const structure in the kernel image (strictly .rodata rather than .text), so reading it back gives the address of a known kernel symbol and therefore the KASLR base. And the driver hands it to me for free: virtio-gpu’s RESOURCE_INFO ioctl (and nouveau’s GEM_INFO) returns the GEM object’s size field, which is now pipe_buf[5].ops.
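The offset arithmetic is easy to double-check in userspace. The sketch below mirrors the pipe_buffer layout with fixed-width types; the struct and function names are mine, the field order is what I would expect from pahole on a 64-bit build, and the static assertions only hold on an LP64 target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of struct pipe_buffer on 64-bit (assumed layout). */
struct pipe_buffer_mirror {
    uint64_t page;       /* struct page *                      */
    uint32_t offset;
    uint32_t len;
    uint64_t ops;        /* const struct pipe_buf_operations * */
    uint32_t flags;
    uint64_t private_;   /* unsigned long private              */
};

static_assert(sizeof(struct pipe_buffer_mirror) == 40, "element size");
static_assert(offsetof(struct pipe_buffer_mirror, ops) == 16, "ops offset");
static_assert(offsetof(struct pipe_buffer_mirror, flags) == 24, "flags offset");

/* Maps a byte offset in the slab slot to (array index, offset in element). */
void locate_in_pipe_array(size_t slot_off, size_t *idx, size_t *field_off)
{
    *idx = slot_off / sizeof(struct pipe_buffer_mirror);
    *field_off = slot_off % sizeof(struct pipe_buffer_mirror);
}
```

Feeding in 216 gives index 5 at field offset 16 (ops), and 224 gives index 5 at field offset 24 (flags), matching the defines above.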

Figure: size@216 lands on pipe_buf[5].ops (the leak); name@224 lands on pipe_buf[5].flags (the CAN_MERGE write).

Bypassing the DirtyPipe fix

DirtyPipe (CVE-2022-0847) abused a stale PIPE_BUF_FLAG_CAN_MERGE flag to write into the page cache of a read-only file. The fix made sure the flag is always initialized, so you can no longer find it set by accident. But here I am not relying on a stale flag, I am setting it on purpose through the overlap.

PIPE_BUF_FLAG_CAN_MERGE is 0x10, which is 16. GEM names are handed out as small sequential integers from an IDR. So if I pre-allocate names 1 through 15 with throwaway FLINKs, then FLINK the dangling handle, it gets name 16. That value is written into the object’s name field, which is pipe_buf[5].flags, so flags becomes 0x10 and CAN_MERGE is set on a live pipe buffer:

/* pre-stage names 1..15 so the dangling handle's FLINK gets name 16 */
for (i = 0; i < 15; i++) {
        h = create_gem_bo(fd);
        ioctl(fd, DRM_IOCTL_GEM_FLINK, &(struct drm_gem_flink){ .handle = h });
}

struct drm_gem_flink fl = { .handle = dangling };
ioctl(fd, DRM_IOCTL_GEM_FLINK, &fl);     /* fl.name == 16 == 0x10 */
/* writes 16 into gem.name @224 == pipe_buf[5].flags -> PIPE_BUF_FLAG_CAN_MERGE */
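
Why 16 specifically comes down to one bit. The sketch below uses the flag value quoted above and two helper names I made up for illustration; it contains no DRM calls and only checks which small integers have that bit set.

```c
#include <assert.h>

#define PIPE_BUF_FLAG_CAN_MERGE 0x10u   /* value as quoted in the text */

static_assert(PIPE_BUF_FLAG_CAN_MERGE == 16, "0x10 is 16");

/* Returns 1 if a flags word has CAN_MERGE set, 0 otherwise. */
int has_can_merge(unsigned int flags)
{
    return (flags & PIPE_BUF_FLAG_CAN_MERGE) != 0;
}

/* Smallest positive name that, written over flags, sets CAN_MERGE. */
unsigned int first_name_with_can_merge(void)
{
    for (unsigned int name = 1; ; name++)
        if (has_can_merge(name))
            return name;
}
```

Names 1 through 15 never touch bit 4, so 16 is the first value that works, and it is also the only one in the low range that sets CAN_MERGE without setting any other flag bit.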

From there it is the DirtyPipe ending. I splice one byte of a read-only target file into the spray pipes so each one is anchored to a page-cache page, then write through them. The buffer with CAN_MERGE set merges my data into the cached page of the file, overwriting it in memory without the page being marked dirty and without the write ever passing the filesystem’s permission checks.

The exploit chain

1. Race GEM_CHANGE_HANDLE against GEM_CLOSE to free the object while the new handle still references it.
2. Reclaim the kmalloc-512 slot with a sprayed pipe_buffer array (msg_msg feng shui, then splice-filled pipes).
3. Read size back through the driver info ioctl. It is pipe_buf[5].ops, a pointer into the kernel image, which gives the KASLR base.
4. FLINK the dangling handle so its name (16 = 0x10) lands on pipe_buf[5].flags and sets PIPE_BUF_FLAG_CAN_MERGE.
5. Write through the spliced pipes to merge attacker data into the page cache of read-only /etc/passwd. The root line loses its password field.

The leak makes the chain deterministic rather than a guess: the address comes back at runtime, so the chain also works with KASLR enabled, and only the leaked value changes per boot. The demo pins it with nokaslr only to make the output stable to read.

Evidence

The driver-agnostic KASAN trigger races the two ioctls and then closes the fd to dereference the dangling handle. On Linux 7.0-rc7 with CONFIG_KASAN=y under QEMU with virtio-gpu it reports:

BUG: KASAN: slab-use-after-free in drm_gem_object_release_handle+0x24/0x100
Read of size 8 at addr ffff888104769d60 by task kasan_trigger/75

Allocated by task 75:
  virtio_gpu_create_object -> __drm_gem_shmem_create ->
  virtio_gpu_mode_dumb_create -> drm_mode_create_dumb_ioctl
Freed by task 39:
  kfree -> virtio_gpu_dequeue_ctrl_func -> process_one_work

The buggy address belongs to the cache kmalloc-512 of size 512

The full chain runs on a kernel without KASAN, because KASAN’s quarantine delays reuse of the freed slot and so blocks the pipe-spray reclaim. There it lands root reliably: over 100 fresh boots it succeeded 99 times. The PoC retries internally: on a bad leak it re-races for a fresh dangling object instead of re-spraying a dead slot, up to 200 rounds of a few tens of milliseconds each. Most boots win on the first race. The roughly 1% of failures are the inherent downside of a kernel race UAF: a losing interleaving wedges the VM, which then needs a power cycle to recover.

[!] Race won (iter 977): handle=132049
[!] KASLR: pipe_buf_ops = 0xffffffff82428400
[!] FLINK: 16 = 0x10
[*] /etc/passwd:
    root::0:0:pwned:/root:/bin/sh
[!] LPE CONFIRMED

The fix

The shipped fix keeps change_handle mostly as it was, but stops the new handle from ever pointing at a live object across the racy window. It allocates the new slot, replaces it with NULL straight away, and only commits the real object once the prime bookkeeping is done. If a concurrent close got in first, it notices and backs out:

ret = idr_alloc(&file_priv->object_idr, obj, handle, handle + 1, GFP_NOWAIT);
if (ret < 0) { ... }

idrobj = idr_replace(&file_priv->object_idr, NULL, handle);
if (idrobj != obj) {
        /* a concurrent close already took this slot */
        idr_replace(&file_priv->object_idr, idrobj, handle);
        idr_remove(&file_priv->object_idr, args->new_handle);
        ret = -ENOENT;
        goto out_unlock;
}

What I suggested, and what shipped

The more interesting part for me was the gap between the fix I sent in and the one that actually landed, because they close the same bug in pretty different ways.

My instinct was to make change_handle stop being a special case. The whole reason the bug exists is that it open-codes handle movement instead of going through the lifecycle helpers, so my patch rebuilt it around them: take the handle-count reference with drm_gem_object_handle_get(), publish the new handle the create_tail way (idr_alloc(NULL), then idr_replace(obj) once it is fully set up), wire up the per-handle bits create_tail does (drm_vma_node_allow(), obj->funcs->open()), and tear the old handle down through drm_gem_handle_delete()’s path. Basically, make it look like every other handle operation in the file. It works, but it is a lot of change for a small bug.

What David Francis and Dave Airlie shipped is smaller and, honestly, nicer. Instead of rebuilding the whole operation, they just close the window: the two-stage idr_replace above, so the new slot is never live across the prime work, plus a clean backout if a close races in. Their commit explains why the fuller create_tail machinery is not needed here, which is the bit I had not thought through. create_tail has to allocate its IDR slot as NULL first because something else can race it during setup. In change_handle the only thing that can race during the operation is drm_prime bookkeeping, and the ioctl already holds the prime lock the whole time, so it can take a shortcut create_tail cannot. Same reading of the bug, far less churn to fix it. I reached for the big hammer; they found the small one.

And then they went further than either fix. change_handle is being disabled outright in 7.1. It is a niche interface, added a few releases earlier for AMD’s CRIU/ROCm work, that had turned into a repeat offender for races, so rather than keep hardening it the maintainers just took it out. Removing the interface is the most complete fix there is, and it is the right call.

Disclosure timeline

I sent the report to security@kernel.org on 12 April 2026 with the analysis, a KASAN trigger, the full LPE proof of concept, and a suggested fix. Willy Tarreau acknowledged it on 13 April and forwarded it to the DRM maintainers (Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann).

The fix was authored on 28 April by David Francis at AMD, with Dave Airlie, and went through more than one revision before the working version (most of it off-list, given the security angle). It landed in mainline and stable in late May 2026: upstream 5e28b7b94408, stable 672464dd5323, tagged Fixes: 53096728b891 and Cc’d to stable. It carries Reported-by: Puttimet Thammasaeng.

That credit is the honest bit to spell out: Puttimet reported the same bug and got there first, so the Reported-by and CVE-2026-46215 are theirs, fair and square. I found it independently and reported it on my own, and what I have to show for it is the analysis and the exploit chain in this post, which is plenty for me. Fixed versions are 6.18.32, 7.0.9, and 7.1-rc3 onward.
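For triage, the version ranges in this post can be turned into a small classifier. This is a sketch that uses only the versions named here: the enum and function names are hypothetical, it looks at final upstream release numbers only (so -rc kernels and distribution backports are out of scope), and it reports UNKNOWN for series I have no fixed version for.

```c
#include <assert.h>

enum cve_status { NOT_AFFECTED, AFFECTED, FIXED, UNKNOWN };

/* Classifies an upstream release major.minor.patch for CVE-2026-46215. */
enum cve_status classify_release(int major, int minor, int patch)
{
    assert(major >= 0 && minor >= 0 && patch >= 0);

    if (major < 6 || (major == 6 && minor < 18))
        return NOT_AFFECTED;                    /* ioctl added in v6.18-rc1 */
    if (major == 6 && minor == 18)
        return patch >= 32 ? FIXED : AFFECTED;  /* fixed in 6.18.32 */
    if (major == 7 && minor == 0)
        return patch >= 9 ? FIXED : AFFECTED;   /* fixed in 7.0.9 */
    if (major > 7 || (major == 7 && minor >= 1))
        return FIXED;                           /* fix landed by 7.1-rc3 */
    return UNKNOWN;                             /* e.g. 6.19.x: not listed here */
}
```

A distribution kernel can carry the fix under an older version number, so treat an AFFECTED result as a prompt to check the vendor changelog rather than as a verdict.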

A broader pattern

The shape of this bug is a compound non-atomic operation on a refcounted object: two or more steps that must happen together (insert new reference, remove old reference, adjust the count) are split across separate lock acquisitions, and a concurrent teardown can interleave between them. It is a TOCTTOU on a refcount. The check (handle_count == 1, so one handle exists) is true at the start of the sequence, but the sequence itself creates a transient state (two IDR entries, count still 1) that makes the check’s assumption false, and nothing prevents another thread from acting on that transient state.

The kernel has this pattern everywhere. Any subsystem that tracks object ownership through a count and a lookup structure (an IDR, an XArray, an rbtree) has a set of helpers that keep the two in sync under the right locks: drm_gem_handle_create_tail and drm_gem_handle_delete in DRM, idr_alloc/idr_remove paired with kref_get/kref_put in generic code, fget/fput and fd_install/close_fd in the file descriptor table. The helpers exist because the compound operation is easy to get wrong, and they encode the ordering and locking discipline that makes it safe. Whenever a new operation open-codes the same manipulation instead of going through them, it is reimplementing that discipline from scratch, and any gap between steps is a window for a concurrent close to free the object out from under a live reference.
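One way to state the discipline those helpers encode is as an invariant that can be asserted: the count on an object equals the number of lookup entries that point at it. The sketch below is a generic userspace illustration with invented names (cnt_obj, cnt_entries_for, cnt_invariant_holds), not an API from any kernel subsystem.

```c
#include <assert.h>
#include <stddef.h>

struct cnt_obj {
    int handle_count;   /* references held by lookup entries */
};

/* Counts how many of the `n` table entries point at `o`. */
int cnt_entries_for(struct cnt_obj *const *table, size_t n,
                    const struct cnt_obj *o)
{
    assert(table != NULL && o != NULL);
    int hits = 0;
    for (size_t i = 0; i < n; i++)
        if (table[i] == o)
            hits++;
    return hits;
}

/* Invariant kept by the helpers: one count per published entry. */
int cnt_invariant_holds(struct cnt_obj *const *table, size_t n,
                        const struct cnt_obj *o)
{
    return cnt_entries_for(table, n, o) == o->handle_count;
}
```

In these terms, the transient state inside change_handle is two entries against a count of 1, so the invariant is false for as long as the window stays open.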

So the thing to grep for is not the bug itself but the precondition: a new ioctl, syscall, or callback that manipulates a refcounted object’s lookup entry and its count in separate steps, outside the helpers the subsystem already provides. change_handle is a textbook case. It took table_lock around each IDR call individually, dropped it in between, and never touched handle_count at all. Two IDR operations with the lock dropped between them was enough to free an object out from under a live handle and turn opening a render node into root.

Credit where it is due: to Puttimet for getting there first, to David Francis and Dave Airlie for a cleaner fix than mine and for deciding to retire the ioctl rather than keep patching it, and to the kernel security folks for the quick turnaround. The race is fixed in 6.18.32, 7.0.9 and 7.1-rc3, and change_handle is gone in 7.1. Proof of concept: github.com/0xCyberstan/CVE-2026-46215-POC.