VDCores: Resource Decoupled Programming and Execution for Asynchronous GPU

Zijian He; Adrian Sampson; Yiying Zhang; Zhiyuan Guo

arXiv:2605.03190·cs.DC·May 6, 2026

TL;DR

VDCores introduces a resource decoupling model for asynchronous GPUs, improving hardware utilization and decoding throughput while reducing programming effort, demonstrated on multiple GPU architectures.

Contribution

The paper proposes a decoupled programming and execution abstraction for asynchronous GPUs that enables better resource utilization and performance.

Findings

01. Average decoding throughput improved by 24% across workloads.

02. Up to 77% throughput increase under dynamic inputs.

03. Kernel programming effort reduced by 90%.

Abstract

Modern GPUs increasingly rely on specialized and asynchronous hardware units to deliver high performance. Yet these units are often underutilized because today's GPU software stacks still organize programming and execution around a monolithic kernel model that mismatches asynchronous hardware. To address this issue, Virtual Decoupled Cores (VDCores) presents a new decoupled programming and execution model for asynchronous GPUs. VDCores abstracts asynchronous hardware execution units as resource-isolated virtual cores and represents workloads as dependency-connected micro-operations (micro-ops). This abstraction removes static orchestration from the programmer, enables automatic overlap of memory and compute based on dependency and resource readiness, and thereby improves utilization of asynchronous hardware resources. Realizing such a decoupled abstraction efficiently on today's…
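
The idea of dependency-connected micro-ops running on separate virtual cores can be illustrated with a small host-side simulation. The sketch below is not the VDCores implementation or its API. It is a toy list scheduler in Python, and the names `MicroOp`, `schedule`, the core labels `"mem"` and `"compute"`, and the fixed per-op cycle counts are all assumptions made for illustration. Each micro-op starts as soon as its dependencies have finished and its virtual core is free, so a load for the next tile can overlap with compute on the current one without the programmer writing that overlap by hand.

```python
from dataclasses import dataclass


@dataclass
class MicroOp:
    name: str
    core: str          # hypothetical virtual core label, e.g. "mem" or "compute"
    cycles: int        # assumed fixed latency in abstract time units
    deps: tuple = ()   # names of micro-ops that must finish first


def schedule(ops):
    """Greedy list scheduling driven by dependency and core readiness."""
    finish = {}      # micro-op name -> finish time
    core_free = {}   # core label -> time at which the core becomes idle
    timeline = []    # (name, core, start, end) in issue order
    pending = list(ops)

    def start_time(op):
        deps_done = max((finish[d] for d in op.deps), default=0)
        return max(deps_done, core_free.get(op.core, 0))

    while pending:
        # A micro-op is ready once every dependency has been scheduled.
        ready = [op for op in pending if all(d in finish for d in op.deps)]
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        op = min(ready, key=start_time)  # issue whichever can start earliest
        start = start_time(op)
        finish[op.name] = start + op.cycles
        core_free[op.core] = finish[op.name]
        timeline.append((op.name, op.core, start, finish[op.name]))
        pending.remove(op)

    return timeline, max(finish.values(), default=0)


# Two tiles: each tile is a load followed by a compute step on that tile.
ops = [
    MicroOp("load0", "mem", 4),
    MicroOp("mm0", "compute", 4, deps=("load0",)),
    MicroOp("load1", "mem", 4),
    MicroOp("mm1", "compute", 4, deps=("load1",)),
]

timeline, makespan = schedule(ops)
serial = sum(op.cycles for op in ops)  # cost if nothing overlapped

for name, core, start, end in timeline:
    print(f"{name:6s} on {core:8s} [{start:2d}, {end:2d})")
print("overlapped:", makespan, "serialized:", serial)
```

In this toy run the second load executes on the memory core while the first compute step occupies the compute core, so the four micro-ops finish in 12 time units instead of the 16 a fully serialized order would take. The program only declares dependencies and core types; the overlap falls out of the readiness rule.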

Code & Models

Repositories

vdcores/vdcores (GitHub)
