TTP: A Hardware-Efficient Design for Precise Prefetching in Ray Tracing

Yavuz Selim Tozlu; Anshul Naithani; Huiyang Zhou

arXiv:2605.16253·cs.AR·May 18, 2026

TL;DR

This paper introduces TTP, a hardware prefetcher that reduces memory latency in ray tracing by leveraging the traversal stacks that already exist in the RT units, achieving a 1.48x average speedup and 98.92% average L1 prefetch accuracy.

Contribution

The paper presents a novel hardware prefetcher, TTP, that uses existing traversal stacks for accurate prefetching in ray tracing, improving performance with minimal overhead.

Findings

01 Achieves 1.48x average speedup in ray tracing workloads.

02 Provides 98.92% average L1 accuracy in prefetching.

03 Reduces L1 cache misses by 31.54% compared to baseline.

Abstract

Ray tracing (RT) is a 3D graphics technique that offers highly realistic visuals. It is becoming prominent and accessible as GPU vendors have integrated dedicated ray tracing acceleration hardware. However, tracing millions of rays in real time through 3D scenes that consist of large numbers of triangles is challenging and requires expensive hardware. The main bottleneck in RT workloads is the expensive traversal of the Bounding Volume Hierarchy (BVH), a large tree structure that encodes the 3D scene. BVH traversal is a memory-bound problem, as the GPU threads spend most of their time reading tree node data from memory. In this work, we attack the memory latency bottleneck of ray tracing through prefetching. We propose a novel hardware prefetcher, named Tree Traversal Prefetcher (TTP), for ray tracing. The main idea is to leverage the existing tree traversal stack in the RT units…
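The abstract is cut off before it describes the mechanism, so the following is only a toy illustration of why a traversal stack is a natural source of precise prefetch addresses, not the design from the paper. It assumes a hypothetical policy that prefetches the top `prefetch_depth` entries of the stack after each node visit, an unbounded cache, zero prefetch latency, and a ray that intersects every child. The names `Node`, `build_complete_tree`, `traverse`, and `prefetch_depth` are all made up for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    addr: int
    children: list = field(default_factory=list)  # addresses of child nodes


def build_complete_tree(depth, fanout=2):
    """Build a complete tree as a dict addr -> Node; the root has address 0."""
    nodes = {0: Node(0)}
    frontier = [0]
    next_addr = 1
    for _ in range(depth):
        new_frontier = []
        for parent in frontier:
            for _ in range(fanout):
                nodes[next_addr] = Node(next_addr)
                nodes[parent].children.append(next_addr)
                new_frontier.append(next_addr)
                next_addr += 1
        frontier = new_frontier
    return nodes


def traverse(nodes, root=0, prefetch_depth=0):
    """Depth-first traversal driven by an explicit stack.

    Returns (demand_misses, prefetches_issued, useful_prefetches).
    Assumptions of this toy model: unbounded cache, prefetches complete
    instantly, and every child is pushed (the ray "hits" every box).
    """
    cache = set()        # addresses currently cached
    prefetched = set()   # cached addresses brought in by a prefetch, not yet used
    stack = [root]
    misses = issued = useful = 0
    while stack:
        addr = stack.pop()
        if addr in cache:
            if addr in prefetched:
                useful += 1
                prefetched.discard(addr)
        else:
            misses += 1
            cache.add(addr)
        # Push children so that the first child is visited next.
        stack.extend(reversed(nodes[addr].children))
        # Hypothetical policy: prefetch the topmost pending stack entries.
        pending = stack[-prefetch_depth:] if prefetch_depth > 0 else []
        for p in pending:
            if p not in cache:
                cache.add(p)
                prefetched.add(p)
                issued += 1
    return misses, issued, useful


if __name__ == "__main__":
    tree = build_complete_tree(depth=3)
    print("no prefetch:  ", traverse(tree, prefetch_depth=0))
    print("stack prefetch:", traverse(tree, prefetch_depth=1))
```

In this idealized setting every address on the stack is visited later, so every prefetch is useful and only the root remains a demand miss. Real hardware has finite caches, nonzero latency, and rays that can abandon pending stack entries, so these counts say nothing about the figures reported in the paper.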
