TL;DR
CuRast introduces a CUDA-based software rasterizer capable of efficiently rendering billions of triangles without pre-constructed acceleration structures, outperforming Vulkan for dense, opaque meshes.
Contribution
It presents a novel 3-stage rasterization pipeline optimized for small triangles, enabling high-speed rendering of massive triangle datasets on GPUs.
Findings
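CuRast renders models with hundreds of millions of triangles up to 2-5x faster than Vulkan for unique geometry, and up to 12x faster for instanced geometry.
It handles dense meshes of this size without acceleration structures built in advance.
Vulkan remains an order of magnitude faster for low-poly meshes.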
Abstract
Previous work shows that small triangles can be rasterized efficiently with compute shaders. Building on this insight, we explore how far this can be pushed for massive triangle datasets without the need to construct acceleration structures in advance. Method: A 3-stage rasterization pipeline first rasterizes small triangles directly in stage 1, using atomicMin to store the closest fragments. Larger triangles are forwarded to stages 2 and 3. Results: CuRast can render models with hundreds of millions of triangles up to 2-5x (unique) or up to 12x (instanced) faster than Vulkan. Vulkan remains an order of magnitude faster for low-poly meshes. Limitations: We currently focus on dense, opaque meshes that you would typically obtain from photogrammetry/3D reconstruction. Blending/Transparency is not yet supported, and scenes with thousands of low-poly meshes are not implemented…
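The abstract says stage 1 uses atomicMin to store the closest fragments. One common way to do this is to pack depth and a payload into a single 64-bit key, so that one atomic minimum resolves visibility per pixel. The sketch below emulates that idea on the CPU with `std::atomic`. The key layout (depth bits in the upper 32 bits, triangle id in the lower 32 bits) and the names `packFragment`, `atomicMinU64`, and `Framebuffer` are assumptions made for illustration and are not taken from the paper.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed key layout (not from the paper): depth bits high, triangle id low.
// For non-negative IEEE-754 floats, the raw bit pattern sorts in the same
// order as the value, so an unsigned integer min picks the smallest depth.
inline uint64_t packFragment(float depth, uint32_t triangleId) {
    assert(depth >= 0.0f);
    uint32_t depthBits;
    std::memcpy(&depthBits, &depth, sizeof(depthBits));
    return (static_cast<uint64_t>(depthBits) << 32) | triangleId;
}

// CPU stand-in for a 64-bit atomic minimum, built from compare-and-swap.
inline void atomicMinU64(std::atomic<uint64_t>& target, uint64_t value) {
    uint64_t current = target.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak reloads `current`; retry only while
    // our value is still smaller than what is stored.
    while (value < current &&
           !target.compare_exchange_weak(current, value,
                                         std::memory_order_relaxed)) {
    }
}

// Hypothetical framebuffer: one 64-bit key per pixel, cleared to "infinitely far".
struct Framebuffer {
    int width;
    int height;
    std::vector<std::atomic<uint64_t>> pixels;

    Framebuffer(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h) {
        for (auto& p : pixels) p.store(UINT64_MAX, std::memory_order_relaxed);
    }

    // Many threads may call this for the same pixel; the closest fragment wins.
    void writeFragment(int x, int y, float depth, uint32_t triangleId) {
        atomicMinU64(pixels[static_cast<std::size_t>(y) * width + x],
                     packFragment(depth, triangleId));
    }

    // The low 32 bits of the surviving key identify the visible triangle.
    // An untouched pixel returns UINT32_MAX.
    uint32_t closestTriangle(int x, int y) const {
        uint64_t key = pixels[static_cast<std::size_t>(y) * width + x].load(
            std::memory_order_relaxed);
        return static_cast<uint32_t>(key);
    }
};
```

In a CUDA kernel, the compare-and-swap loop would presumably be replaced by a single `atomicMin` call on an `unsigned long long` framebuffer entry. Whether CuRast uses this exact key layout is not stated in the abstract.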
