Warp-STAR: High-performance, Differentiable GPU-Accelerated Static Timing Analysis through Warp-oriented Parallel Orchestration

En-Ming Huang; Shih-Hao Hung

arXiv:2603.28381 · cs.DC · March 31, 2026

TL;DR

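Warp-STAR is a GPU-accelerated static timing analysis (STA) engine that orchestrates parallel computation at the warp level to eliminate intra-warp load imbalance. It reports a 2.4X speedup over the previous state-of-the-art GPU-based STA and a 1.7X speedup when integrated into a timing-driven global placement framework.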

Contribution

Warp-STAR introduces warp-oriented parallel orchestration, which removes the intra-warp load imbalance that irregular circuit graphs cause in existing GPU-based STA engines, and it outperforms the previous state of the art.

Findings

01  2.4X speedup over the previous state-of-the-art GPU-based STA

02  1.7X speedup over state-of-the-art frameworks when integrated into a timing-driven global placement framework

03  Effective differentiable gradient analysis with minimal overhead

Abstract

Static timing analysis (STA) is crucial for Electronic Design Automation (EDA) flows but remains a computational bottleneck. While existing GPU-based STA engines are faster than CPU-based ones, they suffer from inefficiencies, particularly intra-warp load imbalance caused by irregular circuit graphs. This paper introduces Warp-STAR, a novel GPU-accelerated STA engine that eliminates this imbalance by orchestrating parallel computations at the warp level. This approach achieves a 2.4X speedup over the previous state-of-the-art (SoTA) GPU-based STA. When integrated into a timing-driven global placement framework, Warp-STAR delivers a 1.7X speedup over SoTA frameworks. The method also proves effective for differentiable gradient analysis with minimal overhead.
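
To make the intra-warp load imbalance mentioned in the abstract concrete, here is a toy cost model in Python. It is not the paper's algorithm. It is a minimal sketch that assumes each pin's work is proportional to its fan-in arc count and that a warp has 32 lanes running in lockstep. The function names and the example fan-in list are hypothetical. It compares a thread-per-pin mapping, where a warp waits for its slowest lane, with an idealized warp-cooperative mapping, where the lanes split all arcs evenly.

```python
import math

WARP_SIZE = 32  # lanes per warp (assumption: typical NVIDIA warp width)


def thread_per_pin_steps(fanins, warp_size=WARP_SIZE):
    """Total warp-steps when each lane owns one pin and loops over its fan-in arcs.

    Lanes run in lockstep, so each warp costs as many steps as its busiest lane.
    """
    steps = 0
    for start in range(0, len(fanins), warp_size):
        steps += max(fanins[start:start + warp_size])
    return steps


def warp_cooperative_steps(fanins, warp_size=WARP_SIZE):
    """Total warp-steps when the lanes of a warp share the arcs of their pins evenly.

    Idealized: ignores the cost of the per-pin reduction across lanes.
    """
    steps = 0
    for start in range(0, len(fanins), warp_size):
        arcs = sum(fanins[start:start + warp_size])
        steps += math.ceil(arcs / warp_size)
    return steps


# Hypothetical graph level: 31 two-input pins plus one pin with 64 fan-in arcs.
fanins = [2] * 31 + [64]
baseline = thread_per_pin_steps(fanins)
balanced = warp_cooperative_steps(fanins)
print(baseline, balanced)
```

In this contrived case a single high-fan-in pin dominates the thread-per-pin cost, while the cooperative mapping spreads the same arcs over all lanes. Real engines also pay for cross-lane reductions and memory access patterns, which this model leaves out.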
