CCSS: Hardware-Accelerated RTL Simulation with Fast Combinational Logic Computing and Sequential Logic Synchronization
Weigang Feng, Yijia Zhang, Zekun Wang, Zhengyang Wang, Yi Wang, Peijun Ma, Ningyi Xu

TL;DR
CCSS is a scalable hardware platform that significantly accelerates RTL simulation by optimizing combinational logic computation and sequential synchronization, reducing simulation time for complex chips.
Contribution
The paper introduces CCSS, a novel multi-core RTL simulation platform with specialized architecture and strategies for faster compilation and simulation, outperforming existing simulators.
Findings
Achieves up to 12.9x speedup over state-of-the-art simulators
Employs balanced DAG partitioning and efficient boolean cores
Uses low-latency NoC for synchronization
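The summary does not spell out how the balanced DAG partitioning works, so the following is only a generic sketch of the idea, not the authors' algorithm. It assumes the combinational logic is given as a DAG with a per-node cost estimate, levelizes it, and within each level assigns the heaviest nodes first to the least-loaded core. The names `levelize`, `balanced_partition`, `cost`, and `num_cores` are hypothetical and chosen for illustration.

```python
from collections import defaultdict, deque


def levelize(nodes, edges):
    """Give each node a level: one more than its deepest predecessor."""
    succ = defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    level = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indeg[n] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(nodes):
        raise ValueError("combinational loop: graph is not a DAG")
    return level


def balanced_partition(nodes, edges, cost, num_cores):
    """Greedy per-level balancing (an assumed heuristic, not from the paper)."""
    level = levelize(nodes, edges)
    by_level = defaultdict(list)
    for n in nodes:
        by_level[level[n]].append(n)
    assignment = {}
    for lvl in sorted(by_level):
        load = [0] * num_cores  # per-core work within this level
        # Heaviest node first; ties broken by name for determinism.
        for n in sorted(by_level[lvl], key=lambda n: (-cost[n], n)):
            core = load.index(min(load))  # least-loaded core so far
            assignment[n] = core
            load[core] += cost[n]
    return level, assignment


# Toy netlist: node names and costs are made up for illustration.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e"), ("a", "f")]
cost = {"a": 1, "b": 1, "c": 3, "d": 2, "e": 1, "f": 3}
level, assignment = balanced_partition(nodes, edges, cost, num_cores=2)
```

Nodes in the same level have no dependencies on each other, so cores can evaluate them in parallel and need to synchronize only between levels. A real partitioner would also weigh inter-core communication, which this sketch ignores.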
Abstract
As transistor counts in a single chip exceed tens of billions, the complexity of RTL-level simulation and verification has grown exponentially, often extending simulation campaigns to several months. In industry practice, RTL simulation is divided into two phases: functional debug and system validation. While system validation demands high simulation speed and is typically accelerated using FPGAs, functional debug relies on rapid compilation, which makes multi-core CPUs the primary choice. However, the limited simulation speed of CPUs has become a major bottleneck. To address this challenge, we propose CCSS, a scalable multi-core RTL simulation platform that achieves both fast compilation and high simulation throughput. CCSS accelerates combinational logic computation and sequential logic synchronization through specialized architecture and compilation strategies. It employs a balanced DAG…
