Culsans: An Efficient Snoop-based Coherency Unit for the CVA6 Open Source RISC-V application processor
Riccardo Tedeschi, Luca Valente, Gianmarco Ottavi, Enrico Zelioli, Nils Wistoff, Massimiliano Giacometti, Abdul Basit Sajjad, Luca Benini, Davide Rossi

TL;DR
This paper introduces Culsans, an open-source, efficient snoop-based cache coherency unit for CVA6 RISC-V cores, improving performance and area efficiency in multi-core embedded systems.
Contribution
It presents a lightweight, open-source cache coherency implementation using the MOESI protocol for CVA6 cores, optimized for embedded applications.
Findings
Up to 32.87% performance improvement in dual-core setups
Average 15.8% performance gain over OpenPiton
Cache Coherency Unit occupies only 1.6% of system area
Abstract
Symmetric Multi-Processing (SMP) based on cache coherency is crucial for high-end embedded systems such as automotive applications. RISC-V is gaining traction, and open-source hardware (OSH) platforms offer solutions to issues such as IP costs and vendor dependency. Existing multi-core cache-coherent RISC-V platforms are complex and inefficient for small embedded core clusters. We propose an open-source SystemVerilog implementation of a lightweight snoop-based cache-coherent cluster of Linux-capable CVA6 cores. Our design uses the MOESI protocol via Arm's AMBA ACE protocol. Evaluated with the Splash-3 benchmarks, our solution is up to 32.87% faster in a dual-core setup and shows an average improvement of 15.8% over OpenPiton. Synthesized in GF 22nm FDSOI technology, the Cache Coherency Unit occupies only 1.6% of the system area.
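For readers unfamiliar with MOESI, the following is a minimal Python sketch of the textbook per-cache-line state machine in a snooping system. It is not the Culsans SystemVerilog implementation: the function names are hypothetical, the transitions are the generic textbook ones, and it is assumed for simplicity that only caches holding a dirty line (Modified or Owned) supply data on a snoop. In ACE terminology these five states correspond to UniqueDirty, SharedDirty, UniqueClean, SharedClean, and Invalid.

```python
from enum import Enum


class State(Enum):
    M = "Modified"   # dirty, only copy
    O = "Owned"      # dirty, other caches may hold shared copies
    E = "Exclusive"  # clean, only copy
    S = "Shared"     # clean (from this cache's view), may be replicated
    I = "Invalid"    # no valid copy


def on_local_read(state, others_have_copy):
    """State after this core reads the line."""
    if state is State.I:
        # A miss fills the line as Shared or Exclusive depending on
        # whether any other cache reported holding it.
        return State.S if others_have_copy else State.E
    return state  # hits do not change state


def on_local_write(state):
    """State after this core writes the line.

    Other copies are assumed to be invalidated first if needed.
    """
    return State.M


def on_snoop_read(state):
    """Another core reads the line. Returns (new_state, supplies_data)."""
    if state in (State.M, State.O):
        # The dirty line is shared without writing back to memory.
        return State.O, True
    if state is State.E:
        return State.S, False
    return state, False


def on_snoop_invalidate(state):
    """Another core wants to write. Returns (new_state, supplies_data)."""
    return State.I, state in (State.M, State.O)


# Hypothetical two-core walk-through on a single cache line.
c0 = on_local_read(State.I, others_have_copy=False)  # core 0 reads alone
c0 = on_local_write(c0)                              # core 0 writes
c0, supplied = on_snoop_read(c0)                     # core 1 reads
c1 = on_local_read(State.I, others_have_copy=True)
print(c0.value, c1.value, supplied)
```

The Owned state is what separates MOESI from MESI: a dirty line can be handed directly to another cache while the write-back to memory is deferred.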
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Distributed and Parallel Computing Systems
