TCIM: Triangle Counting Acceleration With Processing-In-MRAM Architecture
Xueyan Wang, Jianlei Yang, Yinglin Zhao, Yingjie Qi, Meichen Liu, Xingzhou Cheng, Xiaotao Jia, Xiaoming Chen, Gang Qu, Weisheng Zhao

TL;DR
This paper introduces a novel in-memory triangle counting accelerator using processing-in-MRAM, significantly reducing data transfer bottlenecks and achieving substantial speedups and energy efficiency improvements over traditional GPU and FPGA solutions.
Contribution
The paper presents a new in-memory TC acceleration method using bitwise logic in STT-MRAM, with optimized data mapping techniques for enhanced performance and energy efficiency.
Findings
Achieves 9x speedup over GPU and 23.4x over FPGA
Reduces computation by 99.99% and memory writes by 72%
Improves energy efficiency by 20.6x over FPGA
Abstract
Triangle counting (TC) is a fundamental problem in graph analysis and has found numerous applications, which motivates many TC acceleration solutions on traditional computing platforms such as GPUs and FPGAs. However, these approaches suffer from the bandwidth bottleneck because TC calculation involves a large number of data transfers. In this paper, we propose to overcome this challenge by designing a TC accelerator utilizing the emerging processing-in-MRAM (PIM) architecture. The true innovation behind our approach is a novel method to perform TC with bitwise logic operations (such as AND), instead of traditional approaches such as matrix computations. This enables efficient in-memory implementations of TC computation, which we demonstrate in this paper with computational Spin-Transfer Torque Magnetic RAM (STT-MRAM) arrays. Furthermore, we develop customized graph…
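To illustrate how triangle counting can reduce to bitwise AND operations, here is a minimal software sketch. It is not the paper's STT-MRAM implementation or its data mapping. It only shows the general idea under stated assumptions: each adjacency-matrix row is held as an integer bitmask, and for every edge (u, v) the set bits in `row[u] AND row[v]` are the common neighbors that close a triangle. The function name `count_triangles` and the edge-list input format are hypothetical choices made for this example.

```python
def count_triangles(num_vertices, edges):
    """Count triangles in an undirected graph using bitwise AND on row bitmasks.

    Illustrative sketch only: rows of the symmetric adjacency matrix are
    stored as Python integers, with bit v of rows[u] set when u and v are
    adjacent. This mirrors the AND-plus-bit-count idea in software and makes
    no claim about the accelerator's actual dataflow.
    """
    rows = [0] * num_vertices
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops, which cannot form triangles
        rows[u] |= 1 << v
        rows[v] |= 1 << u

    total = 0
    for u in range(num_vertices):
        # Visit each undirected edge once by looking only at neighbors v > u.
        higher = rows[u] >> (u + 1)
        v = u + 1
        while higher:
            if higher & 1:
                # Common neighbors of u and v: one AND, then a bit count.
                common = rows[u] & rows[v]
                total += bin(common).count("1")
            higher >>= 1
            v += 1

    # Every triangle is found once per edge, that is, three times in total.
    return total // 3


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    print(count_triangles(4, [(0, 1), (1, 2), (0, 2), (2, 3)]))
```

The inner work per edge is one AND and one bit count, with no multiplications, which is the kind of operation that computational memory arrays can evaluate where the data is stored.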
Taxonomy
Topics: Ferroelectric and Negative Capacitance Devices · Parallel Computing and Optimization Techniques · Advanced Memory and Neural Computing
