Triangle Counting Accelerations: From Algorithm to In-Memory Computing Architecture

Xueyan Wang, Jianlei Yang, Yinglin Zhao, Xiaotao Jia, Rong Yin, Xuhang Chen, Gang Qu, and Weisheng Zhao

arXiv:2112.00471 · cs.AR · December 2, 2021

TL;DR

This paper presents a novel in-memory computing architecture for triangle counting in graphs, reformulating the problem with bitwise operations and leveraging STT-MRAM PIM technology to significantly outperform traditional GPU and FPGA accelerators in speed and energy efficiency.

Contribution

It introduces a co-optimized algorithm-architecture approach for triangle counting using processing-in-memory with STT-MRAM, achieving substantial performance and energy efficiency improvements.

Findings

1. Outperforms GPU by 12.2x in speed

2. Outperforms FPGA by 31.8x in speed

3. Achieves 34x better energy efficiency than FPGA

Abstract

Triangles are the basic substructure of networks, and triangle counting (TC) is a fundamental graph computing problem in numerous fields such as social network analysis. Nevertheless, like other graph computing problems, TC has a high memory-to-computation ratio and a random memory access pattern, so it involves a large amount of data transfer and suffers from the bandwidth bottleneck of the traditional von Neumann architecture. To overcome this challenge, we propose to accelerate TC with the emerging processing-in-memory (PIM) architecture through algorithm-architecture co-optimization. To enable efficient in-memory implementations, we reformulate TC with bitwise logic operations (such as AND) and develop customized graph compression and mapping techniques for efficient data flow management. With the emerging computational Spin-Transfer Torque…
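To make the "bitwise logic operations (such as AND)" idea concrete, here is a minimal software sketch of triangle counting with AND and popcount. It is not the paper's in-memory implementation and is not necessarily the exact formulation the authors use: the function name `count_triangles`, the integer-bitmask row representation, and the low-to-high edge orientation are all assumptions made for illustration.

```python
def count_triangles(num_vertices, edges):
    """Count triangles in an undirected graph using bitwise AND and popcount.

    Illustrative sketch only (names and representation are assumptions).
    rows[i] is an integer bitmask whose bit j is set when edge (i, j)
    exists and i < j, i.e. each edge is stored once, oriented low -> high.
    """
    rows = [0] * num_vertices
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        lo, hi = min(u, v), max(u, v)
        rows[lo] |= 1 << hi  # OR is idempotent, so duplicate edges are harmless

    total = 0
    for i in range(num_vertices):
        mask = rows[i]
        while mask:
            low_bit = mask & -mask          # isolate the lowest set bit
            j = low_bit.bit_length() - 1    # index of neighbor j > i
            # Common higher-numbered neighbors of i and j each close one triangle.
            total += bin(rows[i] & rows[j]).count("1")
            mask ^= low_bit                 # clear that bit and continue
    return total


if __name__ == "__main__":
    # A single triangle (0, 1, 2) with a pendant edge (2, 3).
    print(count_triangles(4, [(0, 1), (1, 2), (0, 2), (2, 3)]))
```

With this orientation, a triangle i < j < k is found only while processing edge (i, j), where the AND of the two rows leaves bit k set, so each triangle is counted exactly once. The per-edge work is one AND followed by a bit count, which is the kind of operation the abstract proposes to execute inside the memory array.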
