Efficient FPGA Implementation of Time-Domain Popcount for Low-Complexity Machine Learning
Shengyu Duan, Marcos L. L. Sartori, Rishad Shafik, Alex Yakovlev, Emre Ozer

TL;DR
This paper introduces a novel FPGA-based time-domain popcount implementation that accelerates low-complexity machine learning algorithms such as the Tsetlin Machine, reducing latency, power, and resource use.
Contribution
It presents a new time-domain approach for popcount on FPGA, optimizing performance for Tsetlin Machine inference with asynchronous architecture compatibility.
Findings
Up to 38% latency reduction
43.1% power savings
15% resource utilization reduction
Abstract
Population count (popcount) is a crucial operation for many low-complexity machine learning (ML) algorithms, including the Tsetlin Machine (TM), a promising new ML method that is particularly well suited to classification tasks. The inference mechanism in TM consists of propositional logic-based structures within each class, followed by a majority voting scheme that makes the classification decision. In TM, the voters are the outputs of Boolean clauses. The voting mechanism comprises two operations: a popcount for each class, and an argmax operation that determines the class with the maximum vote. While TMs offer a lightweight ML alternative, their performance is often limited by the high computational cost of the popcount and of the comparisons required to produce the argmax result. In this paper, we propose an innovative approach to accelerate and optimize these operations by performing…
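The voting step described in the abstract (a popcount per class followed by an argmax) can be illustrated with a minimal software sketch. This is only a functional reference model, not the paper's time-domain FPGA design; the function names `popcount` and `classify` and the example clause outputs are hypothetical, and the sketch assumes each class simply counts its clauses that output 1.

```python
from typing import Sequence


def popcount(bits: Sequence[int]) -> int:
    """Count how many Boolean clause outputs are 1."""
    return sum(1 for b in bits if b)


def classify(clause_outputs: Sequence[Sequence[int]]) -> int:
    """Return the index of the class with the most votes.

    clause_outputs[c] holds the Boolean clause outputs (the voters)
    for class c. Ties resolve to the lowest class index, which is an
    assumption of this sketch rather than something the source states.
    """
    votes = [popcount(row) for row in clause_outputs]
    # argmax: max() returns the first index that reaches the maximum vote
    return max(range(len(votes)), key=votes.__getitem__)


# Hypothetical clause outputs for three classes, four clauses each.
example = [
    [1, 0, 1, 0],  # class 0: 2 votes
    [1, 1, 1, 0],  # class 1: 3 votes
    [0, 0, 1, 0],  # class 2: 1 vote
]
predicted = classify(example)
```

In this model both operations scale with the number of clauses and classes, which is the cost the abstract identifies as the bottleneck.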
Taxonomy
Topics: Image and Signal Denoising Methods
