Generic and Universal Parallel Matrix Summation with a Flexible Compression Goal for Xilinx FPGAs
Thomas B. Preußer

TL;DR
This paper presents a generic bit matrix compressor for Xilinx FPGAs with a flexible compression goal. It targets the multi-operand additions behind operations such as multiplication and dot product computation, and it does not require a generator tool.
Contribution
It contributes FPGA-oriented metrics for evaluating elementary parallel bit counters, a systematic analysis and partial decomposition of previously proposed counters, and a fully implemented construction heuristic with a flexible compression goal.
Findings
Provides FPGA-oriented metrics for elementary bit counters
Offers a systematic analysis of existing counters
Implements a heuristic for flexible compression matching device capabilities
Abstract
Bit matrix compression is a highly relevant operation in computer arithmetic. Essentially a multi-operand addition, it is the key operation behind fast multiplication and many higher-level operations such as multiply-accumulate, the computation of the dot product or the implementation of FIR filters. Compressor implementations have been constantly evolving for greater efficiency both in general and in the context of concrete applications or specific implementation technologies. This paper builds on this history and describes a generic implementation of a bit matrix compressor for Xilinx FPGAs, which does not require a generator tool. It contributes FPGA-oriented metrics for the evaluation of elementary parallel bit counters, a systematic analysis and partial decomposition of previously proposed counters and a fully implemented construction heuristic with a flexible…
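To make the notion of bit matrix compression concrete, the following Python sketch sums several operands by arranging their bits in weighted columns and repeatedly applying (3:2) counters (full adders) until no column holds more than two bits, after which a single ordinary addition finishes the job. This is a textbook-style illustration only, not the construction heuristic of the paper: the function names, the exclusive use of (3:2) counters, and the fixed target height of two rows are all assumptions made for the example.

```python
def to_columns(operands, width):
    """Arrange operands as a bit matrix: columns[i] holds all bits of weight 2**i."""
    return [[(x >> i) & 1 for x in operands] for i in range(width)]


def compress_step(columns):
    """One compression stage using only (3:2) counters (assumption of this sketch)."""
    out = [[] for _ in range(len(columns) + 1)]
    for i, col in enumerate(columns):
        bits = list(col)
        while len(bits) >= 3:
            total = bits.pop() + bits.pop() + bits.pop()
            out[i].append(total & 1)       # sum bit keeps weight 2**i
            out[i + 1].append(total >> 1)  # carry bit moves to weight 2**(i+1)
        out[i].extend(bits)                # fewer than three bits pass through
    return out


def compress(columns):
    """Apply stages until every column holds at most two bits."""
    while max(len(col) for col in columns) > 2:
        columns = compress_step(columns)
    return columns


def matrix_sum(operands, width):
    """Sum the operands: compress to two rows, then do one carry-propagate add."""
    cols = compress(to_columns(operands, width))
    row0 = sum(col[0] << i for i, col in enumerate(cols) if len(col) > 0)
    row1 = sum(col[1] << i for i, col in enumerate(cols) if len(col) > 1)
    return row0 + row1
```

For example, `matrix_sum([5, 7, 9, 3], 4)` returns 24. Each counter replaces three bits of one weight by a sum bit and a carry bit of the same total value, so the value of the matrix is preserved while its height shrinks.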
