Hardware-Efficient Accurate 4-bit Multiplier for Xilinx 7 Series FPGAs
Misaki Kida, Shimpei Sato

TL;DR
This paper introduces a hardware-efficient 4-bit multiplier design for Xilinx 7-series FPGAs that reduces resource usage and delay, optimizing performance for low-bitwidth parallel operations in IoT and edge computing.
Contribution
It presents a 4-bit multiplier design that uses only 11 LUTs and two CARRY4 blocks, one LUT fewer than the prior 12-LUT design, while also shortening the critical path.
Findings
Uses 11 LUTs and 2 CARRY4 blocks
Achieves a critical path delay of 2.750 ns
Reduces the LUT count by one compared with the prior 12-LUT design
Abstract
As IoT and edge inference proliferate, there is a growing need to simultaneously optimize area and delay in lookup-table (LUT)-based multipliers that implement large numbers of low-bitwidth operations in parallel. This paper proposes a hardware-efficient accurate 4-bit multiplier design for AMD Xilinx 7-series FPGAs using only 11 LUTs and two CARRY4 blocks. By reorganizing the logic functions mapped to the LUTs, the proposed method reduces the LUT count by one compared with the prior 12-LUT design while also shortening the critical path. Evaluation confirms that the circuit attains minimal resource usage and a critical-path delay of 2.750 ns.
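To make the building blocks concrete, the following is a minimal behavioral sketch in Python of a 4-bit multiplier built from partial products and a CARRY4-style carry chain. It is not the authors' 11-LUT mapping, which the abstract does not spell out: the function names (`carry4`, `add8`, `mul4`), the unsigned interpretation of the operands, and the three-adder structure are all assumptions made for illustration, and the sketch uses more carry-chain stages than the proposed circuit. It serves only as a reference model against which an exact ("accurate") design would have to agree on all 256 input pairs.

```python
def carry4(s, di, ci=0):
    """Behavioral model of one 4-bit carry-chain block.

    s  : 4 select (propagate) bits, normally produced by LUTs
    di : 4 data bits fed to the carry multiplexers
    ci : carry-in
    Per bit: sum = s XOR carry; next carry = carry if s else di.
    Returns (4-bit sum, carry-out).
    """
    out, c = 0, ci
    for i in range(4):
        s_i = (s >> i) & 1
        d_i = (di >> i) & 1
        out |= (s_i ^ c) << i
        c = c if s_i else d_i
    return out, c


def add8(x, y):
    """8-bit adder from two chained carry4 models (illustrative only)."""
    s = x ^ y  # propagate bits, as a LUT would compute them
    lo, c = carry4(s & 0xF, x & 0xF, 0)
    hi, c = carry4((s >> 4) & 0xF, (x >> 4) & 0xF, c)
    return lo | (hi << 4), c


def mul4(a, b):
    """Unsigned 4x4 multiply as a sum of four shifted partial products."""
    pp = [(a if (b >> j) & 1 else 0) << j for j in range(4)]
    s01, _ = add8(pp[0], pp[1])
    s23, _ = add8(pp[2], pp[3])
    product, _ = add8(s01, s23)
    return product


if __name__ == "__main__":
    # Exhaustive check over all 256 input pairs.
    ok = all(mul4(a, b) == a * b for a in range(16) for b in range(16))
    print("all 256 products exact:", ok)
```

A hardware-efficient design would fold the partial-product generation and part of the addition into the LUTs that drive the carry chain, which is where the reduction to 11 LUTs and two CARRY4 blocks would come from; the sketch above does not attempt that optimization.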
Taxonomy
Topics: Low-power high-performance VLSI design · Embedded Systems Design Techniques · VLSI and FPGA Design Techniques
