HGQ-LUT: Fast LUT-Aware Training and Efficient Architectures for DNN Inference

Chang Sun; Zhiqiang Que; Bakhtiar Zadeh; Qibin Liu; Kevin H. Alvarez; Wayne Luk; Maria Spiropulu

arXiv:2604.22293·cs.AR·April 27, 2026

TL;DR

HGQ-LUT is a LUT-aware training (LAT) approach for DNNs that accelerates training by over 100 times on modern GPUs while achieving state-of-the-art hardware efficiency, making FPGA deployment of LUT-based networks more practical.

Contribution

It introduces a new LUT-aware training method built on accelerator-efficient LUT-Dense and LUT-Conv layers, with automated accuracy-resource trade-off exploration and integration into open-source tools for real-world FPGA deployment.

Findings

01. Achieves over 100x faster training on GPUs.

02. Provides state-of-the-art hardware efficiency for LUT-based DNNs.

03. Enables automated design and verification of hybrid architectures.

Abstract

Lookup-table (LUT) based neural networks can deliver ultra-low latency and excellent hardware efficiency on FPGAs by mapping arithmetic operations directly onto the logic primitives. However, state-of-the-art LUT-aware training (LAT) approaches remain difficult to use in practice: they are often orders of magnitude slower to train than conventional networks, require non-trivial manual tuning for hardware efficiency, and lack an end-to-end workflow. This work presents HGQ-LUT, integrated in https://github.com/calad0i/HGQ2, a new LAT approach that achieves state-of-the-art hardware efficiency while accelerating training by over 100 times on modern GPUs. HGQ-LUT introduces LUT-Dense and LUT-Conv layers that are implemented with regular, accelerator-efficient tensor operations during training, which are then compiled into logic LUTs for hardware. By combining these layers with fine-grained,…
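
The general idea of turning a small quantized function into a lookup table can be sketched as follows. This is an illustration only, not the HGQ-LUT method or the HGQ2 API: the names `tiny_unit`, `compile_to_lut`, and `lut_lookup`, the bit widths, and the weights are all hypothetical. It assumes a unit with two 2-bit inputs and a 3-bit saturating output, enumerates every input combination into a table, and then answers queries by addressing that table instead of doing arithmetic.

```python
# Illustrative sketch only: NOT the HGQ-LUT implementation or the HGQ2 API.
# All names, bit widths, and weights below are hypothetical assumptions.
from itertools import product

IN_BITS = 2    # assumed bits per input code
OUT_BITS = 3   # assumed bits per output code
N_INPUTS = 2   # assumed fan-in of the unit


def tiny_unit(codes, weights, bias):
    """Small integer function of the input codes: weighted sum, ReLU, saturation."""
    acc = sum(w * c for w, c in zip(weights, codes)) + bias
    acc = max(acc, 0)                       # ReLU
    return min(acc, (1 << OUT_BITS) - 1)    # saturate to OUT_BITS


def compile_to_lut(weights, bias):
    """Enumerate every input combination once and store the result in a table."""
    return [
        tiny_unit(codes, weights, bias)
        for codes in product(range(1 << IN_BITS), repeat=N_INPUTS)
    ]


def lut_lookup(table, codes):
    """Evaluate the unit by table addressing: concatenate input bits into an address."""
    addr = 0
    for c in codes:
        addr = (addr << IN_BITS) | c
    return table[addr]


table = compile_to_lut([1, -2], 3)
print(len(table))                 # 2 ** (IN_BITS * N_INPUTS) entries
print(lut_lookup(table, (2, 1)))  # same value as tiny_unit((2, 1), [1, -2], 3)
```

The table size grows as 2 to the power of the total input bits, which is why low bit widths and small fan-in matter for this style of implementation.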

Code & Models

Repositories

calad0i/HGQ2 (GitHub): https://github.com/calad0i/HGQ2
