BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference

Van Thien Nguyen; William Guicquero; Gilles Sicard

arXiv:2501.14495 · cs.CV · January 27, 2025

TL;DR

BILLNET is a novel binarized Conv3D-LSTM architecture designed for hardware-efficient video inference, combining factorized convolutions and logic-gated residuals to reduce memory and computation while maintaining high accuracy.

Contribution

The paper introduces BILLNET, a compact binarized Conv3D-LSTM model with a novel factorization and residual architecture for resource-constrained hardware.

Findings

01. Achieves high accuracy on the Jester dataset with low memory usage

02. Uses binarized weights and activations for efficiency

03. Employs multi-stage training for full quantization of LSTM layers
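
To illustrate why binarized weights and activations are cheap in hardware, here is a minimal sketch of a technique common in binarized networks generally: a dot product of two {-1, +1} vectors computed with XNOR and popcount. The source does not confirm that BILLNET computes its layers this way; the helper names `binarize`, `pack_bits`, and `binary_dot` are hypothetical, and the sign convention (0 maps to +1) is an assumption.

```python
def binarize(values):
    """Map real values to {-1, +1} with a sign function (assumption: 0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an int; bit i is set when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1, +1} vectors of length n via XNOR + popcount."""
    mask = (1 << n) - 1
    agree = ~(a_word ^ b_word) & mask   # XNOR: 1 where the two signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


# Illustrative values only, not taken from the paper.
weights = [0.7, -1.2, 0.1, -0.4]
activations = [-0.3, -0.8, 2.0, 0.5]

w_bin = binarize(weights)
a_bin = binarize(activations)

result = binary_dot(pack_bits(w_bin), pack_bits(a_bin), len(w_bin))
reference = sum(w * a for w, a in zip(w_bin, a_bin))
print(result, reference)
```

Both values agree, which shows that the multiply-accumulate over {-1, +1} operands can be replaced by bitwise logic and a bit count.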

Abstract

Long Short-Term Memory (LSTM) and 3D convolution (Conv3D) show impressive results in many video-based applications but require large memory and intensive computation. Motivated by recent work on hardware-algorithm co-design for efficient inference, we propose a compact binarized Conv3D-LSTM model architecture called BILLNET, compatible with highly resource-constrained hardware. First, BILLNET factorizes the costly standard Conv3D into two pointwise convolutions with a grouped convolution in between. Second, BILLNET enables binarized weights and activations via a MUX-OR-gated residual architecture. Finally, to train BILLNET efficiently, we propose a multi-stage training strategy that enables full quantization of the LSTM layers. Results on the Jester dataset show that our method can obtain high accuracy with extremely low memory and computational budgets compared to existing Conv3D…
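
The saving from the factorization described in the abstract can be made concrete with a weight count. The following is a rough sketch, not the authors' configuration: the channel widths (64, 32), the group count (8), the 3x3x3 kernel, and the function names `conv3d_params` and `factorized_params` are all assumptions chosen for illustration, and biases are ignored.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), groups=1):
    """Weight count of a 3D convolution (biases ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel
    return (c_in // groups) * c_out * kt * kh * kw


def factorized_params(c_in, c_mid, c_out, kernel=(3, 3, 3), groups=1):
    """Pointwise -> grouped -> pointwise replacement for one standard Conv3D."""
    pw_in = conv3d_params(c_in, c_mid, (1, 1, 1))          # 1x1x1 channel mixing
    grouped = conv3d_params(c_mid, c_mid, kernel, groups)  # spatio-temporal filtering
    pw_out = conv3d_params(c_mid, c_out, (1, 1, 1))        # 1x1x1 channel mixing
    return pw_in + grouped + pw_out


# Hypothetical layer sizes, not taken from the paper.
standard = conv3d_params(64, 64)
factorized = factorized_params(64, 32, 64, groups=8)

print(standard)    # 110592
print(factorized)  # 7552
```

With binarized weights each parameter costs one bit, so under these assumed sizes the layer shrinks from 110592 bits to 7552 bits, roughly a 14.6x reduction.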

Taxonomy

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · 3D Convolution · 1x1 Convolution · Convolution · Grouped Convolution