BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen, William Guicquero, Gilles Sicard

TL;DR
BILLNET is a novel binarized Conv3D-LSTM architecture designed for hardware-efficient video inference, combining factorized convolutions and logic-gated residuals to reduce memory and computation while maintaining high accuracy.
Contribution
The paper introduces BILLNET, a compact binarized Conv3D-LSTM model with a novel factorization and residual architecture for resource-constrained hardware.
Findings
Achieves high accuracy on the Jester dataset with extremely low memory and computational budgets
Uses binarized weights and activations, enabled by a MUX-OR-gated residual architecture
Employs a multi-stage training strategy to fully quantize the LSTM layers
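To make the efficiency argument for binarized weights and activations concrete, here is a minimal sketch of the general idea, not the authors' implementation. It assumes sign binarization to {-1, +1} (with zero mapped to +1 by convention) and the widely used XNOR-popcount identity for binary dot products. The function names `binarize`, `pack_bits`, and `xnor_popcount_dot` are hypothetical, and the page does not say which binarizer or bit-level kernel BILLNET uses.

```python
def binarize(values):
    """Map real values to {-1, +1} by sign; zero maps to +1 (assumed convention)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a list of +/-1 values into an integer: bit i is 1 when signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, b_bits, n):
    """Dot product of two packed +/-1 vectors of length n.

    XNOR marks the positions where the signs agree. Each agreement
    contributes +1 and each disagreement -1, so dot = 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    matches = bin(agree).count("1")
    return 2 * matches - n


activations = binarize([0.3, -1.2, 0.0, 2.5])
weights = binarize([-0.7, -0.1, 0.9, -0.4])

reference = sum(a * w for a, w in zip(activations, weights))
fast = xnor_popcount_dot(pack_bits(activations), pack_bits(weights), 4)
```

In this sketch each weight or activation occupies one bit instead of a 32-bit float, and the multiply-accumulate reduces to a bitwise operation followed by a bit count, which is why binarization is attractive for constrained hardware.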
Abstract
Long Short-Term Memory (LSTM) and 3D convolution (Conv3D) show impressive results for many video-based applications but require large memory and intensive computing. Motivated by recent works on hardware-algorithmic co-design towards efficient inference, we propose a compact binarized Conv3D-LSTM model architecture called BILLNET, compatible with highly resource-constrained hardware. Firstly, BILLNET factorizes the costly standard Conv3D into two pointwise convolutions with a grouped convolution in between. Secondly, BILLNET enables binarized weights and activations via a MUX-OR-gated residual architecture. Finally, to efficiently train BILLNET, we propose a multi-stage training strategy that enables full quantization of the LSTM layers. Results on the Jester dataset show that our method can obtain high accuracy with extremely low memory and computational budgets compared to existing Conv3D…
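The saving from the factorization described in the abstract (pointwise, grouped, pointwise) can be illustrated with a simple weight count. The sketch below is an assumption-laden illustration, not the paper's configuration: the channel widths, the 3x3x3 kernel, the group count of 8, and the helper names `conv3d_params` and `factorized_params` are all hypothetical, and biases are ignored.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), groups=1):
    """Weight count of a bias-free Conv3D layer with the given group count."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    kt, kh, kw = kernel
    return (c_in // groups) * c_out * kt * kh * kw


def factorized_params(c_in, c_mid, c_out, kernel=(3, 3, 3), groups=1):
    """Weight count of pointwise -> grouped Conv3D -> pointwise."""
    pointwise_in = conv3d_params(c_in, c_mid, kernel=(1, 1, 1))
    grouped = conv3d_params(c_mid, c_mid, kernel=kernel, groups=groups)
    pointwise_out = conv3d_params(c_mid, c_out, kernel=(1, 1, 1))
    return pointwise_in + grouped + pointwise_out


# Hypothetical layer sizes chosen only for illustration.
standard = conv3d_params(64, 64)                      # 64 * 64 * 27
factorized = factorized_params(64, 64, 64, groups=8)  # 4096 + 13824 + 4096
print(standard, factorized, round(standard / factorized, 2))
```

With these illustrative sizes the standard layer holds 110592 weights and the factorized stack holds 22016, roughly a fivefold reduction. The actual ratio in BILLNET depends on the channel widths and group counts chosen by the authors, which this page does not state.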
Taxonomy
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory · 3D Convolution · 1x1 Convolution · Convolution · Grouped Convolution
