HBVLA: Pushing 1-Bit Post-Training Quantization for Vision-Language-Action Models

Xin Yan; Zhenglin Wan; Feiyang Ye; Xingrui Yu; Hangyu Du; Yang You; Ivor Tsang

arXiv:2602.13710·cs.LG·February 17, 2026

TL;DR

This paper introduces HBVLA, a novel binarization framework tailored for vision-language-action models, significantly reducing model size and computation while maintaining high performance for deployment on resource-limited platforms.

Contribution

We propose a policy-aware Hessian-based weight importance measure and a sparse orthogonal transform for effective 1-bit quantization of VLA models, with the aim of narrowing the distribution gap between binarized and full-precision weights.
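As a rough illustration of what a Hessian-based weight importance measure can look like, here is a minimal sketch of a generic layer-wise score of the kind used in earlier post-training quantization work. It is not the paper's policy-aware variant, whose details are not given on this page. The function names `hessian_saliency` and `salient_columns`, the Hessian proxy `H = 2 XᵀX`, and the damping factor are all assumptions made for the example.

```python
import numpy as np

def hessian_saliency(W, X, damp=0.01):
    """Generic layer-wise saliency s_ij = w_ij^2 / ([H^-1]_jj)^2.

    W: (out_features, in_features) weight matrix.
    X: (n_samples, in_features) calibration inputs to the layer.
    H = 2 X^T X is the usual proxy Hessian of the layer-wise
    reconstruction loss (an assumption, not the paper's definition).
    """
    H = 2.0 * X.T @ X
    # Damping keeps H well conditioned before inversion.
    H = H + damp * np.mean(np.diag(H)) * np.eye(H.shape[0])
    hinv_diag = np.diag(np.linalg.inv(H))
    return W ** 2 / hinv_diag[None, :] ** 2

def salient_columns(W, X, k):
    """Indices of the k input columns with the largest total saliency."""
    column_score = hessian_saliency(W, X).sum(axis=0)
    return np.argsort(column_score)[-k:]
```

In a scheme like this, the selected columns would be treated as salient and quantized more carefully than the rest. How HBVLA makes the score policy-aware is left to the paper itself.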

Findings

01 Quantized models retain over 92% of full-precision performance.

02 HBVLA outperforms existing binarization methods in accuracy.

03 HBVLA demonstrates robust deployment on real-world robotic platforms.

Abstract

Vision-Language-Action (VLA) models enable instruction-following embodied control, but their large compute and memory footprints hinder deployment on resource-constrained robots and edge platforms. While reducing weights to 1-bit precision through binarization can greatly improve efficiency, existing methods fail to narrow the distribution gap between binarized and full-precision weights, causing quantization errors to accumulate under long-horizon closed-loop execution and severely degrade actions. To fill this gap, we propose HBVLA, a VLA-tailored binarization framework. First, we use a policy-aware enhanced Hessian to identify weights that are truly critical for action generation. Then, we employ a sparse orthogonal transform for non-salient weights to induce a low-entropy intermediate state. Finally, we quantize both salient and non-salient weights in the Haar domain with group-wise…
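To make the last step more concrete, the following is a minimal sketch of binarizing a weight matrix in the Haar domain: apply a one-level orthonormal Haar transform along the input dimension, binarize the coefficients group by group, then transform back. The abstract is truncated after "group-wise", so the per-group scale used here (the mean absolute value of each group) is an assumption, as are the function names, the single transform level, and the group size. The saliency split and the sparse orthogonal transform are omitted.

```python
import numpy as np

def haar_forward(W):
    """One-level orthonormal Haar transform along the last axis (even length)."""
    avg = (W[..., 0::2] + W[..., 1::2]) / np.sqrt(2.0)
    diff = (W[..., 0::2] - W[..., 1::2]) / np.sqrt(2.0)
    return np.concatenate([avg, diff], axis=-1)

def haar_inverse(C):
    """Inverse of haar_forward."""
    half = C.shape[-1] // 2
    avg, diff = C[..., :half], C[..., half:]
    W = np.empty_like(C)
    W[..., 0::2] = (avg + diff) / np.sqrt(2.0)
    W[..., 1::2] = (avg - diff) / np.sqrt(2.0)
    return W

def binarize_groupwise(C, group_size):
    """Replace each group of coefficients by alpha * sign(c).

    alpha is the group's mean absolute value (assumed scaling rule).
    The input dimension must be divisible by group_size.
    """
    out_dim, in_dim = C.shape
    groups = C.reshape(out_dim, in_dim // group_size, group_size)
    alpha = np.abs(groups).mean(axis=-1, keepdims=True)
    signs = np.where(groups >= 0, 1.0, -1.0)
    return (alpha * signs).reshape(out_dim, in_dim)

def haar_binarize(W, group_size=4):
    """Binarize W in the Haar domain and map the result back to weight space."""
    coeffs = haar_forward(W)
    return haar_inverse(binarize_groupwise(coeffs, group_size))
```

Because the transform is orthonormal, the reconstruction error in weight space has the same Frobenius norm as the quantization error on the coefficients, which is one reason a transform domain with more concentrated coefficients can help.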

Taxonomy

Topics: Advanced Neural Network Applications · Multimodal Machine Learning Applications · Advanced Memory and Neural Computing