LQA: A Lightweight Quantized-Adaptive Framework for Vision-Language Models on the Edge
Xin Wang, Hong Jia, Hualin Zhou, Sheng Guang Wang, Yu Zhang, Ting Dang, Tao Gu

TL;DR
LQA is a lightweight, quantized framework that enables efficient and robust deployment of vision-language models on edge devices by combining modality-aware quantization with gradient-free test-time adaptation.
Contribution
LQA introduces Selective Hybrid Quantization (SHQ) and a quantized, gradient-free adaptation mechanism, making VLMs more resource-efficient and more robust to distribution shifts on edge hardware.
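To make the memory argument behind mixed-precision quantization concrete, here is a minimal sketch of uniform symmetric quantization applied at a different bit-width per module. It is an illustration only, not the paper's SHQ procedure: the module names, the 8-bit/4-bit split, and the helper functions are all assumptions, and per-tensor scale storage is ignored.

```python
import numpy as np

FULL_PRECISION_BITS = 32  # float32 baseline


def quantize_symmetric(weights, bits):
    """Uniform symmetric quantization: return integer codes and one per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int32)
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real-valued weights."""
    return codes * scale


def storage_bits(modules, bit_plan):
    """Count weight storage only; the per-tensor scales are ignored."""
    return sum(w.size * bit_plan[name] for name, w in modules.items())


rng = np.random.default_rng(0)
# Hypothetical stand-ins for the two modalities of a VLM.
modules = {
    "vision_encoder": rng.normal(size=(64, 64)),
    "text_encoder": rng.normal(size=(64, 64)),
}
# Assumed bit assignment, chosen only to show a hybrid plan.
bit_plan = {"vision_encoder": 8, "text_encoder": 4}

quantized = {name: quantize_symmetric(w, bit_plan[name]) for name, w in modules.items()}
full_bits = storage_bits(modules, {name: FULL_PRECISION_BITS for name in modules})
hybrid_bits = storage_bits(modules, bit_plan)
compression_ratio = full_bits / hybrid_bits
```

With two equally sized modules at 8 and 4 bits, weight storage shrinks by a factor of 16/3 (about 5.3) relative to float32, and the reconstruction error of each weight stays within half a quantization step.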
Findings
Improves adaptation performance by 4.5% across datasets.
Reduces memory usage by up to 19.9 times compared to full-precision models.
Outperforms gradient-based TTA methods in resource-constrained environments.
Abstract
Deploying Vision-Language Models (VLMs) on edge devices is challenged by resource constraints and performance degradation under distribution shifts. While test-time adaptation (TTA) can counteract such shifts, existing methods are too resource-intensive for on-device deployment. To address this challenge, we propose LQA, a lightweight, quantized-adaptive framework for VLMs that combines a modality-aware quantization strategy with gradient-free test-time adaptation. We introduce Selective Hybrid Quantization (SHQ) and a quantized, gradient-free adaptation mechanism to enable robust and efficient VLM deployment on resource-constrained hardware. Experiments across both synthetic and real-world distribution shifts show that LQA improves overall adaptation performance by 4.5%, uses less memory than full-precision models, and significantly outperforms gradient-based TTA methods, achieving up…
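For readers unfamiliar with the term, "gradient-free" adaptation means the model is adjusted at test time using forward passes only, with no backpropagation. The sketch below shows one generic technique of that kind, confidence-gated prototype updating; it is a stand-in chosen for illustration, not the adaptation mechanism used by LQA. The function names, momentum, temperature, and threshold values are all assumptions.

```python
import numpy as np


def l2_normalize(x):
    """Scale a vector to unit length."""
    return x / np.linalg.norm(x)


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def adapt_step(prototypes, feature, momentum=0.9, temperature=10.0, threshold=0.6):
    """Predict a class, then move the winning prototype toward the feature.

    Only forward computations are used; nothing is differentiated.
    The update is skipped when the prediction is not confident enough.
    """
    feature = l2_normalize(feature)
    probs = softmax(temperature * (prototypes @ feature))  # scaled cosine similarities
    pred = int(np.argmax(probs))
    if probs[pred] < threshold:
        return pred, prototypes
    updated = prototypes.copy()
    updated[pred] = l2_normalize(momentum * prototypes[pred] + (1.0 - momentum) * feature)
    return pred, updated


# Two hypothetical unit-norm class prototypes (e.g. text embeddings of class names).
prototypes = np.eye(2)
# A hypothetical shifted test feature that still lies closest to class 0.
test_feature = np.array([0.9, 0.1])
pred, new_prototypes = adapt_step(prototypes, test_feature)
```

Because each step costs one matrix-vector product and a running average, no activations need to be stored for a backward pass, which is the main reason such methods are attractive when memory is limited.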
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
