LiteVLA-Edge: Quantized On-Device Multimodal Control for Embedded Robotics

Justin Williams; Kishor Datta Gupta; Roy George; Mrinmoy Sarkar

arXiv:2603.03380 · cs.RO · March 5, 2026

TL;DR

LiteVLA-Edge is a practical system for fully on-device vision-language-action inference on embedded robots. It combines post-training 4-bit quantization with GPU-accelerated inference to reach a mean end-to-end latency of 150.5 ms.

Contribution

It introduces a deployment-oriented pipeline for running compact multimodal control models locally on embedded hardware while preserving modular interfaces.

Findings

01

Achieves a mean end-to-end latency of 150.5 ms (approximately 6.6 Hz) on Jetson Orin-class hardware.

02

Operates entirely offline within a ROS 2 pipeline.

03

Provides a reproducible baseline for on-device VLA in robotics.
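
The latency and rate figures in the findings are related by simple arithmetic. The sketch below assumes one action is produced per inference step and that steps run back to back with no overlap; the function name is hypothetical and not taken from the paper.

```python
def latency_ms_to_rate_hz(latency_ms: float) -> float:
    """Return the loop rate in Hz for a per-step latency in milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Mean end-to-end latency reported in the findings above.
reported_latency_ms = 150.5
rate_hz = latency_ms_to_rate_hz(reported_latency_ms)
print(f"{rate_hz:.1f} Hz")  # prints "6.6 Hz"
```

Under these assumptions, 1000 / 150.5 is roughly 6.64, which matches the approximately 6.6 Hz quoted above.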

Abstract

Vision-Language-Action (VLA) models provide a unified framework for perception, language conditioning, and action generation, but many existing systems remain difficult to deploy in embedded robotic settings because of their computational requirements and inference latency. In this paper, we present LiteVLA-Edge, a deployment-oriented VLA pipeline for fully on-device inference on Jetson Orin-class hardware. Our approach combines supervised image-to-action fine-tuning in FP32 with post-training 4-bit GGUF quantization and GPU-accelerated inference through the llama.cpp runtime. Under our deployment configuration, LiteVLA-Edge achieves a mean end-to-end latency of 150.5 ms (approximately 6.6 Hz) while operating entirely offline within a ROS 2-integrated perception-reasoning-action pipeline. Rather than introducing a new policy objective, our contribution is a practical…
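
To give a rough sense of what post-training 4-bit quantization does to a weight, here is a minimal sketch of symmetric quantization over one block of values with a single shared scale. It is an illustration under stated assumptions, not the paper's pipeline: the actual GGUF 4-bit formats used by llama.cpp differ in block layout and encoding details, and the function names below are hypothetical.

```python
def quantize_block(values):
    """Map floats to signed 4-bit codes in [-8, 7] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate floats from 4-bit codes and the shared scale."""
    return [scale * c for c in codes]


# Illustrative weights only; these are not values from the paper.
weights = [0.7, -0.2, 0.05, 0.0]
scale, codes = quantize_block(weights)
restored = dequantize_block(scale, codes)
for w, c, r in zip(weights, codes, restored):
    print(f"{w:+.3f} -> code {c:+d} -> {r:+.3f}")
```

Each restored value differs from the original by at most half the scale, which is the precision traded away for storing 4 bits per weight instead of 32.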

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Robot Manipulation and Learning