ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge

Yuntao Dai; Hang Gu; Teng Wang; Qianyu Cheng; Yifei Zheng; Zhiyong Qiu; Lei Gong; Wenqi Lou; Xuehai Zhou

arXiv:2512.20276 · cs.AI · December 24, 2025

TL;DR

ActionFlow is a system-level framework that significantly accelerates vision-language-action model inference on edge devices, enabling real-time robotic interaction by optimizing memory and compute scheduling without retraining.

Contribution

ActionFlow introduces a pipelined scheduling strategy and memory optimization techniques that raise the inference speed of VLA models on resource-constrained edge hardware.

Findings

01. Achieves a 2.55x FPS improvement on the OpenVLA-7B model.

02. Enables real-time dynamic manipulation on edge hardware.

03. Operates without retraining the original models.

Abstract

Vision-Language-Action (VLA) models have emerged as a unified paradigm for robotic perception and control, enabling emergent generalization and long-horizon task execution. However, their deployment in dynamic, real-world environments is severely hindered by high inference latency. While smooth robotic interaction requires control frequencies of 20 to 30 Hz, current VLA models typically operate at only 3-5 Hz on edge devices due to the memory-bound nature of autoregressive decoding. Existing optimizations often require extensive retraining or compromise model accuracy. To bridge this gap, we introduce ActionFlow, a system-level inference framework tailored for resource-constrained edge platforms. At the core of ActionFlow is a Cross-Request Pipelining strategy, a novel scheduler that redefines VLA inference as a macro-pipeline of micro-requests. The strategy intelligently batches…
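The frequency gap quoted in the abstract can be restated as a per-action time budget. The sketch below is illustrative arithmetic only: it uses the 20 to 30 Hz and 3-5 Hz figures from the abstract, and the helper names `period_ms` and `required_speedup` are hypothetical, not part of ActionFlow.

```python
# Illustrative arithmetic only; the frequency ranges come from the abstract,
# the helper names are hypothetical and not part of ActionFlow.

def period_ms(hz: float) -> float:
    """Time available per action, in milliseconds, at a given control frequency."""
    return 1000.0 / hz


def required_speedup(current_hz: float, target_hz: float) -> float:
    """Throughput multiplier needed to go from current_hz to target_hz."""
    return target_hz / current_hz


CURRENT_HZ = (3.0, 5.0)    # typical VLA rate on edge devices, per the abstract
TARGET_HZ = (20.0, 30.0)   # rate needed for smooth interaction, per the abstract

for hz in CURRENT_HZ + TARGET_HZ:
    print(f"{hz:4.0f} Hz -> {period_ms(hz):6.1f} ms per action")

# Smallest gap: fastest current rate vs. lowest target rate.
best_case = required_speedup(max(CURRENT_HZ), min(TARGET_HZ))
# Largest gap: slowest current rate vs. highest target rate.
worst_case = required_speedup(min(CURRENT_HZ), max(TARGET_HZ))

print(f"required speedup: {best_case:.1f}x to {worst_case:.1f}x")
```

Under these figures, each action must complete in roughly 33 to 50 ms, whereas 3-5 Hz corresponds to 200 to 333 ms per action, so closing the gap would take somewhere between a 4x and a 10x throughput increase depending on which ends of the two ranges are compared.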

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Ferroelectric and Negative Capacitance Devices