Shallow-π: Knowledge Distillation for Flow-based VLAs

Boseong Jeon; Yunho Choi; Taehan Kim

arXiv:2601.20262·cs.RO·January 29, 2026

Open Access

TL;DR

Shallow-pi is a knowledge distillation framework that significantly reduces the depth of flow-based vision-language-action models, enabling faster inference with minimal accuracy loss, validated on real-world robotic platforms.

Contribution

Shallow-pi introduces a knowledge distillation framework for transformer layer reduction in flow-based VLA models. It shrinks both the VLM backbone and the flow-based action head from 18 to 6 layers while maintaining performance.

Findings

01. Over two times faster inference.

02. Less than one percent absolute drop in success rate on standard manipulation benchmarks.

03. Validated on multiple real-world robotic systems.

Abstract

The growing demand for real-time robotic deployment necessitates fast and on-device inference for vision-language-action (VLA) models. Within the VLA literature, efficiency has been extensively studied at the token level, such as visual token pruning. In contrast, systematic transformer layer reduction has received limited attention and, to the best of our knowledge, has not been explored for flow-based VLA models under knowledge distillation. In this work, we propose Shallow-pi, a principled knowledge distillation framework that aggressively reduces the transformer depth of both the VLM backbone and the flow-based action head, compressing the model from 18 to 6 layers. Shallow-pi achieves over two times faster inference with less than one percent absolute drop in success rate on standard manipulation benchmarks, establishing state-of-the-art performance among reduced VLA models.…
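
As a rough illustration of what depth reduction under distillation can look like, the Python sketch below picks 6 of 18 teacher layers and scores a student against a teacher. The abstract does not say how layers are selected or which loss is used, so both the evenly spaced selection rule and the mean-squared-error objective on predicted flow velocities are assumptions, and the function names `select_student_layers` and `velocity_distillation_loss` are hypothetical.

```python
def select_student_layers(teacher_depth: int, student_depth: int) -> list[int]:
    """Return evenly spaced teacher layer indices (0-based), ending at the last layer.

    Assumption: uniform spacing. The paper may use a different selection rule.
    """
    if not 0 < student_depth <= teacher_depth:
        raise ValueError("student_depth must be in 1..teacher_depth")
    # Integer arithmetic avoids float rounding and always yields distinct indices.
    return [((i + 1) * teacher_depth) // student_depth - 1
            for i in range(student_depth)]


def velocity_distillation_loss(student_v: list[float], teacher_v: list[float]) -> float:
    """Mean squared error between student and teacher velocity predictions.

    Assumption: a plain output-matching loss on the flow head's velocities.
    """
    if not student_v or len(student_v) != len(teacher_v):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((s - t) ** 2 for s, t in zip(student_v, teacher_v)) / len(student_v)


if __name__ == "__main__":
    # 18 -> 6 layers, the compression ratio stated in the abstract.
    print(select_student_layers(18, 6))
    print(velocity_distillation_loss([1.0, 2.0], [1.0, 4.0]))
```

In a real setup the selected indices would initialize the student's layers from the teacher's weights, and the loss would be computed on batched tensors in a deep learning framework rather than on Python lists.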

Taxonomy

Topics: Multimodal Machine Learning Applications · Robot Manipulation and Learning · Advanced Neural Network Applications