TIC-VLA: A Think-in-Control Vision-Language-Action Model for Robot Navigation in Dynamic Environments

Zhiyu Huang; Yun Zhang; Johnson Liu; Rui Song; Chen Tang; Jiaqi Ma

arXiv:2602.02459·cs.RO·February 3, 2026

TL;DR

TIC-VLA introduces a latency-aware vision-language-action framework for robot navigation that models delayed semantic reasoning, improving real-time control in dynamic environments.

Contribution

The paper proposes a latency-aware model and training pipeline that explicitly handle reasoning delays, improving robot navigation performance in dynamic settings.

Findings

01. Outperforms prior VLA models in simulation and real-world tests.

02. Maintains robust real-time control despite multi-second reasoning latency.

03. Introduces DynaNav, a realistic simulation environment for evaluation.

Abstract

Robots in dynamic, human-centric environments must follow language instructions while maintaining real-time reactive control. Vision-language-action (VLA) models offer a promising framework, but they assume temporally aligned reasoning and control, despite semantic inference being inherently delayed relative to real-time action. We introduce Think-in-Control (TIC)-VLA, a latency-aware framework that explicitly models delayed semantic reasoning during action generation. TIC-VLA defines a delayed semantic-control interface that conditions action generation on delayed vision-language semantic states and explicit latency metadata, in addition to current observations, enabling policies to compensate for asynchronous reasoning. We further propose a latency-consistent training pipeline that injects reasoning inference delays during imitation learning and online reinforcement learning, aligning…
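The delayed semantic-control interface described in the abstract can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `SemanticState`, `PolicyInput`, `DelayedSemanticInterface`, and `rollout` are hypothetical, observations and semantic states are stand-in strings, and latency is counted in control steps rather than seconds. It shows only the core idea that the fast policy receives the current observation together with the most recent finished semantic state and an explicit record of how stale that state is.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SemanticState:
    """Output of the slow vision-language reasoner (hypothetical structure)."""
    content: str   # stand-in for a textual plan or semantic embedding
    obs_step: int  # control step of the observation it was computed from


@dataclass
class PolicyInput:
    """What the fast action policy sees at each control step (hypothetical)."""
    current_obs: str
    semantic: Optional[SemanticState]
    latency_steps: Optional[int]  # age of the semantic state, in control steps


class DelayedSemanticInterface:
    """Delivers each semantic state only after its reasoning delay has elapsed."""

    def __init__(self) -> None:
        self._pending: List[Tuple[int, SemanticState]] = []  # (ready_step, state)
        self._latest: Optional[SemanticState] = None

    def submit(self, state: SemanticState, delay_steps: int) -> None:
        # Reasoning starts at state.obs_step and finishes delay_steps later.
        self._pending.append((state.obs_step + delay_steps, state))

    def read(self, step: int, current_obs: str) -> PolicyInput:
        # Promote every result that has finished by `step`, keeping the newest.
        ready = [s for t, s in self._pending if t <= step]
        self._pending = [(t, s) for t, s in self._pending if t > step]
        for s in ready:
            if self._latest is None or s.obs_step > self._latest.obs_step:
                self._latest = s
        latency = None if self._latest is None else step - self._latest.obs_step
        return PolicyInput(current_obs, self._latest, latency)


def rollout(num_steps: int, reason_every: int, delay_steps: int) -> List[PolicyInput]:
    """Simulate a control loop whose reasoner lags by `delay_steps` steps."""
    interface = DelayedSemanticInterface()
    inputs = []
    for step in range(num_steps):
        obs = f"obs_{step}"
        if step % reason_every == 0:
            # Launch slow reasoning on the current observation.
            interface.submit(SemanticState(f"plan_from_{obs}", step), delay_steps)
        inputs.append(interface.read(step, obs))
    return inputs
```

Calling `rollout(6, 3, 2)` yields policy inputs in which the semantic state is absent for the first two steps and then trails the current observation by a growing number of steps until the next reasoning result arrives. In a latency-consistent training setup of the kind the abstract describes, the delay passed to `submit` would presumably be sampled or measured rather than fixed; that detail is an assumption here.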

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robot Manipulation and Learning