AdaTracker: Learning Adaptive In-Context Policy for Cross-Embodiment Active Visual Tracking

Kui Wu; Hao Chen; Jinzhu Han; Haijun Liu; Churan Wang; Yizhou Wang; Zhoujun Li; Si Liu; Fangwei Zhong

arXiv:2604.20305·cs.RO·April 23, 2026

TL;DR

AdaTracker is a novel adaptive policy framework that enables cross-embodiment active visual tracking by modeling embodiment-specific constraints and dynamically adapting to unseen robot morphologies.

Contribution

AdaTracker introduces an Embodiment Context Encoder and a Context-Aware Policy that together enable zero-shot generalization across diverse robot platforms.

Findings

01 Outperforms state-of-the-art in cross-embodiment generalization

02 Demonstrates effective zero-shot adaptation in real-world experiments

03 Improves sample efficiency in diverse robotic scenarios

Abstract

Realizing active visual tracking with a single unified model across diverse robots is challenging, as the physical constraints and motion dynamics vary drastically from one platform to another. Existing approaches typically train separate models for each embodiment, leading to poor scalability and limited generalization. To address this, we propose AdaTracker, an adaptive in-context policy learning framework that robustly tracks targets on diverse robot morphologies. Our key insight is to explicitly model embodiment-specific constraints through an Embodiment Context Encoder, which infers them from history. This contextual representation dynamically modulates a Context-Aware Policy, enabling it to infer optimal control actions for unseen embodiments in a zero-shot manner. To enhance robustness, we introduce two auxiliary objectives to ensure accurate context…
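
The abstract gives no architectural details, so the following is only an illustrative sketch of the general idea: a context vector is inferred from a history of (observation, action) pairs and then used to modulate a policy. Everything here is an assumption and not the authors' implementation: the class names `ContextEncoder` and `ContextAwarePolicy`, the mean-pooling encoder, the FiLM-style scale-and-shift modulation, the dimensions, and the random untrained weights. It uses only the Python standard library.

```python
import math
import random


def linear(weights, bias, x):
    """Compute W @ x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def random_matrix(rng, rows, cols, scale=0.1):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ContextEncoder:
    """Hypothetical encoder: embeds each (obs, action) pair and mean-pools over history."""

    def __init__(self, rng, obs_dim, act_dim, ctx_dim):
        self.w = random_matrix(rng, ctx_dim, obs_dim + act_dim)
        self.b = [0.0] * ctx_dim

    def encode(self, history):
        # history: non-empty list of (obs, action) pairs, each a list of floats
        feats = [[math.tanh(v) for v in linear(self.w, self.b, obs + act)]
                 for obs, act in history]
        return [sum(col) / len(feats) for col in zip(*feats)]


class ContextAwarePolicy:
    """Hypothetical policy whose hidden layer is scaled and shifted by the context
    (FiLM-style modulation, chosen here for illustration only)."""

    def __init__(self, rng, obs_dim, ctx_dim, hidden_dim, act_dim):
        self.w_in = random_matrix(rng, hidden_dim, obs_dim)
        self.w_gamma = random_matrix(rng, hidden_dim, ctx_dim)
        self.w_beta = random_matrix(rng, hidden_dim, ctx_dim)
        self.w_out = random_matrix(rng, act_dim, hidden_dim)
        self.zeros_hidden = [0.0] * hidden_dim
        self.zeros_act = [0.0] * act_dim

    def act(self, obs, context):
        hidden = linear(self.w_in, self.zeros_hidden, obs)
        # Scale is centered at 1 so a zero context leaves the features unchanged.
        gamma = [1.0 + g for g in linear(self.w_gamma, self.zeros_hidden, context)]
        beta = linear(self.w_beta, self.zeros_hidden, context)
        hidden = [math.tanh(g * h + b) for g, h, b in zip(gamma, hidden, beta)]
        # tanh keeps each action component in [-1, 1].
        return [math.tanh(a) for a in linear(self.w_out, self.zeros_act, hidden)]


def demo():
    """Same observation, two made-up embodiment histories, two resulting actions."""
    rng = random.Random(0)
    encoder = ContextEncoder(rng, obs_dim=4, act_dim=2, ctx_dim=3)
    policy = ContextAwarePolicy(rng, obs_dim=4, ctx_dim=3, hidden_dim=8, act_dim=2)
    obs = [0.5, -0.2, 0.1, 0.9]
    slow_history = [(obs, [0.2, 0.0])] * 5   # stand-in for a slow platform
    fast_history = [(obs, [1.0, 0.5])] * 5   # stand-in for an agile platform
    ctx_slow = encoder.encode(slow_history)
    ctx_fast = encoder.encode(fast_history)
    return policy.act(obs, ctx_slow), policy.act(obs, ctx_fast)
```

Calling `demo()` returns two action vectors for the same observation, one per context. Because the policy weights are shared and only the context differs, any difference between the two actions comes from the modulation, which is the mechanism the abstract describes at a high level.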
