CoLF: Learning Consistent Leader-Follower Policies for Vision-Language-Guided Multi-Robot Cooperative Transport

Joachim Yann Despature; Kazuki Shibata; Takamitsu Matsubara

arXiv:2602.07776 · cs.RO · February 10, 2026

TL;DR

This paper introduces CoLF, a multi-agent reinforcement learning framework that learns stable leader-follower roles for vision-language-guided multi-robot cooperative transport, mitigating the perceptual misalignment that arises when robots interpret instructions from different viewpoints.

Contribution

We propose a novel asymmetric policy and mutual-information-based training to achieve consistent leader-follower roles in multi-robot cooperation.

Findings

1. CoLF improves role stability and task success in simulation.

2. CoLF demonstrates effective real-robot cooperative transport.

3. The framework enhances robustness against perceptual misalignment.

Abstract

In this study, we address vision-language-guided multi-robot cooperative transport, where each robot grounds natural-language instructions from onboard camera observations. A key challenge in this decentralized setting is perceptual misalignment across robots, where viewpoint differences and language ambiguity can yield inconsistent interpretations and degrade cooperative transport. To mitigate this problem, we adopt a dependent leader-follower design, where one robot serves as the leader and the other as the follower. Although such a leader-follower structure appears straightforward, learning with independent and symmetric agents often yields symmetric or unstable behaviors without explicit inductive biases. To address this challenge, we propose Consistent Leader-Follower (CoLF), a multi-agent reinforcement learning (MARL) framework for stable leader-follower role differentiation. CoLF…

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Social Robot Interaction and HRI