Latent Theory of Mind: A Decentralized Diffusion Architecture for Cooperative Manipulation

Chengyang He; Gadiel Sznaier Camps; Xu Liu; Mac Schwager; Guillaume Sartoretti

arXiv:2505.09144·cs.RO·May 15, 2025

Open Access

TL;DR

LatentToM is a decentralized diffusion policy architecture for cooperative robot manipulation. Each robot keeps an ego embedding and a shared consensus embedding, which lets the robots collaborate with or without explicit communication. In hardware experiments it outperforms naive decentralized baselines and performs comparably to a centralized policy on bi-manual tasks.

Contribution

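The paper proposes a decentralized diffusion architecture in which each agent maintains two latent embeddings (ego and consensus). The consensus encoders are trained centrally with a loss inspired by sheaf theory, so that the robots can cooperate on manipulation tasks with or without explicit communication.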

Findings

01  Outperforms naive decentralized baselines in hardware experiments.

02  Achieves comparable performance to centralized policies in bi-manual tasks.

03  Enables fully distributed execution with implicit information exchange.

Abstract

We present Latent Theory of Mind (LatentToM), a decentralized diffusion policy architecture for collaborative robot manipulation. Our policy allows multiple manipulators with their own perception and computation to collaborate with each other towards a common task goal with or without explicit communication. Our key innovation lies in allowing each agent to maintain two latent representations: an ego embedding specific to the robot, and a consensus embedding trained to be common to both robots, despite their different sensor streams and poses. We further let each robot train a decoder to infer the other robot's ego embedding from their consensus embedding, akin to theory of mind in latent space. Training occurs centrally, with all the policies' consensus encoders supervised by a loss inspired by sheaf theory, a mathematical theory for clustering data on a topological manifold.…
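
The two-embedding idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear encoders, the `Agent` class, the function names, and the dimensions are all hypothetical, and the plain mean-squared agreement penalty below is only a stand-in for the sheaf-theory-inspired loss, whose exact form is not given here. The diffusion policy itself is omitted.

```python
import random


def linear(weights, x):
    """Apply a bias-free dense layer: returns weights @ x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def random_matrix(rows, cols, rng):
    """Small random weight matrix as a list of rows."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class Agent:
    """Hypothetical per-robot module: two encoders and a theory-of-mind decoder."""

    def __init__(self, obs_dim, latent_dim, rng):
        self.ego_encoder = random_matrix(latent_dim, obs_dim, rng)
        self.consensus_encoder = random_matrix(latent_dim, obs_dim, rng)
        self.tom_decoder = random_matrix(latent_dim, latent_dim, rng)

    def encode(self, obs):
        """Map this robot's own observation to (ego, consensus) embeddings."""
        ego = linear(self.ego_encoder, obs)
        consensus = linear(self.consensus_encoder, obs)
        return ego, consensus

    def predict_other_ego(self, consensus):
        """Guess the other robot's ego embedding from the consensus embedding."""
        return linear(self.tom_decoder, consensus)


def training_losses(agent_a, agent_b, obs_a, obs_b):
    """Centralized training terms (assumed stand-ins, not the paper's loss)."""
    ego_a, con_a = agent_a.encode(obs_a)
    ego_b, con_b = agent_b.encode(obs_b)
    # Agreement term: consensus embeddings should match across robots.
    consensus_loss = mse(con_a, con_b)
    # Theory-of-mind term: each robot decodes the other's ego embedding.
    tom_loss = (mse(agent_a.predict_other_ego(con_a), ego_b)
                + mse(agent_b.predict_other_ego(con_b), ego_a))
    return consensus_loss, tom_loss


rng = random.Random(0)
agent_a = Agent(obs_dim=6, latent_dim=4, rng=rng)
agent_b = Agent(obs_dim=6, latent_dim=4, rng=rng)
obs_a = [rng.uniform(-1, 1) for _ in range(6)]
obs_b = [rng.uniform(-1, 1) for _ in range(6)]
consensus_loss, tom_loss = training_losses(agent_a, agent_b, obs_a, obs_b)
```

Both loss terms need the two robots' data together, which is why training is centralized. At execution time each agent only calls `encode` and `predict_other_ego` on its own observation, so no message passing is required.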

Taxonomy

Topics: Neural Networks and Applications · Cognitive Science and Education Research · Reinforcement Learning in Robotics