Language-Based Swarm Perception: Decentralized Person Re-Identification via Natural Language Descriptions
Miquel Kegeleirs, Lorenzo Garattoni, Gianpiero Francesca, and Mauro Birattari

TL;DR
This paper presents a decentralized person re-identification method for robot swarms that uses natural language descriptions instead of visual embeddings, enabling interpretable and collaborative identification.
Contribution
The paper introduces a novel language-based perception approach for swarm robots, replacing traditional visual embeddings with human-readable descriptions for decentralized person re-identification.
Findings
Competitive identity consistency performance
Enhanced interpretability of swarm perception
Potential for natural-language querying and transparency
Abstract
We introduce a method for decentralized person re-identification in robot swarms that leverages natural language as the primary representational modality. Unlike traditional approaches that rely on opaque visual embeddings -- high-dimensional feature vectors extracted from images -- the proposed method uses human-readable language to represent observations. Each robot locally detects and describes individuals using a vision-language model (VLM), producing textual descriptions of appearance instead of feature vectors. These descriptions are compared and clustered across the swarm without centralized coordination, allowing robots to collaboratively group observations of the same individual. Each cluster is distilled into a representative description by a language model, providing an interpretable, concise summary of the swarm's collective perception. This approach enables natural-language…
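The grouping step described in the abstract, where textual descriptions are compared and clustered so that observations of the same individual end up together, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the word-overlap (Jaccard) similarity, the greedy assignment rule, the 0.5 threshold, the function names, and the sample descriptions. The abstract does not say how descriptions are compared. The sketch also runs on a single machine and leaves out both the decentralized exchange between robots and the language-model step that distills each cluster into a representative description.

```python
import re


def tokens(description: str) -> set[str]:
    """Return the set of lowercase words in a description."""
    return set(re.findall(r"[a-z]+", description.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Word-overlap similarity in [0, 1]; an assumed stand-in for the paper's comparison."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_descriptions(descriptions: list[str], threshold: float = 0.5) -> list[list[int]]:
    """Greedy clustering (hypothetical rule): attach each description to the
    first cluster whose first member is similar enough, else open a new cluster.
    Returns clusters as lists of indices into `descriptions`."""
    token_sets = [tokens(d) for d in descriptions]
    clusters: list[list[int]] = []
    for i, current in enumerate(token_sets):
        for cluster in clusters:
            if jaccard(current, token_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up descriptions of the kind a vision-language model might produce.
observations = [
    "tall man with red jacket and blue jeans",
    "man with red jacket and blue jeans carrying a backpack",
    "woman in green dress with white hat",
    "woman with white hat and green dress",
]

clusters = cluster_descriptions(observations)
for members in clusters:
    print([observations[i] for i in members])
```

Because the intermediate representation is plain text, each cluster can be read and checked directly by a person, which is the interpretability benefit the abstract emphasizes.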
Taxonomy
Topics: Video Surveillance and Tracking Methods · Multimodal Machine Learning Applications · Social Robot Interaction and HRI
