Distributed Multi-Agent Coordination Using Multi-Modal Foundation Models
Saaduddin Mahmud, Dorian Benhamou Goldfajn, Shlomo Zilberstein

TL;DR
This paper presents VL-DCOPs, a framework that uses large multimodal foundation models to generate constraints for multi-agent coordination automatically. It introduces a spectrum of agent archetypes and evaluates them on three new tasks.
Contribution
It introduces VL-DCOPs, a framework that uses large multimodal foundation models (LFMs) to generate constraints automatically from visual and linguistic instructions, and explores agent archetypes ranging from neuro-symbolic to fully neural for solving these problems.
Findings
Neuro-symbolic agents delegate some of their algorithmic decisions to LFMs.
Fully neural agents rely entirely on LFMs for coordination.
Evaluation on three novel VL-DCOP tasks shows strengths and weaknesses of each archetype.
Abstract
Distributed Constraint Optimization Problems (DCOPs) offer a powerful framework for multi-agent coordination but often rely on labor-intensive, manual problem construction. To address this, we introduce VL-DCOPs, a framework that takes advantage of large multimodal foundation models (LFMs) to automatically generate constraints from both visual and linguistic instructions. We then introduce a spectrum of agent archetypes for solving VL-DCOPs: from a neuro-symbolic agent that delegates some of the algorithmic decisions to an LFM, to a fully neural agent that depends entirely on an LFM for coordination. We evaluate these agent archetypes using state-of-the-art LLMs (large language models) and VLMs (vision language models) on three novel VL-DCOP tasks and compare their respective advantages and drawbacks. Lastly, we discuss how this work extends to broader frontier challenges in the DCOP…
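To make the DCOP setting concrete, here is a minimal sketch of the kind of problem the abstract refers to: agents each own a variable, and pairwise constraints assign a cost to joint choices. Everything in it is an assumption made for illustration. The agent names, domains, and the `differ` cost function are hypothetical and hand-written, whereas in VL-DCOPs the constraints would be generated by an LFM from visual and linguistic instructions. The centralized brute-force search stands in for a real distributed DCOP algorithm and is not the paper's method.

```python
from itertools import product

# Hypothetical toy problem: three agents, each choosing one of two values.
domains = {"a1": [0, 1], "a2": [0, 1], "a3": [0, 1]}


def differ(x, y):
    """Cost 1 if two neighbouring agents pick the same value, else 0."""
    return 1 if x == y else 0


# Binary constraints: (agent_i, agent_j) -> cost function.
# Hand-written here; an LFM would produce these in the VL-DCOP setting.
constraints = {("a1", "a2"): differ, ("a2", "a3"): differ}


def total_cost(assignment):
    """Sum the cost of every constraint under a full assignment."""
    return sum(f(assignment[i], assignment[j]) for (i, j), f in constraints.items())


def solve_brute_force():
    """Enumerate all joint assignments and keep the first cheapest one."""
    names = list(domains)
    best = None
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        cost = total_cost(assignment)
        if best is None or cost < best[1]:
            best = (assignment, cost)
    return best


assignment, cost = solve_brute_force()
print(assignment, cost)
```

Brute force is only feasible for toy sizes, since the number of joint assignments grows exponentially with the number of agents. It is used here purely to show what "minimizing total constraint cost" means.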
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Distributed Control Multi-Agent Systems
