MOSAIC: A Unified Platform for Cross-Paradigm Comparison and Evaluation of Homogeneous and Heterogeneous Multi-Agent RL, LLM, VLM, and Human Decision-Makers

Abdulhamid M. Mousa, Yu Fu, Rakhmonberdi Khajiev, Jalaledin M. Azzabi, Abdulkarim M. Mousa, Peng Yang, Yunusa Haruna, and Ming Liu

arXiv:2603.01260 · cs.LG · March 3, 2026

Open Access

TL;DR

MOSAIC is an open-source platform that enables the deployment and comparison of heterogeneous decision-making agents, including RL, LLMs, VLMs, and humans, within shared environments for reproducible research.

Contribution

MOSAIC introduces a unified framework built on three components: an IPC-based worker protocol, an agent abstraction layer, and a deterministic evaluation system for cross-paradigm agent comparison.
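To make the idea of an IPC-based worker concrete, here is a minimal sketch of a parent process talking to an isolated subprocess worker over newline-delimited JSON on stdin/stdout. Everything in it is an assumption made for illustration: the message fields (`type`, `observation`, `action`), the `SubprocessWorker` class, and the toy decision rule are hypothetical and are not MOSAIC's actual protocol or API.

```python
import json
import subprocess
import sys

# Hypothetical worker program: it stands in for a wrapped framework and
# replies with a deterministic action for each observation it receives.
WORKER_SOURCE = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    if request["type"] == "shutdown":
        break
    reply = {"type": "action", "action": request["observation"] % 4}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


class SubprocessWorker:
    """Parent-side handle for one isolated worker process (illustrative only)."""

    def __init__(self) -> None:
        # The worker runs in its own interpreter, so its dependencies and
        # state never share an address space with the parent.
        self._proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def act(self, observation: int) -> int:
        # One request line out, one reply line back.
        message = {"type": "step", "observation": observation}
        self._proc.stdin.write(json.dumps(message) + "\n")
        self._proc.stdin.flush()
        reply = json.loads(self._proc.stdout.readline())
        return reply["action"]

    def close(self) -> int:
        # Ask the worker to exit, then wait for its exit code.
        self._proc.stdin.write(json.dumps({"type": "shutdown"}) + "\n")
        self._proc.stdin.flush()
        self._proc.stdin.close()
        return self._proc.wait(timeout=10)


worker = SubprocessWorker()
actions = [worker.act(obs) for obs in (0, 5, 10)]
exit_code = worker.close()
print(actions, exit_code)
```

Because the parent only exchanges serialized messages with the worker, the same handle could in principle front an RL policy, a model behind an API, or a human input loop without the parent knowing which one it is.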

Findings

01. Enables fair, reproducible comparison of diverse agents in shared environments.

02. Supports hybrid multi-agent settings with minimal modifications to existing frameworks.

03. Provides tools for detailed behavioral analysis through visual and automated evaluation modes.
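As a rough illustration of what fair, reproducible comparison in a shared environment can look like, the sketch below places two different kinds of decision-maker behind one interface and drives them with the same seeded observation stream. The `Agent` protocol, the two agent classes, and the toy environment are hypothetical stand-ins chosen for this example; they are not taken from MOSAIC.

```python
import random
from typing import Callable, Dict, List, Protocol


class Agent(Protocol):
    """Hypothetical common interface shared by every decision-maker."""

    def act(self, observation: int) -> int: ...


class RandomPolicyAgent:
    """Stand-in for an RL policy: samples actions from its own seeded RNG."""

    def __init__(self, n_actions: int, seed: int) -> None:
        self._rng = random.Random(seed)
        self._n_actions = n_actions

    def act(self, observation: int) -> int:
        return self._rng.randrange(self._n_actions)


class ScriptedAgent:
    """Stand-in for an LLM, VLM, or human: a callable maps observation to action."""

    def __init__(self, decide: Callable[[int], int]) -> None:
        self._decide = decide

    def act(self, observation: int) -> int:
        return self._decide(observation)


def make_team() -> Dict[str, Agent]:
    # Fresh agents per run, so no RNG state leaks between evaluations.
    return {
        "rl": RandomPolicyAgent(n_actions=4, seed=0),
        "scripted": ScriptedAgent(lambda obs: obs % 4),
    }


def run_episode(agents: Dict[str, Agent], seed: int, steps: int = 5) -> List[Dict[str, int]]:
    """Toy shared environment: all agents see the same seeded observations."""
    env_rng = random.Random(seed)
    trajectory = []
    for _ in range(steps):
        obs = env_rng.randrange(10)
        trajectory.append({name: agent.act(obs) for name, agent in agents.items()})
    return trajectory


first = run_episode(make_team(), seed=123)
second = run_episode(make_team(), seed=123)
print(first == second)
```

Seeding both the environment and each stochastic agent explicitly, rather than relying on global random state, is what makes two runs with the same seed produce identical trajectories in this sketch.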

Abstract

Reinforcement learning (RL), large language models (LLMs), and vision-language models (VLMs) have been widely studied in isolation. However, existing infrastructure lacks the ability to deploy agents from different decision-making paradigms within the same environment, making it difficult to study them in hybrid multi-agent settings or to compare their behaviour fairly under identical conditions. We present MOSAIC, an open-source platform that bridges this gap by incorporating a diverse set of existing reinforcement learning environments and enabling heterogeneous agents (RL policies, LLMs, VLMs, and human players) to operate within them in ad-hoc team settings with reproducible results. MOSAIC introduces three contributions. (i) An IPC-based worker protocol that wraps both native and third-party frameworks as isolated subprocess workers, each executing its native training and inference…

Taxonomy

Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)