MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

Giridhar Ganapavarapu; Dhaval Patel

arXiv:2605.09131·cs.AI·May 12, 2026

TL;DR

MCP-Cosmos introduces a framework integrating world models into MCP-based agents, enabling predictive planning and improved task execution in complex environments through simulation and refinement.

Contribution

The paper presents a "Bring Your Own World Model" (BYOWM) approach that unifies MCP, world models, and agents to support long-horizon planning and environment interaction.

Findings

01. Tool success rate and parameter accuracy improved in the experiments.

02. World models proved effective for predictive task automation.

03. New metrics, such as Execution Quality, are introduced for evaluating world model performance.

Abstract

The Model Context Protocol (MCP) has unified the interface between Large Language Models (LLMs) and external tools, yet a fundamental gap remains in how agents conceptualize the environments within which they operate. Current paradigms are bifurcated: task-level planning often ignores execution-time dynamics, while reactive execution lacks long-horizon foresight. We present MCP-Cosmos, a framework that infuses generative World Models (WM) into the MCP ecosystem to enable predictive task automation. By unifying three disparate technologies, namely MCP, World Model, and Agent, we demonstrate that a "Bring Your Own World Model" (BYOWM) strategy allows agents to simulate state transitions and refine plans in a latent space before execution. We conducted experiments using two strategies, namely ReAct and SPIRAL, with 2 planning models and 3 representative world models over 20+ MCP-Bench…
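
The simulate-then-refine idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `WorldModel` protocol, `ToyCounterWorldModel`, `ToolCall`, `simulate`, and `pick_plan` are hypothetical names, the state is a plain dictionary rather than a latent space, and no real MCP SDK calls are made. A pluggable world model predicts the state after each tool call, and the agent compares candidate plans on their predicted outcome before executing anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Tuple

State = Dict[str, int]  # toy environment state (assumption: a flat dict)


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical stand-in for an MCP tool invocation."""
    tool: str
    args: Tuple[Tuple[str, int], ...] = ()


class WorldModel(Protocol):
    """Any object with this method can be plugged in (the BYOWM idea)."""
    def predict(self, state: State, call: ToolCall) -> State: ...


class ToyCounterWorldModel:
    """Hypothetical world model for two toy tools: 'add' and 'double'."""
    def predict(self, state: State, call: ToolCall) -> State:
        nxt = dict(state)  # copy, so the real state is never mutated
        args = dict(call.args)
        if call.tool == "add":
            nxt["counter"] += args.get("amount", 0)
        elif call.tool == "double":
            nxt["counter"] *= 2
        return nxt


def simulate(wm: WorldModel, state: State, plan: List[ToolCall]) -> State:
    """Roll a plan forward through the world model without executing tools."""
    for call in plan:
        state = wm.predict(state, call)
    return state


def pick_plan(wm: WorldModel, state: State, candidates: List[List[ToolCall]],
              score: Callable[[State], float]) -> List[ToolCall]:
    """Choose the candidate plan whose predicted final state scores highest."""
    return max(candidates, key=lambda plan: score(simulate(wm, state, plan)))


# Usage: reach counter == 6 starting from 1, comparing two orderings.
start = {"counter": 1}
plan_a = [ToolCall("add", (("amount", 2),)), ToolCall("double")]  # (1 + 2) * 2
plan_b = [ToolCall("double"), ToolCall("add", (("amount", 2),))]  # 1 * 2 + 2
best = pick_plan(ToyCounterWorldModel(), start, [plan_a, plan_b],
                 score=lambda s: -abs(s["counter"] - 6))
print([c.tool for c in best])
```

Because `WorldModel` is a structural protocol, a different predictor (for example, an LLM-backed one) could be swapped in without changing `simulate` or `pick_plan`; only the selected plan would then be sent to the real tools.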
