Mimosa Framework: Toward Evolving Multi-Agent Systems for Scientific Research

Martin Legrand; Tao Jiang; Matthieu Feraud; Benjamin Navet; Yousouf Taghzouti; Fabien Gandon; Elise Dumont; Louis-Félix Nothias

arXiv:2603.28986·cs.AI·April 1, 2026

TL;DR

Mimosa is an open-source, evolving multi-agent framework that automatically creates and refines scientific workflows using feedback, enabling adaptable autonomous research systems.

Contribution

The paper introduces an architecture that combines dynamic tool discovery, workflow synthesis, and iterative refinement driven by experimental feedback.

Findings

01

Mimosa achieves a 43.1% success rate on ScienceAgentBench with DeepSeek-V3.2.

02

The framework outperforms static multi-agent and single-agent baselines.

03

Models respond differently to multi-agent decomposition, so the benefit of a multi-agent workflow varies with the underlying model.

Abstract

Current Autonomous Scientific Research (ASR) systems, despite leveraging large language models (LLMs) and agentic architectures, remain constrained by fixed workflows and toolsets that prevent adaptation to evolving tasks and environments. We introduce Mimosa, an evolving multi-agent framework that automatically synthesizes task-specific multi-agent workflows and iteratively refines them through experimental feedback. Mimosa leverages the Model Context Protocol (MCP) for dynamic tool discovery, generates workflow topologies via a meta-orchestrator, executes subtasks through code-generating agents that invoke available tools and scientific software libraries, and scores executions with an LLM-based judge whose feedback drives workflow refinement. On ScienceAgentBench, Mimosa achieves a success rate of 43.1% with DeepSeek-V3.2, surpassing both single-agent baselines and static multi-agent…
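
The loop described in the abstract (synthesize a workflow, execute it, score it with a judge, refine from the feedback) can be sketched roughly as follows. This is a minimal illustration, not Mimosa's actual code or API: the names `Workflow`, `evolve`, the `toy_*` functions, and the `max_iters` and `target` parameters are all hypothetical, and the toy stand-ins replace the MCP tool discovery, LLM meta-orchestrator, code-generating agents, and LLM judge with deterministic functions so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Workflow:
    """Hypothetical workflow: an ordered tuple of subtask names."""
    steps: Tuple[str, ...]


def evolve(task, tools, synthesize, execute, judge, refine,
           max_iters=5, target=0.9):
    """Synthesize a workflow, then refine it from judge feedback.

    All six callables are assumptions made for this sketch; in the paper
    these roles are played by the meta-orchestrator, the agents, and the
    LLM-based judge.
    """
    workflow = synthesize(task, tools)
    best, best_score = workflow, float("-inf")
    trace = []  # (score, feedback) for every iteration
    for _ in range(max_iters):
        result = execute(workflow, tools)
        score, feedback = judge(task, result)
        trace.append((score, feedback))
        if score > best_score:
            best, best_score = workflow, score
        if score >= target:
            break  # good enough, stop refining
        workflow = refine(workflow, feedback)
    return best, best_score, trace


# Toy stand-ins (assumptions, not part of the paper) --------------------
REQUIRED = ("load", "analyze", "plot")


def toy_synthesize(task, tools):
    # Start from a deliberately incomplete workflow.
    return Workflow(steps=("load",))


def toy_execute(workflow, tools):
    # "Run" only the steps for which a tool is available.
    return [s for s in workflow.steps if s in tools]


def toy_judge(task, result):
    # Score is the fraction of required steps completed;
    # feedback names the first missing step.
    missing = [s for s in REQUIRED if s not in result]
    score = (len(REQUIRED) - len(missing)) / len(REQUIRED)
    return score, (missing[0] if missing else "")


def toy_refine(workflow, feedback):
    # Append the step the judge reported as missing.
    return Workflow(steps=workflow.steps + (feedback,))


best, best_score, trace = evolve(
    "toy task", {"load", "analyze", "plot"},
    toy_synthesize, toy_execute, toy_judge, toy_refine,
)
```

With these stand-ins the loop adds one missing step per iteration and stops as soon as the judge's score reaches the target. Keeping the best-scoring workflow separately from the current one is a design choice of this sketch, so that a refinement that lowers the score does not discard an earlier, better candidate.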
