TL;DR
This paper presents Nexus Architect, a multi-agent system with automated workflow generation that enhances reasoning and generalization in language models, significantly outperforming existing solutions on logical reasoning tasks.
Contribution
Introduces Nexus Architect, an enhanced iteration of the Nexus multi-agent framework equipped with an automated workflow synthesis mechanism that improves the generalization and performance of language models on reasoning tasks.
Findings
Achieves up to 66% higher pass rate than Gemini 2.5
Nearly 2.5 times the pass rate of Claude Sonnet 4 and DeepSeek-R1
Over 3 times the pass rate of Llama 4 Scout
Abstract
The rise of Large Reasoning Models (LRMs) promises a significant leap forward in language model capabilities, aiming to tackle increasingly sophisticated tasks with unprecedented efficiency and accuracy. However, despite their impressive performance, recent studies have highlighted how current reasoning models frequently fail to generalize to novel, unseen problems, often resorting to memorized solutions rather than genuine inferential reasoning. Such behavior underscores a critical limitation in modern LRMs, i.e., their tendency toward overfitting, which in turn results in poor generalization in problem-solving capabilities. In this paper, we introduce Nexus Architect, an enhanced iteration of our multi-agent system framework, Nexus, equipped with a novel automated workflow synthesis mechanism. Given a user's prompt and a small set of representative examples, the Architect…
