Team of Thoughts: Efficient Test-time Scaling of Agentic Systems through Orchestrated Tool Calling
Jeffrey T. H. Wong, Zixi Zhang, Junyi Liu, Yiren Zhao

TL;DR
Team-of-Thoughts introduces a heterogeneous multi-agent framework that dynamically orchestrates specialized models during inference, significantly enhancing performance on reasoning and code generation tasks.
Contribution
The paper presents an orchestrator-driven multi-agent system (MAS) framework with agent self-assessment and orchestrator calibration components, which together let it draw on the differing strengths of diverse post-trained models.
Findings
Achieves 96.00% accuracy on the AIME24 benchmark.
Attains 77.91% accuracy on the LiveCodeBench benchmark.
Outperforms homogeneous baselines and existing MAS methods.
Abstract
Existing Multi-Agent Systems (MAS) typically rely on homogeneous model configurations, failing to exploit the diverse expertise inherent in different post-trained architectures. We propose Team-of-Thoughts, a heterogeneous MAS framework that treats diverse models as specialized tools within an orchestrator-driven paradigm. Team-of-Thoughts introduces two novel components: (1) Orchestrator Calibration, which identifies models with superior coordination and synthesis capabilities, and (2) Agent Self-Assessment, a protocol where tool agents profile their own domain-specific strengths to guide selection. At inference, the orchestrator dynamically activates the most compatible agents based on these profiles to maximize capability coverage. Across five mathematical reasoning and code generation benchmarks, Team-of-Thoughts consistently outperforms individual models and existing MAS baselines.…
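The abstract does not spell out how the orchestrator chooses agents from their self-assessment profiles, so the following is only a minimal sketch of one plausible reading: a greedy rule that adds whichever agent most improves coverage of the task's domains. The profile format, the scores, the `budget` parameter, and the names `select_agents`, `model_a`, `model_b`, and `model_c` are all hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical self-assessment profiles:
# agent name -> {domain: self-rated strength in [0, 1]}.
Profiles = Dict[str, Dict[str, float]]


def select_agents(profiles: Profiles, task_domains: List[str], budget: int) -> List[str]:
    """Greedily pick up to `budget` agents to maximize capability coverage.

    Coverage of a domain is taken to be the best self-rated score among the
    agents selected so far (an assumption, not the paper's definition).
    """
    selected: List[str] = []
    best = {d: 0.0 for d in task_domains}

    def gain(name: str) -> float:
        # Total improvement in per-domain coverage if `name` were added.
        return sum(max(profiles[name].get(d, 0.0) - best[d], 0.0) for d in task_domains)

    for _ in range(budget):
        candidates = [n for n in profiles if n not in selected]
        if not candidates:
            break
        top = max(candidates, key=gain)
        if gain(top) <= 0.0:
            break  # no remaining agent adds coverage
        selected.append(top)
        for d in task_domains:
            best[d] = max(best[d], profiles[top].get(d, 0.0))
    return selected


# Illustrative profiles only; the numbers are made up.
example_profiles: Profiles = {
    "model_a": {"math": 0.9, "code": 0.3},
    "model_b": {"math": 0.5, "code": 0.8},
    "model_c": {"math": 0.6, "code": 0.5},
}
chosen = select_agents(example_profiles, ["math", "code"], budget=2)
print(chosen)
```

In this toy setup the agent with the strongest combined profile is picked first, and the second pick is the one that fills the largest remaining gap. A real orchestrator would presumably make this choice with a language model rather than a fixed scoring rule.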
Taxonomy
Topics: Artificial Intelligence in Games · Advanced Software Engineering Methodologies · Multimodal Machine Learning Applications
