Training-Free Agentic AI: Probabilistic Control and Coordination in Multi-Agent LLM Systems
Mohammad Parsa Hosseini, Ankit Shah, Saiyra Qureshi, Alex Huang, Connie Miao, Wei Wei

TL;DR
This paper presents REDEREF, a training-free probabilistic controller for multi-agent LLM systems that improves routing efficiency and robustness and reduces resource usage through belief-guided delegation and reflection-driven re-routing.
Contribution
The paper introduces REDEREF, a novel, training-free control method that improves multi-agent LLM collaboration by integrating probabilistic routing, reflection, and memory-aware priors.
Findings
Reduces token usage by 28%
Decreases agent calls by 17%
Cuts time-to-success by 19%
Abstract
Multi-agent large language model (LLM) systems enable complex, long-horizon reasoning by composing specialized agents, but practical deployment remains hindered by inefficient routing, noisy feedback, and high interaction cost. We introduce REDEREF, a lightweight and training-free controller for multi-agent LLM collaboration that improves routing efficiency during recursive delegation. REDEREF integrates (i) belief-guided delegation via Thompson sampling to prioritize agents with historically positive marginal contributions, (ii) reflection-driven re-routing using a calibrated LLM or programmatic judge, (iii) evidence-based selection rather than output averaging, and (iv) memory-aware priors to reduce cold-start inefficiency. Across multi-agent split-knowledge tasks, we show that while recursive retry alone saturates task success, belief-guided routing reduces token usage by 28%, agent…
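The routing loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Beta-Bernoulli belief per agent, a binary judge verdict as the reward signal, and prior pseudo-counts standing in for memory-aware priors. The names AgentBelief, ThompsonRouter, route, and the toy agents are hypothetical and do not come from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class AgentBelief:
    """Beta(alpha, beta) belief that an agent's output will pass the judge."""
    alpha: float = 1.0
    beta: float = 1.0

    def sample(self, rng: random.Random) -> float:
        # One Thompson draw from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, success: bool) -> None:
        # Bernoulli reward: a success raises alpha, a failure raises beta.
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


class ThompsonRouter:
    """Hypothetical controller: sample beliefs, delegate, judge, re-route."""

    def __init__(self, agents, judge, priors=None, seed=0):
        self.agents = agents          # name -> callable(task) -> output
        self.judge = judge            # callable(task, output) -> bool
        self.rng = random.Random(seed)
        priors = priors or {}         # assumed stand-in for memory-aware priors
        self.beliefs = {
            name: AgentBelief(*priors.get(name, (1.0, 1.0))) for name in agents
        }

    def pick(self, exclude=()):
        # Choose the untried agent with the highest sampled success rate.
        candidates = [name for name in self.agents if name not in exclude]
        if not candidates:
            return None
        return max(candidates, key=lambda n: self.beliefs[n].sample(self.rng))

    def route(self, task, max_calls=4):
        tried = []
        for _ in range(max_calls):
            name = self.pick(exclude=tried)
            if name is None:
                break
            output = self.agents[name](task)
            accepted = self.judge(task, output)   # reflection step
            self.beliefs[name].update(accepted)
            tried.append(name)
            if accepted:
                # Return the single accepted output; nothing is averaged.
                return output, tried
        return None, tried


# Toy usage: one agent always satisfies the judge, the other never does.
agents = {
    "solver": lambda task: task.upper(),
    "distractor": lambda task: "",
}
judge = lambda task, output: output == task.upper()

router = ThompsonRouter(agents, judge, seed=0)
for _ in range(20):
    answer, tried = router.route("hello")
```

In this toy setup the belief for the failing agent decays after each rejected call, so later tasks tend to reach the reliable agent first. A warm start such as priors={"solver": (3.0, 1.0)} plays the role of a prior carried over from memory, under the assumptions stated above.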
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Speech and dialogue systems
