Swarm Intelligence Enhanced Reasoning: A Density-Driven Framework for LLM-Based Multi-Agent Optimization

Ying Zhu; Heng Zhou; Rui Su; Peiqin Zhuang; Lei Bai

arXiv:2505.17115·cs.MA·June 2, 2025

TL;DR

This paper introduces a novel framework that integrates swarm intelligence with large language models to improve multi-agent reasoning by optimizing solution quality and diversity through density-driven strategies.

Contribution

It proposes the Agent-based Swarm Intelligence paradigm and the SIER framework, combining kernel density estimation and non-dominated sorting to enhance reasoning in LLMs.
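The pairing of kernel density estimation with non-dominated sorting can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the 1-D `embeddings`, the `quality` scores, the `bandwidth` value, the function names, and the choice to treat low density as the diversity objective are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_kde_density(points: Sequence[Sequence[float]],
                         bandwidth: float = 1.0) -> List[float]:
    """Gaussian kernel density estimate evaluated at each of the given points."""
    n = len(points)
    d = len(points[0])
    norm = n * (bandwidth * math.sqrt(2 * math.pi)) ** d
    densities = []
    for p in points:
        total = 0.0
        for q in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(p, q))
            total += math.exp(-sq_dist / (2 * bandwidth ** 2))
        densities.append(total / norm)
    return densities


def non_dominated_fronts(objectives: Sequence[Tuple[float, ...]]) -> List[List[int]]:
    """Group indices into successive Pareto fronts; every objective is maximized."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical candidates: a 1-D embedding and a quality score per solution.
embeddings = [(0.0,), (0.3,), (1.0,), (5.0,)]
quality = [0.9, 0.8, 0.5, 0.6]

density = gaussian_kde_density(embeddings, bandwidth=0.5)
# Assumption: prefer high quality and low density (sparsely explored regions).
objectives = [(q, -d) for q, d in zip(quality, density)]
fronts = non_dominated_fronts(objectives)
print(fronts)
```

In this toy setup the first front holds the highest-quality candidate and the most isolated one, so neither objective alone decides which candidates survive.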

Findings

01 · Improved solution quality and diversity in reasoning tasks.

02 · Enhanced ability to escape local optima during problem-solving.

03 · Demonstrated effectiveness on complex reasoning benchmarks.

Abstract

Recently, many approaches, such as Chain-of-Thought (CoT) prompting and Multi-Agent Debate (MAD), have been proposed to further enrich Large Language Models' (LLMs) complex problem-solving capacities in reasoning scenarios. However, these methods may fail to solve complex problems due to the lack of ability to find optimal solutions. Swarm Intelligence has been serving as a powerful tool for finding optima in the field of traditional optimization problems. To this end, we propose integrating swarm intelligence into the reasoning process by introducing a novel Agent-based Swarm Intelligence (ASI) paradigm. In this paradigm, we formulate LLM reasoning as an optimization problem and use a swarm intelligence scheme to guide a group of LLM-based agents in collaboratively searching for optimal solutions. To avoid swarm intelligence getting trapped in local optima, we further develop a Swarm…

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 2 · Confidence 4

Strengths

The paper demonstrates sound methodology through its novel SIER approach, which advances test-time scaling by introducing sample diversity mechanisms and multi-dimensional evaluation criteria beyond traditional methods like MAD and CoT. The authors provide convincing initial evidence of SIER's superiority in mathematical reasoning tasks, supported by a clearly articulated algorithm that details how diverse sampling and comprehensive evaluation work together to improve reasoning outputs. While th

Weaknesses

I believe that the SIER method has strong potential, and it was compelling to see that SIER had superior performance across several mathematical reasoning benchmarks. But there is not enough evidence to make strong claims yet. The only policy-reward model combination evaluated was Qwen2.5-7B-instruct with Qwen2.5-Math-PRM-72B. It's also unclear what models were used for the RGS and CoT methods that SIER is compared against (Table 1). From the wording of the paper I'm assuming it was done wi

Reviewer 02 · Rating 4 · Confidence 3

Strengths

1. The use of kernel density estimation and non-dominated sorting ensures that the exploration of the solution space is both diverse and of high quality, avoiding the pitfalls of convergence to local optima.
2. The framework is extensively tested on challenging benchmarks like AIME, MATH-500, and GSM8K, with significant improvements over traditional methods, particularly for more difficult problems.
3. The dynamic control of the exploration process through quality thresholds and flexible termina

Weaknesses

1. The framework requires higher computational resources, especially when dealing with more complex problems (e.g., MATH-500), as it involves more extensive exploration of the solution space. The increased token usage could be a limitation for large-scale applications.
2. The effectiveness of the framework relies heavily on the quality of the Process Reward Model (PRM). If the evaluator is biased or inaccurate, it may still lead to suboptimal solutions, especially in cases where the PRM is unabl

Reviewer 03 · Rating 2 · Confidence 3

Strengths

The idea seems novel: reframe LLM reasoning as a swarm intelligence-type optimization problem, then use the methods available for that kind of problem. Kernel density estimation is a powerful way to balance exploration vs exploitation in a clear manner, which is a major weakness for other more experimental approaches. Their methodology is well-defined and powerful, with a lot of mathematical grounding. They also used a good number of reasoning benchmarks to evaluate on, including ones that are

Weaknesses

The most significant drawback is the computational inefficiency. A 5x token cost on complex datasets unfortunately outweighs the contribution this paper would otherwise make. Scalable, practical deployment is just not feasible with such a 5x factor, or at least, it would need to be more strongly argued for. Likewise, it would have to be shown whether such a factor is limited to math reasoning, and how well (and with what inefficiency factor) the framework works on more general s
