What Makes Reasoning Invalid: Echo Reflection Mitigation for Large Language Models
Chen He, Xun Jiang, Lei Wang, Hao Yang, Chong Peng, Peng Yan, Fumin Shen, Xing Xu

TL;DR
This paper identifies the 'Echo Reflection' problem in large language models during complex reasoning tasks and introduces AEPO, a reinforcement learning method, to mitigate this issue and improve reasoning performance.
Contribution
The paper uncovers the causes of Echo Reflection in LLMs and proposes AEPO, a novel reinforcement learning framework with reflection-aware filtration and adaptive entropy optimization.
Findings
AEPO outperforms existing reinforcement learning methods on multiple benchmarks.
Echo Reflection causes LLMs to reiterate earlier reasoning steps during reflection without generating new insights.
AEPO effectively reduces echo reflection and enhances reasoning accuracy.
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of reasoning tasks. Recent methods have further improved LLM performance in complex mathematical reasoning. However, when extending these methods beyond the domain of mathematical reasoning to tasks involving complex domain-specific knowledge, we observe a consistent failure of LLMs to generate novel insights during the reflection stage. Instead of conducting genuine cognitive refinement, the model tends to mechanically reiterate earlier reasoning steps without introducing new information or perspectives, a phenomenon referred to as "Echo Reflection". We attribute this behavior to two key defects: (1) Uncontrollable information flow during response generation, which allows premature intermediate thoughts to propagate unchecked and distort final decisions; (2) Insufficient exploration of internal…
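To make the idea of "mechanically reiterating earlier reasoning steps" concrete, the sketch below scores how much of a reflection segment repeats word n-grams already present in the preceding reasoning. This is purely an illustrative assumption, not the paper's reflection-aware filtration or any part of AEPO: the function names `ngrams` and `echo_ratio`, the whitespace tokenization, and the default n-gram size of 3 are all hypothetical choices made for this example.

```python
def ngrams(tokens, n):
    """Return the set of distinct n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def echo_ratio(earlier, reflection, n=3):
    """Fraction of the reflection's distinct n-grams that already occur
    in the earlier reasoning text.

    Hypothetical proxy for 'echo': 1.0 means every n-gram in the
    reflection was already present earlier, 0.0 means none was.
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    earlier_grams = ngrams(earlier.lower().split(), n)
    reflection_grams = ngrams(reflection.lower().split(), n)
    if not reflection_grams:
        # Reflection shorter than n tokens: nothing to compare.
        return 0.0
    return len(reflection_grams & earlier_grams) / len(reflection_grams)


if __name__ == "__main__":
    reasoning = "the drug inhibits enzyme X so dose must drop"
    reflection = "wait the drug inhibits enzyme X so dose must drop"
    print(echo_ratio(reasoning, reflection))
```

A high ratio would only suggest surface-level repetition. A reflection can rephrase the same content in new words and still add nothing, so a lexical score like this is at best a rough first filter.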
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Advanced Graph Neural Networks
