AERO: Autonomous Evolutionary Reasoning Optimization via Endogenous Dual-Loop Feedback
Zhitao Gao, Jie Ma, Xuhong Li, Pengyu Li, Ning Qu, Yaqiang Wu, Hui Liu, Jun Liu

TL;DR
AERO is an unsupervised framework that improves large language model reasoning by internalizing self-questioning, answering, and criticism in a dual-loop system inspired by the Zone of Proximal Development (ZPD) theory, yielding gains across multiple benchmarks.
Contribution
It introduces AERO, a novel autonomous reasoning optimization method with entropy-based positioning, counterfactual correction, and staggered training to improve LLM reasoning without external supervision.
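As a rough illustration of what entropy-based positioning could look like, the sketch below scores each question by the Shannon entropy of the model's sampled answers and keeps only questions in a middle band, on the idea that these are neither trivially solved nor hopelessly out of reach. This is an assumption-laden sketch, not the authors' method: the summary does not specify how AERO defines or thresholds entropy, and the names `answer_entropy`, `select_in_band`, `low`, and `high`, along with the sample data, are hypothetical.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in bits) of the empirical distribution over sampled answers."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_in_band(samples, low, high):
    """Keep questions whose answer entropy lies in [low, high] (hypothetical thresholds)."""
    return [q for q, answers in samples.items()
            if low <= answer_entropy(answers) <= high]


# Hypothetical data: four sampled final answers per question.
samples = {
    "q_easy": ["4", "4", "4", "4"],   # full agreement, entropy 0
    "q_mid": ["7", "7", "9", "7"],    # partial agreement, intermediate entropy
    "q_hard": ["1", "2", "3", "5"],   # no agreement, maximal entropy for 4 samples
}
selected = select_in_band(samples, low=0.5, high=1.5)
```

With these illustrative thresholds only the partially agreed question is retained; in practice the band would need tuning to the model and the sampling budget.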
Findings
Achieves average performance improvements of 4.57% and 5.10% on two benchmarks.
Outperforms competitive baselines across nine diverse benchmarks.
Demonstrates effective autonomous reasoning evolution in LLMs.
Abstract
Large Language Models (LLMs) have achieved significant success in complex reasoning but remain bottlenecked by reliance on expert-annotated data and external verifiers. While existing self-evolution paradigms aim to bypass these constraints, they often fail to identify the optimal learning zone and risk reinforcing collective hallucinations and incorrect priors through flawed internal feedback. To address these challenges, we propose Autonomous Evolutionary Reasoning Optimization (AERO), an unsupervised framework that achieves autonomous reasoning evolution by internalizing self-questioning, answering, and criticism within a synergistic dual-loop system. Inspired by the Zone of Proximal Development (ZPD) theory, AERO utilizes entropy-based positioning to target the "solvability gap" and employs Independent Counterfactual…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Machine Learning in Materials Science
