AERO: Autonomous Evolutionary Reasoning Optimization via Endogenous Dual-Loop Feedback

Zhitao Gao; Jie Ma; Xuhong Li; Pengyu Li; Ning Qu; Yaqiang Wu; Hui Liu; Jun Liu

arXiv:2602.03084·cs.CL·February 5, 2026

TL;DR

AERO is an unsupervised framework that improves large language model reasoning by internalizing self-questioning, answering, and criticism in a dual-loop system inspired by Zone of Proximal Development (ZPD) theory, yielding gains across multiple benchmarks.

Contribution

The paper introduces AERO, an autonomous reasoning optimization method that combines entropy-based positioning, counterfactual correction, and staggered training to improve LLM reasoning without external supervision.

Findings

01 Achieves average performance improvements of 4.57% and 5.10% on two benchmarks.

02 Outperforms competitive baselines across nine diverse benchmarks.

03 Demonstrates effective autonomous reasoning evolution in LLMs.

Abstract

Large Language Models (LLMs) have achieved significant success in complex reasoning but remain bottlenecked by reliance on expert-annotated data and external verifiers. While existing self-evolution paradigms aim to bypass these constraints, they often fail to identify the optimal learning zone and risk reinforcing collective hallucinations and incorrect priors through flawed internal feedback. To address these challenges, we propose Autonomous Evolutionary Reasoning Optimization (AERO), an unsupervised framework that achieves autonomous reasoning evolution by internalizing self-questioning, answering, and criticism within a synergistic dual-loop system. Inspired by the Zone of Proximal Development (ZPD) theory, AERO utilizes entropy-based positioning to target the "solvability gap" and employs Independent Counterfactual…
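The abstract names entropy-based positioning but, as excerpted here, does not say how the entropy is computed. As a rough illustration only, the sketch below assumes one plausible reading: sample several answers to a self-generated question, measure the Shannon entropy of the answer distribution, and keep questions whose entropy falls in a middle band (neither trivially easy nor hopelessly inconsistent). The function names and the `low`/`high` thresholds are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def answer_entropy(answers):
    """Shannon entropy (in bits) of the empirical distribution of sampled answers.

    Assumption: answers are compared as exact strings; the paper may use a
    different notion of agreement.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    total = len(answers)
    counts = Counter(answers)
    # sum of p * log2(1/p) over the distinct answers
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def in_learning_zone(answers, low=0.5, high=1.5):
    """Return True if the answer entropy lies in the hypothetical band [low, high].

    Low entropy suggests the question is already mastered; high entropy
    suggests it is currently out of reach. Both thresholds are illustrative.
    """
    return low <= answer_entropy(answers) <= high


if __name__ == "__main__":
    consistent = ["4", "4", "4", "4"]   # model always agrees with itself
    split = ["a", "b", "a", "b"]        # two competing answers
    scattered = ["a", "b", "c", "d"]    # no agreement at all
    for name, sample in [("consistent", consistent), ("split", split), ("scattered", scattered)]:
        print(name, answer_entropy(sample), in_learning_zone(sample))
```

Under these assumed thresholds, only the evenly split sample would be kept as a training target; the fully consistent and fully scattered samples would be filtered out.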

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Machine Learning in Materials Science