Characterizing, Evaluating, and Optimizing Complex Reasoning

Haoran Zhang; Yafu Li; Zhi Wang; Zhilin Wang; Shunkai Zhang; Xiaoye Qu; Yu Cheng

arXiv:2602.08498·cs.CL·February 10, 2026

TL;DR

This paper introduces a unified framework for defining, evaluating, and optimizing complex reasoning in large models, using DAG-based evaluation and a learned reward model to improve reasoning quality and task performance.

Contribution

It proposes the ME$^2$ principle for reasoning quality, models reasoning traces as DAGs, and develops a TRM-Preference dataset and Thinking Reward Model for scalable evaluation and optimization.

Findings

1. Thinking rewards improve reasoning outcomes by up to 19.3%.

2. Better reasoning selection enhances performance by up to 3.9%.

3. DAG-based evaluation effectively captures complex reasoning structures.

Abstract

Large Reasoning Models (LRMs) increasingly rely on reasoning traces with complex internal structures. However, existing work lacks a unified answer to three fundamental questions: (1) what defines high-quality reasoning, (2) how to reliably evaluate long, implicitly structured reasoning traces, and (3) how to use such evaluation signals for reasoning optimization. To address these challenges, we provide a unified perspective. (1) We introduce the ME$^2$ principle to characterize reasoning quality at the macro and micro levels in terms of efficiency and effectiveness. (2) Built on this principle, we model reasoning traces as directed acyclic graphs (DAGs) and develop a DAG-based pairwise evaluation method that captures complex reasoning structures. (3) Based on this method, we construct the TRM-Preference dataset and train a Thinking Reward Model (TRM) to evaluate reasoning quality at scale.…
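To make the idea of treating a reasoning trace as a DAG concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's evaluation method: the step names, the `TRACE` example, and the `contribution_ratio` score are all hypothetical. Each step lists the earlier steps it depends on, and the score is simply the fraction of steps that the final answer transitively relies on, a rough stand-in for an efficiency signal.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical trace: each step maps to the steps it depends on.
# "s3" is an exploration that never feeds into the answer.
TRACE = {
    "s1": [],
    "s2": ["s1"],
    "s3": ["s1"],
    "s4": ["s2"],
    "answer": ["s4"],
}


def is_acyclic(deps):
    """Return True if the dependency mapping forms a valid DAG."""
    try:
        # static_order() raises CycleError lazily, so force iteration.
        tuple(TopologicalSorter(deps).static_order())
        return True
    except CycleError:
        return False


def contributing_steps(deps, answer):
    """Collect every step the answer transitively depends on (itself included)."""
    seen = set()
    stack = [answer]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(deps.get(node, ()))
    return seen


def contribution_ratio(deps, answer):
    """Illustrative score (an assumption, not the paper's metric):
    the share of steps that lie on some path to the answer."""
    return len(contributing_steps(deps, answer)) / len(deps)


if __name__ == "__main__":
    print(is_acyclic(TRACE))
    print(sorted(contributing_steps(TRACE, "answer")))
    print(contribution_ratio(TRACE, "answer"))
```

In a pairwise setting, two traces for the same problem could be compared on a score like this one, although the paper's actual comparison criteria are not given in the abstract and may differ substantially.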

Taxonomy

Topics: AI-based Problem Solving and Planning · Explainable Artificial Intelligence (XAI) · Advanced Graph Neural Networks