From Thinking to Output: Chain-of-Thought and Text Generation Characteristics in Reasoning Language Models

Junhao Liu; Zhenhao Xu; Yuxin Fang; Yichuan Chen; Zuobin Ying; Wenhan Chang

arXiv:2506.21609 · cs.CL · June 30, 2025

Open Access

TL;DR

This paper systematically analyzes the reasoning processes of four large reasoning models (GPT-o1, DeepSeek-R1, Kimi-k1.5, and Grok-3). Using a novel framework and a diverse dataset, it reveals patterns in their thinking, the coherence of their outputs, and trade-offs between efficiency and robustness.

Contribution

The paper introduces a new framework for analyzing the reasoning characteristics of large models, connects their internal thinking with their final outputs, and provides practical insights for model improvement.

Findings

01. Models differ in reasoning depth and in how much they rely on intermediate steps.

02. Patterns of exploration and exploitation vary across models.

03. The analysis offers insights into balancing efficiency against reasoning robustness.

Abstract

Recently, there have been notable advancements in large language models (LLMs), demonstrating their growing abilities in complex reasoning. However, existing research largely overlooks a thorough and systematic comparison of these models' reasoning processes and outputs, particularly regarding their self-reflection patterns (also termed the "Aha moment") and the interconnections across diverse domains. This paper proposes a novel framework for analyzing the reasoning characteristics of four cutting-edge large reasoning models (GPT-o1, DeepSeek-R1, Kimi-k1.5, and Grok-3) using keyword statistics and the LLM-as-a-judge paradigm. Our approach connects their internal thinking processes with their final outputs. The dataset is diverse, consisting of real-world scenario-based questions covering logical deduction, causal inference, and multi-step problem-solving. Additionally, a set of metrics is put forward…
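To make the "keyword statistics" idea concrete, here is a minimal sketch of how self-reflection markers might be counted in a model's reasoning trace. This is an illustration only, not the paper's method: the marker list `REFLECTION_MARKERS`, the function names, and the per-100-words normalization are all assumptions, since the abstract does not specify the keywords or metrics used.

```python
import re
from collections import Counter

# Hypothetical marker list for illustration; the paper's actual
# keyword set is not given in the abstract.
REFLECTION_MARKERS = [
    "wait",
    "hmm",
    "let me double-check",
    "on second thought",
    "alternatively",
]


def count_reflection_markers(trace, markers=REFLECTION_MARKERS):
    """Count case-insensitive, whole-phrase occurrences of each marker."""
    lowered = trace.lower()
    counts = Counter()
    for marker in markers:
        # \b keeps "wait" from matching inside words such as "awaiting".
        pattern = r"\b" + re.escape(marker) + r"\b"
        counts[marker] = len(re.findall(pattern, lowered))
    return counts


def markers_per_100_words(trace, markers=REFLECTION_MARKERS):
    """Normalize the total marker count by trace length (assumed metric)."""
    words = trace.split()
    if not words:
        return 0.0
    total = sum(count_reflection_markers(trace, markers).values())
    return 100.0 * total / len(words)


if __name__ == "__main__":
    sample = "Wait, that is wrong. Hmm, let me double-check. Wait."
    print(count_reflection_markers(sample))
    print(markers_per_100_words(sample))
```

Normalizing by trace length is one plausible way to compare models whose thinking sections differ greatly in size. A raw count alone would favor more verbose models.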

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)