Group Pattern Selection Optimization: Let LRMs Pick the Right Pattern for Reasoning

Hanbin Wang; Jingwei Song; Jinpeng Li; Fei Mi; Lifeng Shang

arXiv:2601.07238·cs.AI·January 13, 2026

TL;DR

This paper introduces GPSO, a reinforcement learning framework that enables large reasoning models to select the most effective reasoning pattern for each problem, significantly improving their accuracy and robustness.

Contribution

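The paper proposes GPSO, which extends GRPO with multi-pattern rollouts, verifier-guided selection of the best pattern for each problem, and attention masking of explicit pattern suffixes, so that models learn to choose an effective reasoning strategy instead of defaulting to the few dominant patterns that prevailing training recipes favor.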

Findings

01

GPSO improves performance across multiple benchmarks.

02

Models with GPSO show reduced pattern bias and increased adaptability.

03

Significant accuracy gains are observed across various model sizes.

Abstract

Large reasoning models (LRMs) exhibit diverse high-level reasoning patterns (e.g., direct solution, reflection-and-verification, and exploring multiple solutions), yet prevailing training recipes implicitly bias models toward a limited set of dominant patterns. Through a systematic analysis, we identify substantial accuracy variance across these patterns on mathematics and science benchmarks, revealing that a model's default reasoning pattern is often sub-optimal for a given problem. To address this, we introduce Group Pattern Selection Optimization (GPSO), a reinforcement learning framework that extends GRPO by incorporating multi-pattern rollouts, verifier-guided optimal pattern selection per problem, and attention masking during optimization to prevent the leakage of explicit pattern suffixes into the learned policy. By exploring a portfolio of diverse reasoning strategies and…
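The three ingredients named in the abstract (multi-pattern rollouts, verifier-guided pattern selection, and masking of pattern suffixes) can be illustrated with a minimal sketch. It is not the authors' implementation: the pattern names, the reward values, the rule "pick the pattern with the highest mean verifier reward", and the helpers `select_best_pattern`, `group_relative_advantages`, and `suffix_mask` are all assumptions made for illustration. The abstract does not specify how the mask is applied during optimization, so the mask here simply marks suffix positions as hidden.

```python
from statistics import mean, pstdev


def select_best_pattern(rewards_by_pattern):
    """Return the pattern whose rollouts have the highest mean verifier reward.

    Assumption: selection by mean reward; the source only says selection is
    verifier-guided and per problem.
    """
    return max(rewards_by_pattern, key=lambda p: mean(rewards_by_pattern[p]))


def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantages: standardize rewards within one group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def suffix_mask(prompt_len, suffix_len, response_len):
    """Build a per-token mask: 1 = visible, 0 = hidden pattern-suffix position.

    Assumed layout: [prompt tokens][pattern suffix tokens][response tokens].
    """
    return [1] * prompt_len + [0] * suffix_len + [1] * response_len


# Hypothetical verifier rewards (1.0 = correct) for four rollouts per pattern
# on a single problem.
rewards_by_pattern = {
    "direct": [1.0, 0.0, 0.0, 0.0],
    "reflect_and_verify": [1.0, 1.0, 0.0, 1.0],
    "multiple_solutions": [0.0, 1.0, 0.0, 0.0],
}

best = select_best_pattern(rewards_by_pattern)
advantages = group_relative_advantages(rewards_by_pattern[best])
mask = suffix_mask(prompt_len=5, suffix_len=3, response_len=4)
```

In this toy setup the selected pattern's rollouts supply the training signal, and the mask keeps the explicit suffix tokens out of what the policy conditions on, which is one plausible reading of how suffix leakage could be prevented.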

Taxonomy

Topics: Constraint Satisfaction and Optimization · Advanced Graph Neural Networks · Topic Modeling