Data Diversification Methods In Alignment Enhance Math Performance In LLMs

Berkan Dokmeci; Qingyang Wu; Ben Athiwaratkun; Ce Zhang; Shuaiwen Leon Song; James Zou

arXiv:2507.02173·cs.AI·July 4, 2025

TL;DR

This paper demonstrates that data diversification strategies, especially the novel Diversified-ThinkSolve method, significantly improve the mathematical reasoning abilities of large language models with minimal additional computational cost.

Contribution

The paper introduces Diversified-ThinkSolve, a new structured approach for data diversification that enhances mathematical reasoning in LLMs more effectively than traditional methods.

Findings

01. DTS improves accuracy over the base model by 7.1% on GSM8K and 4.2% on MATH.

02. DTS incurs only 1.03x computational overhead.

03. MCTS is more costly and yields a smaller performance gain.

Abstract

While recent advances in preference learning have enhanced alignment in human feedback, mathematical reasoning remains a persistent challenge. We investigate how data diversification strategies in preference optimization can improve the mathematical reasoning abilities of large language models (LLMs). We evaluate three common data generation methods: temperature sampling, Chain-of-Thought prompting, and Monte Carlo Tree Search (MCTS), and introduce Diversified-ThinkSolve (DTS), a novel structured approach that systematically decomposes problems into diverse reasoning paths. Our results show that with strategically diversified preference data, models can substantially improve mathematical reasoning performance, with the best approach yielding gains of 7.1% on GSM8K and 4.2% on MATH over the base model. Despite its strong performance, DTS incurs only a marginal computational overhead…
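
As a rough illustration of what "diversified preference data" can look like in practice, the sketch below turns several sampled solutions to one problem into chosen/rejected pairs for preference optimization. It is not the paper's pipeline: the `Candidate` type, the `build_preference_pairs` helper, and the rule of labeling a solution "chosen" when its final answer matches a reference answer are all assumptions made for illustration. The candidates themselves could come from any of the generation methods named above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    """One sampled solution: its reasoning text and the final answer it reached."""
    reasoning: str
    answer: str


def build_preference_pairs(prompt, candidates, reference_answer, max_pairs=4):
    """Pair answer-correct candidates (chosen) with answer-incorrect ones (rejected).

    Assumption: correctness is judged by exact match against a reference answer.
    This labeling rule is hypothetical and not taken from the paper.
    """
    target = reference_answer.strip()
    correct = [c for c in candidates if c.answer.strip() == target]
    wrong = [c for c in candidates if c.answer.strip() != target]

    pairs = []
    for chosen, rejected in product(correct, wrong):
        if len(pairs) >= max_pairs:
            break
        pairs.append(
            {"prompt": prompt, "chosen": chosen.reasoning, "rejected": rejected.reasoning}
        )
    return pairs


if __name__ == "__main__":
    samples = [
        Candidate("3 boxes of 4 apples: 3 * 4 = 12.", "12"),
        Candidate("Count by fours: 4, 8, 12.", "12"),
        Candidate("Add the numbers: 3 + 4 = 7.", "7"),
    ]
    for pair in build_preference_pairs("How many apples are in 3 boxes of 4?", samples, "12"):
        print(pair["chosen"], "|", pair["rejected"])
```

A prompt whose candidates are all correct or all incorrect produces no pairs under this rule, which is one reason varied reasoning paths matter when building such a dataset.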
