DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving

Yuxuan Tong; Xiwen Zhang; Rui Wang; Ruidong Wu; Junxian He

arXiv:2407.13690 · cs.CL · December 24, 2024

TL;DR

DART-Math introduces a difficulty-aware rejection tuning method that enhances mathematical problem-solving models by focusing on challenging queries, resulting in superior performance using smaller, publicly available datasets and models.

Contribution

The paper presents DART, a novel method that prioritizes difficult queries during data synthesis, creating smaller, more effective datasets for training mathematical reasoning models without proprietary data.

Findings

1. DART-Math outperforms previous methods on 6 benchmarks.

2. Models trained with DART datasets outperform those trained with larger, less focused datasets.

3. Synthetic datasets created with DART are the most cost-effective publicly available resources.

Abstract

Solving mathematical problems requires advanced reasoning abilities and presents notable challenges for large language models. Previous works usually synthesize data from proprietary models to augment existing datasets, followed by instruction tuning to achieve top-tier results. However, our analysis of these datasets reveals severe biases towards easy queries, with frequent failures to generate any correct response for the most challenging queries. Hypothesizing that difficult queries are crucial to learn complex reasoning, we propose Difficulty-Aware Rejection Tuning (DART), a method that allocates difficult queries more trials during the synthesis phase, enabling more extensive training on difficult samples. Utilizing DART, we have created new datasets for mathematical problem-solving that focus more on difficult queries and are substantially smaller than previous ones. Remarkably,…
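
The core idea in the abstract, giving difficult queries more sampling trials during synthesis, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `target_correct` and `max_trials` parameters, and the toy sampler standing in for a language model are all assumptions made for illustration. The sketch keeps sampling each query until a fixed number of correct responses has been collected, so queries that are solved less often naturally consume more trials. Correctness is checked here by exact match against a reference answer; a real pipeline would more likely extract and compare the final answer from a full solution.

```python
from typing import Callable, Dict, List, Tuple


def difficulty_aware_synthesis(
    queries: Dict[str, str],
    sample: Callable[[str], str],
    target_correct: int = 2,
    max_trials: int = 50,
) -> Dict[str, Tuple[List[str], int]]:
    """Sample each query until `target_correct` correct responses are kept.

    Hypothetical sketch: `queries` maps a query to its reference answer and
    `sample` stands in for one model generation. Returns, per query, the
    kept responses and the number of trials spent.
    """
    results: Dict[str, Tuple[List[str], int]] = {}
    for query, reference in queries.items():
        kept: List[str] = []
        trials = 0
        # Harder queries fail more often, so this loop runs longer for them.
        while len(kept) < target_correct and trials < max_trials:
            response = sample(query)
            trials += 1
            if response == reference:  # rejection step: keep only correct responses
                kept.append(response)
        results[query] = (kept, trials)
    return results


def make_toy_sampler(
    period: Dict[str, int], answers: Dict[str, str]
) -> Callable[[str], str]:
    """Deterministic stand-in for a model (an assumption for this demo):
    query q is answered correctly on every period[q]-th call."""
    calls = {q: 0 for q in period}

    def sample(query: str) -> str:
        calls[query] += 1
        return answers[query] if calls[query] % period[query] == 0 else "wrong"

    return sample


queries = {"1+1": "2", "hard integral": "pi/4"}
sampler = make_toy_sampler({"1+1": 1, "hard integral": 5}, queries)
out = difficulty_aware_synthesis(queries, sampler, target_correct=2, max_trials=50)
for q, (kept, trials) in out.items():
    print(q, len(kept), trials)
```

In this toy run both queries end up with the same number of kept responses, but the hard query spends five times as many trials to get there. A vanilla scheme with a fixed trial budget per query would instead leave the hard query under-represented or absent.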

Code & Models

Repositories

hkust-nlp/dart-math (PyTorch, official)

Videos

DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving (SlidesLive)

Taxonomy

Topics: Parallel Computing and Optimization Techniques

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections