GRIP: A Graph-Based Reasoning Instruction Producer
Jiankang Wang, Jianjun Xu, Xiaorui Wang, Yuxin Wang, Mengting Xing, Shancheng Fang, Hongtao Xie

TL;DR
GRIP is a graph-based method for efficiently generating diverse, high-quality reasoning instructions, significantly enhancing large language models' reasoning abilities with scalable synthetic data.
Contribution
Introduces GRIP, a novel graph-based approach that constructs knowledge graphs from seed data to synthesize large-scale, diverse reasoning instructions for LLM training.
Findings
Generated 2.1 million math reasoning samples from 7.5K seeds.
Models trained on GRIP-MATH outperform baselines on reasoning benchmarks.
Achieved greater scalability and diversity with reduced costs.
Abstract
Large-scale, high-quality data is essential for advancing the reasoning capabilities of large language models (LLMs). As publicly available Internet data becomes increasingly scarce, synthetic data has emerged as a crucial research direction. However, existing data synthesis methods often suffer from limited scalability, insufficient sample diversity, and a tendency to overfit to seed data, which constrains their practical utility. In this paper, we present GRIP, a Graph-based Reasoning Instruction Producer that efficiently synthesizes high-quality and diverse reasoning instructions. GRIP constructs a knowledge graph by extracting high-level concepts from seed data, and uniquely leverages both explicit and implicit relationships within the graph to drive large-scale and diverse instruction data synthesis, while employing…
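The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: concepts are supplied as hand-written lists (the paper extracts them from seed data, and the extraction step is not shown here), an "explicit" relationship is taken to mean two concepts that co-occur in the same seed problem, and an "implicit" relationship is taken to mean two concepts that never co-occur but share a neighbor in the graph. The function names `build_concept_graph`, `explicit_pairs`, and `implicit_pairs` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_concept_graph(seed_concepts):
    """Build an undirected co-occurrence graph.

    seed_concepts: list of concept lists, one list per seed problem
    (assumed to be extracted beforehand; extraction is not shown).
    """
    graph = defaultdict(set)
    for concepts in seed_concepts:
        # Link every pair of concepts that appear in the same seed problem.
        for a, b in combinations(sorted(set(concepts)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def explicit_pairs(graph):
    """Concept pairs joined by an edge (assumed meaning of 'explicit')."""
    return {(a, b) for a in graph for b in graph[a] if a < b}


def implicit_pairs(graph):
    """Concept pairs with no edge but a shared neighbor
    (assumed meaning of 'implicit': a two-hop connection)."""
    pairs = set()
    for mid in graph:
        for a, b in combinations(sorted(graph[mid]), 2):
            if b not in graph[a]:
                pairs.add((a, b))
    return pairs


# Toy seed data: three problems, each tagged with two concepts (illustrative only).
seeds = [
    ["fractions", "ratios"],
    ["ratios", "similar triangles"],
    ["similar triangles", "trigonometry"],
]
graph = build_concept_graph(seeds)
print(sorted(explicit_pairs(graph)))
print(sorted(implicit_pairs(graph)))
```

Under these assumptions, each explicit or implicit pair would serve as a concept combination from which a new instruction is synthesized. Implicit pairs are the ones that never appear together in any seed problem, which is one plausible way a graph could yield combinations beyond the seed data.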
Taxonomy
Topics: Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning · Educational Technology and Assessment
