TL;DR
This paper introduces Answer Divergence-Guided Selection (ADG), a method that selects instruction data by how widely, and in how many directions, a model's sampled answers to each instruction diverge. Fine-tuning on ADG-selected data improves instruction tuning performance on six benchmarks.
Contribution
ADG is a new data selection technique that leverages geometric analysis of multi-sample outputs to enhance instruction tuning effectiveness.
Findings
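Across two backbones and three public instruction pools, ADG-selected data outperforms strong selectors on six benchmarks spanning reasoning, knowledge, and coding.
These gains are obtained by fine-tuning on only 10K ADG-selected examples.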
Both dispersion magnitude and shape anisotropy are crucial for effective data selection.
Abstract
Instruction tuning relies on large instruction-response corpora whose quality and composition strongly affect downstream performance. We propose Answer Divergence-Guided Selection (ADG), which selects instruction data based on the geometric structure of multi-sample outputs. ADG draws several high-temperature generations per instruction, maps responses into an embedding space, and computes an output divergence score that jointly encodes dispersion magnitude and shape anisotropy. High scores correspond to instructions whose answers are both far apart and multi-modal, rather than clustered paraphrases along a single direction. Across two backbones and three public instruction pools, fine-tuning on only 10K ADG-selected examples consistently outperforms strong selectors on six benchmarks spanning reasoning, knowledge, and coding. Analyses further show that both dispersion magnitude and…
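The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formula: the abstract does not give the exact score, so the function names (`divergence_score`, `select_top_k`), the use of total variance for dispersion magnitude, the top-eigenvalue share for anisotropy, and the product used to combine them are all assumptions made for illustration. The sketch assumes the sampled responses for one instruction have already been embedded into a `(k, d)` array.

```python
import numpy as np


def divergence_score(embeddings: np.ndarray) -> float:
    """Illustrative divergence score for one instruction (hypothetical formula).

    embeddings: (k, d) array holding the embeddings of k sampled responses.
    """
    # Center the responses so the spectrum describes spread around the mean.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Squared singular values of the centered matrix, divided by (k - 1),
    # are the eigenvalues of the sample covariance matrix.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    eigenvalues = singular_values ** 2 / max(len(embeddings) - 1, 1)
    total_variance = eigenvalues.sum()
    if total_variance <= 1e-12:
        return 0.0  # all responses coincide: no divergence
    # Assumed dispersion magnitude: square root of the total variance.
    dispersion = np.sqrt(total_variance)
    # Assumed anisotropy: share of variance on the leading direction
    # (1.0 means the answers vary along a single line).
    top_share = eigenvalues[0] / total_variance
    # Assumed combination: reward spread that is not confined to one direction.
    return float(dispersion * (1.0 - top_share))


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring instructions."""
    order = np.argsort(-np.asarray(scores), kind="stable")
    return order[:k].tolist()
```

Under these assumptions, responses that differ only along one direction score near zero, however far apart they are, while responses spread over several directions score higher. This mirrors the abstract's contrast between multi-modal answers and clustered paraphrases along a single direction.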
