Diversity-driven Data Selection for Language Model Tuning through Sparse Autoencoder

Xianjun Yang; Shaoliang Nie; Lijuan Liu; Suchin Gururangan; Ujjwal Karn; Rui Hou; Madian Khabsa; Yuning Mao

arXiv:2502.14050 · cs.CL · April 2, 2025

TL;DR

This paper introduces a diversity-aware data selection method using sparse autoencoders to improve language model tuning, enhancing model capabilities, reducing training costs, and providing interpretability.

Contribution

The work proposes a novel diversity-driven data selection strategy with sparse autoencoders, addressing the limitations of existing quality-based methods.

Findings

01. Models trained on the selected data outperform models trained on data chosen by other selection methods.

02. The method reduces training costs.

03. SAEs offer interpretability and scalability for large-scale data pruning.

Abstract

Instruction tuning data are often quantity-saturated due to large-volume data collection and fast model iteration, leaving data selection important but underexplored. Existing quality-driven data selection methods, such as LIMA (Zhou et al., NeurIPS 2023) and AlpaGasus (Chen et al., ICLR 2024), generally ignore the equal importance of data diversity and complexity. In this work, we aim to design a diversity-aware data selection strategy and creatively propose using sparse autoencoders (SAEs) to tackle the challenge of measuring data diversity. In addition, SAEs can provide more interpretability of model behavior and explain, e.g., the surprising effectiveness of selecting the longest response (Zhao et al., ICML 2024). Using effective data selection, we show experimentally that models trained on our selected data can outperform other methods in terms of…
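To make the idea of diversity-aware selection concrete, here is a minimal sketch under stated assumptions. It is not the paper's algorithm: it assumes each instruction example has already been mapped to the set of SAE feature indices that activate on it, and it uses a plain greedy maximum-coverage rule as a stand-in for a diversity criterion. The function name `greedy_diverse_select`, the `budget` parameter, and the toy feature sets are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_diverse_select(
    active_features: Dict[str, Set[int]], budget: int
) -> List[str]:
    """Pick up to `budget` examples, each time taking the one that adds the
    most SAE features not yet covered (greedy max-coverage; an assumption,
    not the authors' exact procedure)."""
    covered: Set[int] = set()
    selected: List[str] = []
    remaining = dict(active_features)  # shallow copy; the input is not mutated

    while remaining and len(selected) < budget:
        # Iterate ids in sorted order so ties resolve to the smallest id.
        best_id = max(sorted(remaining), key=lambda k: len(remaining[k] - covered))
        selected.append(best_id)
        covered |= remaining.pop(best_id)

    return selected


# Toy data (hypothetical): example id -> indices of active SAE features.
toy_features = {
    "a": {1, 2, 3},
    "b": {1, 2},
    "c": {4, 5},
    "d": {3},
}
picked = greedy_diverse_select(toy_features, budget=2)
print(picked)
```

With the toy data, the first pick is the example with the most features, and the second is the one whose features do not overlap with it, so near-duplicates such as "b" are skipped until nothing new is left to cover.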

Taxonomy

Topics: Natural Language Processing Techniques

Methods: ALIGN