Efficient Response Generation Strategy Selection for Fine-Tuning Large Language Models Through Self-Aligned Perplexity

Xuan Ren; Qi Chen; Lingqiao Liu

arXiv:2502.11779·cs.CL·August 28, 2025

TL;DR

This paper introduces self-aligned perplexity, a new metric to select the best data generation strategy for fine-tuning large language models, leading to improved performance on reasoning benchmarks.

Contribution

The paper proposes a scalable method that uses self-aligned perplexity to identify the best data generation strategy for fine-tuning a given LLM, addressing the fact that training outputs vary with the teacher model and prompting strategy used to produce them.

Findings

1. Self-aligned perplexity better captures model familiarity than traditional perplexity.

2. Selecting data generation strategies with this metric improves fine-tuned model performance.

3. The method is effective across diverse reasoning benchmarks.

Abstract

Fine-tuning large language models (LLMs) typically relies on producing large sets of input-output pairs. Yet for a given question, there can be many valid outputs. In practice, these outputs are often derived by distilling knowledge from teacher models, and they can vary depending on the specific teacher model or prompting strategy employed. Recent findings show that how these training outputs are generated can significantly affect the performance of the fine-tuned model, raising an important question: how do we pick the best data generation method from among numerous possibilities? Rather than exhaustively training and evaluating on each candidate, this paper proposes a scalable approximate method that assesses a small subset of generated data to estimate its suitability for a specific target LLM. Our central idea is that effective outputs should be familiar to the target LLM. While…
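
The abstract is cut off before it defines self-aligned perplexity, so the sketch below does not attempt to reproduce that metric. It only illustrates the general idea of scoring a small subset of generated responses by how familiar they look to the target model, using traditional perplexity (the baseline named in the findings above). The strategy names and log-probability values are hypothetical, and in practice the per-token log-probabilities would come from the target LLM.

```python
import math
from statistics import mean


def perplexity(token_logprobs):
    """Traditional perplexity: exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rank_strategies(logprobs_by_strategy):
    """Return (strategy, mean perplexity) pairs sorted from lowest to highest perplexity.

    logprobs_by_strategy maps a strategy name to a list of responses,
    where each response is a list of per-token log-probabilities
    assigned by the target model.
    """
    scores = {
        name: mean(perplexity(response) for response in responses)
        for name, responses in logprobs_by_strategy.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical values for two candidate data generation strategies,
# each evaluated on a small subset of two responses.
sample = {
    "teacher_a_direct": [[-0.2, -0.4, -0.1], [-0.3, -0.5]],
    "teacher_b_step_by_step": [[-1.2, -0.9, -1.5], [-0.8, -1.1]],
}

ranking = rank_strategies(sample)
best_strategy = ranking[0][0]  # lowest mean perplexity = most familiar to the model
```

Lower perplexity is treated here as a proxy for familiarity. The paper argues that its self-aligned variant captures familiarity better than this baseline, so this ranking should be read as a simplified stand-in rather than the authors' procedure.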

Taxonomy

Topics: Text and Document Classification Technologies · Topic Modeling · Speech Recognition and Synthesis