Learning from Generalization Patterns: An Evaluation-Driven Approach to Enhanced Data Augmentation for Fine-Tuning Small Language Models

Huan Song; Deeksha Razdan; Yiyue Qian; Arijit Ghosh Chowdhury; Parth Patwa; Aman Chadha; Shinan Zhang; Sharlina Keshava; Hannah Marlowe

arXiv:2510.18143·cs.AI·October 22, 2025

Open Access

TL;DR

This paper introduces PaDA-Agent, an evaluation-driven data augmentation method that identifies failure patterns in small language models to improve their fine-tuning performance, especially for domain-specific tasks.

Contribution

It proposes a novel, evaluation-based approach for targeted data augmentation that directly reduces the generalization gap in small language models.

Findings

01

Significant performance improvements over existing augmentation methods.

02

Effective discovery of failure patterns from validation data.

03

Enhanced fine-tuning results on the Llama 3.2 1B Instruct model.

Abstract

Small Language Models (SLMs) offer compelling advantages in deployment cost and latency, but their accuracy often lags behind larger models, particularly for complex domain-specific tasks. While supervised fine-tuning can help bridge this performance gap, it requires substantial manual effort in data preparation and iterative optimization. We present PaDA-Agent (Pattern-guided Data Augmentation Agent), an evaluation-driven approach that streamlines the data augmentation process for SLMs through coordinated operations. Unlike state-of-the-art approaches that focus only on model training errors and generate error-correcting samples, PaDA-Agent discovers failure patterns from the validation data via evaluations and drafts targeted data augmentation strategies aiming to directly reduce the generalization gap. Our experimental results demonstrate significant improvements over…
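
The idea of measuring a generalization gap and mining validation failures for recurring patterns can be illustrated with a small sketch. This is not the PaDA-Agent implementation: the `EvalRecord` schema, the `error_tag` labels, and the toy records are all hypothetical, and a simple frequency count stands in for whatever pattern discovery the paper actually uses. It only shows how validation-side evaluation results might be summarized into targets for augmentation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """One evaluated example (hypothetical schema, not taken from the paper)."""
    split: str       # "train" or "val"
    correct: bool    # whether the model's answer was judged correct
    error_tag: str   # coarse failure label from an evaluator; "" if correct


def accuracy(records, split):
    """Fraction of correct answers within one split."""
    subset = [r for r in records if r.split == split]
    return sum(r.correct for r in subset) / len(subset)


def generalization_gap(records):
    """Training accuracy minus validation accuracy."""
    return accuracy(records, "train") - accuracy(records, "val")


def top_failure_patterns(records, k=2):
    """Most frequent error tags among failed validation examples."""
    tags = Counter(
        r.error_tag for r in records if r.split == "val" and not r.correct
    )
    return tags.most_common(k)


# Toy evaluation results, invented purely for illustration.
records = [
    EvalRecord("train", True, ""),
    EvalRecord("train", True, ""),
    EvalRecord("train", True, ""),
    EvalRecord("train", False, "unit_conversion"),
    EvalRecord("val", True, ""),
    EvalRecord("val", False, "multi_step_reasoning"),
    EvalRecord("val", False, "multi_step_reasoning"),
    EvalRecord("val", False, "unit_conversion"),
]

gap = generalization_gap(records)
patterns = top_failure_patterns(records)
print(f"generalization gap: {gap:.2f}")
print("validation failure patterns:", patterns)
```

In a setup like this, the most frequent validation failure tags would be the natural candidates for targeted synthetic examples, in contrast to sampling corrections only from training errors.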

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Machine Learning and Algorithms