An Empirical Study of Sample Selection Strategies for Large Language Model Repair

Xuran Li; Jingyi Wang

arXiv:2510.20428·cs.LG·October 24, 2025

TL;DR

This study systematically compares sample selection strategies for repairing large language models, finding that semantic-aware prioritization offers the best balance of effectiveness and efficiency, with simpler methods often sufficing for large models.

Contribution

It evaluates five sample selection methods for LLM repair, one of which is a newly proposed semantic-aware approach (SAPS), and provides insights into data proportions and the trade-offs between repair effectiveness and cost.

Findings

01. SAPS achieves a superior balance of detoxification and utility preservation.

02. Random sampling is effective for large or robust models.

03. High-overhead methods such as CCS and GraNd offer limited benefits.

Abstract

Large language models (LLMs) are increasingly deployed in real-world systems, yet they can produce toxic or biased outputs that undermine safety and trust. Post-hoc model repair provides a practical remedy, but the high cost of parameter updates motivates selective use of repair data. Despite extensive prior work on data selection for model training, it remains unclear which sampling criteria are most effective and efficient when applied specifically to behavioral repair of large generative models. Our study presents a systematic analysis of sample prioritization strategies for LLM repair. We evaluate five representative selection methods: random sampling, K-Center, gradient-norm-based selection (GraNd), stratified coverage (CCS), and a Semantic-Aware Prioritized Sampling (SAPS) approach that we propose. Repair effectiveness and trade-offs are assessed through toxicity reduction,…
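Two of the strategies named in the abstract, random sampling and K-Center, are simple enough to illustrate without a model. The sketch below is not the authors' implementation: it assumes each repair sample has already been mapped to an embedding vector, uses Euclidean distance, and relies on the standard greedy farthest-point heuristic for K-Center. The function names, the `budget` parameter, and the toy data are hypothetical.

```python
import math
import random


def random_selection(n_items, budget, seed=0):
    """Baseline: pick `budget` distinct indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_items), budget))


def k_center_greedy(points, budget, first=0):
    """Greedy K-Center: repeatedly add the point farthest from the chosen set."""
    if not 0 < budget <= len(points):
        raise ValueError("budget must be between 1 and len(points)")
    selected = [first]
    # min_dist[i] is the distance from point i to its nearest selected center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < budget:
        # The next center is the point that is currently covered worst.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy 2-D "embeddings" (made-up data): two tight clusters plus one outlier.
toy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
picked = k_center_greedy(toy, budget=3)
print(picked)  # [0, 5, 3]: one index per region of the toy space
baseline = random_selection(len(toy), budget=3)
```

On this toy data the greedy K-Center pick covers each region once, whereas a random draw of the same size may take several points from one cluster. Whether that coverage translates into better repair is the empirical question the study addresses.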

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Adversarial Robustness in Machine Learning