InsBank: Evolving Instruction Subset for Ongoing Alignment

Jiayi Shi; Yiwei Li; Shaoxiong Feng; Peiwen Yuan; Xinglin Wang; Yueqi Zhang; Chuyi Tan; Boyuan Pan; Huan Ren; Yao Hu; Kan Li

arXiv:2502.11419·cs.CL·September 3, 2025

TL;DR

InsBank introduces a continuously evolving instruction data repository for large language models, employing a novel framework PIBE that enhances data selection efficiency and diversity to improve ongoing model alignment.

Contribution

The paper proposes PIBE, a new framework for evolving InsBank effectively, combining diversity and quality scores for better instruction data selection over time.
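As a rough illustration of what combining a quality score with a diversity score for budgeted selection can look like, here is a minimal sketch. The function names, the `weight` parameter, and the toy scores are all hypothetical; the weighted sum and the product are generic combination rules and should not be read as PIBE's actual formula.

```python
from typing import List, Sequence


def combine_scores(quality: Sequence[float], diversity: Sequence[float],
                   mode: str = "add", weight: float = 0.5) -> List[float]:
    """Combine per-example quality and diversity scores into one ranking score.

    mode="add" uses a weighted sum, mode="mul" uses a product.
    Both rules are illustrative assumptions, not the paper's definition.
    """
    if len(quality) != len(diversity):
        raise ValueError("quality and diversity must have the same length")
    if mode == "add":
        return [weight * q + (1.0 - weight) * d for q, d in zip(quality, diversity)]
    if mode == "mul":
        return [q * d for q, d in zip(quality, diversity)]
    raise ValueError(f"unknown mode: {mode}")


def select_subset(scores: Sequence[float], budget: int) -> List[int]:
    """Return the indices of the `budget` highest-scoring examples, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Toy scores for four instruction examples (made up for this sketch).
quality = [0.9, 0.2, 0.7, 0.4]
diversity = [0.3, 0.8, 0.6, 0.5]

additive = combine_scores(quality, diversity, mode="add", weight=0.5)
product = combine_scores(quality, diversity, mode="mul")
top2_add = select_subset(additive, budget=2)
top2_mul = select_subset(product, budget=2)
```

Because the ranking is computed once, a subset for a smaller or larger budget is simply a shorter or longer prefix of the same ordering.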

Findings

01. PIBE outperforms baselines in InsBank evolution

02. Effectively extracts budget-specific instruction subsets

03. Enhances ongoing LLM alignment

Abstract

Large language models (LLMs) typically undergo instruction tuning to enhance alignment. Recent studies emphasize that quality and diversity of instruction data are more crucial than quantity, highlighting the need to select diverse, high-quality subsets to reduce training costs. However, how to evolve these selected subsets alongside the development of new instruction data remains insufficiently explored. To achieve LLMs' ongoing alignment, we introduce Instruction Bank (InsBank), a continuously updated repository that integrates the latest valuable instruction data. We further propose Progressive Instruction Bank Evolution (PIBE), a novel framework designed to evolve InsBank effectively and efficiently over time. PIBE employs a gradual data selection strategy to maintain long-term efficiency, leveraging a representation-based diversity score to capture relationships…
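The idea of evolving a fixed-size bank as new instruction data arrives can be sketched as follows. This is only an illustration under stated assumptions: `evolve_bank`, the `Example` tuple, and the toy embeddings are hypothetical, and the diversity score used here (one minus mean cosine similarity to the rest of the pool) is a simple stand-in, not the representation-based score the abstract describes.

```python
import math
from typing import List, Sequence, Tuple

# (text, embedding, quality score); this layout is an assumption of the sketch.
Example = Tuple[str, List[float], float]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def diversity_scores(pool: List[Example]) -> List[float]:
    """Stand-in diversity: 1 minus mean cosine similarity to the other examples."""
    scores = []
    for i, (_, emb_i, _) in enumerate(pool):
        others = [cosine(emb_i, emb_j)
                  for j, (_, emb_j, _) in enumerate(pool) if j != i]
        scores.append(1.0 - sum(others) / len(others) if others else 1.0)
    return scores


def evolve_bank(bank: List[Example], new_data: List[Example], size: int) -> List[Example]:
    """Merge the current bank with new data and keep the `size` best examples.

    Only the current bank and the incoming batch are scored, so earlier
    discarded data never has to be revisited.
    """
    pool = bank + new_data
    div = diversity_scores(pool)
    ranked = sorted(range(len(pool)), key=lambda i: pool[i][2] + div[i], reverse=True)
    return [pool[i] for i in ranked[:size]]


bank = [("a", [1.0, 0.0], 0.9), ("b", [1.0, 0.1], 0.8)]
incoming = [("c", [0.0, 1.0], 0.7), ("d", [1.0, 0.0], 0.1)]
bank = evolve_bank(bank, incoming, size=2)
kept = [text for text, _, _ in bank]
```

In this toy run the new example "c" enters the bank despite a lower quality score than "b", because its embedding points in a direction the existing bank does not cover.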

Peer Reviews

Decision·ICLR 2025 Conference Withdrawn Submission

Reviewer 01 · Rating 5 · Confidence 4

Strengths

Innovation in Data Management: The concept of InsBank and the PIBE framework addresses a critical need for efficient, ongoing alignment of LLMs with evolving instruction data.
Efficiency and Scalability: By retaining only necessary data and historical information, PIBE reduces computational and storage costs, making it suitable for large-scale applications.
Comprehensive Diversity Evaluation: The representation-based diversity score effectively captures relationships between data points, impro

Weaknesses

Lack of Novelty: While the paper presents the InsBank concept and the PIBE framework, the methods employed largely combine existing techniques without substantial innovation. The use of Affinity Propagation for diversity scoring and of simple mathematical operations (addition and multiplication) to combine diversity and quality scores are straightforward applications of known methods.
Clarity in Methodology: The paper needs more detailed explanations of the experiments to make the results reproducible. Clari

Reviewer 02 · Rating 6 · Confidence 3

Strengths

1. The authors consider an interesting setting: continually integrating instruction data selection for LLMs.
2. The proposed method achieves good performance on the AlpacaEval and MT-Bench benchmarks.

Weaknesses

1. The downstream evaluation benchmarks are limited. It would be better if the authors conducted downstream analysis on more benchmarks, such as MMLU, to showcase the advantage of the proposed method.

Reviewer 03 · Rating 3 · Confidence 3

Strengths

* The developed method demonstrates superior performance over the considered baseline.
* The idea of using affinity propagation to measure diversity is interesting.

Weaknesses

* This paper has weaknesses in problem formulation, contribution, presentation, and experimental design. Please see the summary for details.

Reviewer 04 · Rating 6 · Confidence 3

Strengths

1. The introduction of InsBank and the PIBE framework brings a novel solution to the ongoing alignment and evolution of instruction data for LLMs. I think it's a relatively comprehensive and novel framework.
2. The adaptation of Affinity Propagation for diversity scoring is well-suited for this progressive approach, enhancing the robustness and representation quality of selected subsets.
3. The authors flexibly integrated quality and diversity scores, allowing PIBE to adapt to various budget constra

Weaknesses

1. The authors focus primarily on widely used datasets. PIBE could also be evaluated on more domain-specific datasets, or with multiple evaluation methods.
2. The ensemble weights for quality and diversity are not well analyzed, so it is unclear how sensitive PIBE's performance is to changes in these parameters.
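One simple way to probe the kind of weight sensitivity raised here is to sweep the weight that balances the two scores and measure how much the selected subset changes between neighbouring settings. The sketch below is a generic check, not an analysis from the paper; `top_k`, `weight_sensitivity`, the weighted-sum rule, and the toy scores are all assumptions made for illustration.

```python
from typing import List, Sequence, Set


def top_k(quality: Sequence[float], diversity: Sequence[float],
          weight: float, k: int) -> Set[int]:
    """Indices of the k best examples under a weighted sum of the two scores."""
    scores = [weight * q + (1.0 - weight) * d for q, d in zip(quality, diversity)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap of two index sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def weight_sensitivity(quality: Sequence[float], diversity: Sequence[float],
                       weights: Sequence[float], k: int) -> List[float]:
    """Overlap between the subsets chosen at each pair of consecutive weights."""
    subsets = [top_k(quality, diversity, w, k) for w in weights]
    return [jaccard(subsets[i], subsets[i + 1]) for i in range(len(subsets) - 1)]


# Toy scores (made up): examples 0 and 1 are high quality, 2 and 3 are diverse.
quality = [0.9, 0.8, 0.2, 0.1]
diversity = [0.2, 0.4, 0.7, 0.9]
overlaps = weight_sensitivity(quality, diversity, weights=[0.0, 0.5, 1.0], k=2)
```

An overlap near 1.0 between neighbouring weights suggests the selection is stable in that range, while a value near 0.0 marks a region where a small change in the weight swaps out most of the subset.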

Code & Models

Repositories

jiayinlp/insbank · PyTorch · Official

Videos

InsBank: Evolving Instruction Subset for Ongoing Alignment

Taxonomy

Topics: Handwritten Text Recognition Techniques · Video Analysis and Summarization · Human Motion and Animation