SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection

Liangxin Liu; Xuebo Liu; Derek F. Wong; Dongfang Li; Ziyi Wang; Baotian Hu; Min Zhang

arXiv:2402.16705 · cs.CL · January 16, 2025 · 1 citation

TL;DR

SelectIT introduces an uncertainty-aware self-reflection method for selecting instruction data efficiently. It improves LLM performance without additional models or data, and it is shown to be effective across models and domains.

Contribution

The paper proposes a novel LLM-based data selection approach for instruction tuning, uses it to create the Selective Alpaca dataset, and shows improved model performance without requiring extra resources.

Findings

01. Selective Alpaca enhances model abilities.

02. SelectIT improves robustness across models.

03. Longer, high-quality data boosts instruction tuning.

Abstract

Instruction tuning (IT) is crucial to tailoring large language models (LLMs) towards human-centric interactions. Recent advancements have shown that the careful selection of a small, high-quality subset of IT data can significantly enhance the performance of LLMs. Despite this, common approaches often rely on additional models or data, which increases costs and limits widespread adoption. In this work, we propose a novel approach, termed SelectIT, that capitalizes on the foundational capabilities of the LLM itself. Specifically, we exploit the intrinsic uncertainty present in LLMs to more effectively select high-quality IT data, without the need for extra resources. Furthermore, we introduce a curated IT dataset, the Selective Alpaca, created by applying SelectIT to the Alpaca-GPT4 dataset. Empirical results demonstrate that IT using Selective Alpaca leads to substantial model ability…
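The abstract says only that SelectIT uses the LLM's own uncertainty to pick high-quality IT data; it does not give the scoring rule. As a rough illustration of that general idea, and not the paper's actual formulation, the sketch below assumes the model has been asked to rate each sample on a 1 to 5 scale under several rating prompts, and that the probability assigned to each rating token is available. The function names, the `alpha` penalty weight, and the `fraction` parameter are all hypothetical.

```python
from statistics import mean, pstdev


def expected_rating(probs):
    """Expected rating on a 1..K scale, given (possibly unnormalized)
    probabilities for the K rating tokens."""
    total = sum(probs)
    return sum((i + 1) * p / total for i, p in enumerate(probs))


def uncertainty_aware_score(prompt_probs, alpha=0.2):
    """Average the ratings obtained under several rating prompts and
    shrink the result when the prompts disagree.

    alpha is an assumed penalty weight, not a value from the paper.
    """
    ratings = [expected_rating(p) for p in prompt_probs]
    return mean(ratings) / (1.0 + alpha * pstdev(ratings))


def select_top(samples, scores, fraction=0.2):
    """Keep the highest-scoring fraction of samples (at least one).
    Ties are broken by original position."""
    k = max(1, int(len(samples) * fraction))
    order = sorted(range(len(samples)), key=lambda i: (-scores[i], i))
    return [samples[i] for i in order[:k]]


if __name__ == "__main__":
    # Two rating prompts per sample; each inner list holds P(rating = 1..5).
    consistent = [[0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
    inconsistent = [[1, 0, 0, 0, 0], [0, 0, 0, 0, 1]]
    print(uncertainty_aware_score(consistent))
    print(uncertainty_aware_score(inconsistent))
```

Both example samples have the same mean rating, but the second receives a lower score because the two prompts disagree. Under these assumptions, a sample the model rates consistently is preferred over one it rates erratically.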

Code & Models

Repositories

blue-raincoat/selectit (PyTorch, official)

Videos

SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection · SlidesLive

Taxonomy

Topics: Topic Modeling