Effective In-Context Example Selection through Data Compression

Zhongxiang Sun; Kepu Zhang; Haoyu Wang; Xiao Zhang; Jun Xu

arXiv:2405.11465·cs.CL·May 21, 2024

TL;DR

This paper introduces a data compression-based method for selecting in-context examples for large language models. The method chooses relevant examples that retain essential information about the training data, and it improves performance by an average of 5.90% across five real-world datasets using four language models.

Contribution

The paper proposes a two-stage data compression approach to in-context example selection that chooses relevant examples while retaining sufficient information about the training dataset.

Findings

01. Average performance improvement of 5.90% across five real-world datasets

02. Selecting relevant examples improves model accuracy

03. The method applies to multiple language models (four were evaluated)

Abstract

In-context learning has been extensively validated in large language models. However, the mechanism and strategy for selecting in-context examples, a crucial ingredient of this approach, lack systematic and in-depth research. In this paper, we propose a data compression approach to the selection of in-context examples. We introduce a two-stage method that can effectively choose relevant examples and retain sufficient information about the training dataset within the in-context examples. Our method shows an average improvement of 5.90% across five different real-world datasets using four language models.
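The abstract says only that the method has two stages: choosing relevant examples, and retaining information about the training dataset within the selected examples. It does not say how either stage is computed. The sketch below is therefore not the authors' algorithm. It is a minimal illustration of a two-stage selector under loudly stated assumptions: stage one is assumed to be a Jaccard token-overlap filter, and stage two is assumed to be a greedy pick that maximizes frequency-weighted vocabulary coverage of the example pool, used here as a crude stand-in for "retaining information". The names `tokens`, `jaccard`, `select_examples`, `n_candidates`, and `k` are all hypothetical.

```python
from collections import Counter


def tokens(text):
    """Lowercased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap between two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_examples(query, pool, n_candidates=4, k=2):
    """Two-stage selection over a pool of (input_text, label) pairs.

    Stage 1 (assumption): keep the n_candidates items most similar to the query.
    Stage 2 (assumption): greedily pick k of them that together cover the most
    frequency-weighted pool vocabulary, a stand-in for retaining information
    about the training dataset. Neither stage is taken from the paper.
    """
    q = tokens(query)

    # Stage 1: relevance filter (sorted is stable, so ties keep pool order).
    ranked = sorted(pool, key=lambda ex: jaccard(q, tokens(ex[0])), reverse=True)
    candidates = ranked[:n_candidates]

    # Token frequencies over the whole pool weight the coverage gain.
    freq = Counter(t for text, _ in pool for t in tokens(text))

    chosen, covered = [], set()
    while candidates and len(chosen) < k:
        # Marginal gain: total weight of tokens not yet covered by chosen examples.
        best = max(
            candidates,
            key=lambda ex: sum(freq[t] for t in tokens(ex[0]) - covered),
        )
        chosen.append(best)
        covered |= tokens(best[0])
        candidates.remove(best)
    return chosen
```

Calling `select_examples("was the movie good", pool)` on a list of `(text, label)` pairs returns at most `k` pairs, which could then be formatted into a prompt ahead of the query. Splitting the work into a cheap relevance filter followed by a set-level objective keeps the second stage tractable, since it only scores a small candidate set.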

Taxonomy

Topics: Anomaly Detection Techniques and Applications