Enhancing In-context Learning via Linear Probe Calibration

Momin Abbas; Yi Zhou; Parikshit Ram; Nathalie Baracaldo; Horst Samulowitz; Theodoros Salonidis; Tianyi Chen

arXiv:2401.12406·cs.CL·January 24, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper introduces LinC, a calibration method that improves the reliability and performance of in-context learning with GPT models, especially in low-resource and variable prompt scenarios.

Contribution

LinC is a novel calibration technique that enhances ICL robustness and accuracy with minimal additional data, addressing scalability and stability issues.

Findings

01. LinC improves GPT ICL performance by up to 21%.

02. LinC reduces calibration error and increases robustness.

03. LinC remains effective in low-resource settings and under varying demonstration permutations.
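
The findings above refer to calibration error without defining it. One common way to quantify it is the expected calibration error (ECE), which bins predictions by confidence and averages the gap between accuracy and confidence in each bin. The sketch below is an illustration under that assumption; this page does not state which calibration metric the paper reports, and the function name and equal-width binning are choices made here.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: top-class probabilities in [0, 1]
    correct:     booleans, True where the top-class prediction was right
    """
    n = len(confidences)
    if n == 0:
        return 0.0
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical example: four predictions at 95% confidence, three correct.
print(expected_calibration_error([0.95] * 4, [True, True, True, False]))
```

A well-calibrated model has an ECE near zero; an overconfident one, as in the toy example, has a gap between stated confidence and observed accuracy.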

Abstract

In-context learning (ICL) is a new paradigm for natural language processing that utilizes Generative Pre-trained Transformer (GPT)-like models. This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. However, applying ICL in real cases does not scale with the number of samples, and lacks robustness to different prompt templates and demonstration permutations. In this paper, we first show, using a new metric based on Shannon entropy, that GPT-like models using ICL produce unreliable predictions. Then, to solve this problem, we propose a new technique called Linear Probe Calibration (LinC), which calibrates the model's output probabilities, resulting in reliable predictions and improved performance while requiring only minimal additional data (as few as five labeled samples). LinC significantly…
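
To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It assumes the calibration takes the form of a learned affine map on the model's class probabilities followed by a softmax, fitted by gradient descent on a handful of labeled samples; the abstract does not spell out the exact formulation, so the parameter names `A` and `b`, the identity initialization, the learning rate, and the toy data are all assumptions. The entropy function is the ordinary Shannon entropy of a predictive distribution, not the paper's specific metric, which is not defined on this page.

```python
import math


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(p):
    """Entropy in nats; zero-probability classes contribute nothing."""
    return -sum(q * math.log(q) for q in p if q > 0)


def calibrate(p, A, b):
    """Assumed calibration form: softmax(A p + b)."""
    z = [sum(A[i][j] * p[j] for j in range(len(p))) + b[i] for i in range(len(b))]
    return softmax(z)


def mean_cross_entropy(probs, labels, A, b):
    losses = [-math.log(calibrate(p, A, b)[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


def fit_affine(probs, labels, lr=0.1, epochs=500):
    """Fit A and b with per-sample gradient descent on cross-entropy."""
    k = len(probs[0])
    A = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for _ in range(epochs):
        for p, y in zip(probs, labels):
            q = calibrate(p, A, b)
            for i in range(k):
                g = q[i] - (1.0 if i == y else 0.0)  # dLoss/dz_i
                b[i] -= lr * g
                for j in range(k):
                    A[i][j] -= lr * g * p[j]
    return A, b


# Hypothetical toy data: five labeled samples from a model biased toward class 0.
probs = [[0.90, 0.10], [0.85, 0.15], [0.60, 0.40], [0.55, 0.45], [0.65, 0.35]]
labels = [0, 0, 1, 1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]

loss_before = mean_cross_entropy(probs, labels, identity, [0.0, 0.0])
A, b = fit_affine(probs, labels)
loss_after = mean_cross_entropy(probs, labels, A, b)
print("entropy of first prediction:", shannon_entropy(probs[0]))
print("cross-entropy before / after:", loss_before, loss_after)
```

The only trainable parameters here are a k-by-k matrix and a k-vector for k classes, which is why a fit from very few labeled samples is plausible. The official repository listed on this page is the reference for the authors' actual implementation.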

Code & Models

Repositories

mominabbass/linc · PyTorch · Official

Taxonomy

Topics: Topic Modeling · Machine Learning and Data Classification · Natural Language Processing Techniques

Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Residual Connection · Dropout · Linear Layer · Byte Pair Encoding · Adam · Multi-Head Attention · Softmax