Enhancing In-context Learning via Linear Probe Calibration
Momin Abbas, Yi Zhou, Parikshit Ram, Nathalie Baracaldo, Horst Samulowitz, Theodoros Salonidis, Tianyi Chen

TL;DR
This paper introduces LinC, a calibration method that improves the reliability and performance of in-context learning with GPT models, especially in low-resource and variable prompt scenarios.
Contribution
LinC is a novel calibration technique that enhances ICL robustness and accuracy with minimal additional data, addressing scalability and stability issues.
Findings
LinC improves GPT ICL performance by up to 21%.
LinC reduces calibration error and increases robustness.
LinC remains effective in low-resource settings and when the order of demonstrations in the prompt varies.
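To make "calibration error" concrete, here is a minimal sketch of expected calibration error (ECE), a widely used way to measure the gap between a model's confidence and its accuracy. This page does not say which calibration metric or binning the paper uses, so the function name `expected_calibration_error`, the equal-width binning, and the default `n_bins=10` are assumptions chosen for illustration.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average of |accuracy - confidence| over confidence bins.

    probs:  (n_samples, n_classes) predicted class probabilities.
    labels: (n_samples,) integer class labels.
    Binning scheme and n_bins are illustrative assumptions.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)             # confidence of the predicted class
    pred = probs.argmax(axis=1)          # predicted class
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            # in_bin.mean() is the fraction of samples falling in this bin
            gap = abs(correct[in_bin].mean() - conf[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece
```

A perfectly calibrated model has an ECE of 0; a model that is fully confident and always wrong has an ECE of 1.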
Abstract
In-context learning (ICL) is a new paradigm for natural language processing that utilizes Generative Pre-trained Transformer (GPT)-like models. This approach uses prompts that include in-context demonstrations to generate the corresponding output for a new query input. However, applying ICL in practice does not scale with the number of samples and lacks robustness to different prompt templates and demonstration permutations. In this paper, we first show that GPT-like models using ICL yield unreliable predictions, as measured by a new metric based on Shannon entropy. Then, to solve this problem, we propose a new technique called Linear Probe Calibration (LinC), a method that calibrates the model's output probabilities, resulting in reliable predictions and improved performance while requiring only minimal additional data (as few as five labeled samples). LinC significantly…
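The idea of calibrating output probabilities with a linear probe can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the calibrator is an affine map applied to the model's class-probability vector and followed by a softmax, fitted by plain gradient descent on cross-entropy over a handful of labeled samples. The function names, learning rate, step count, and toy data are all hypothetical. The `shannon_entropy` helper is the standard entropy of a predicted distribution; the abstract does not specify how the paper's entropy-based metric is built from it.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def shannon_entropy(p, eps=1e-12):
    """Entropy (in nats) of each probability vector along the last axis."""
    p = np.clip(p, eps, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def fit_affine_calibrator(probs, labels, lr=0.5, steps=500):
    """Fit A, b so that softmax(A @ p + b) matches the labels.

    Assumed setup: identity/zero initialisation and full-batch gradient
    descent on mean cross-entropy. Hyperparameters are illustrative.
    """
    n, k = probs.shape
    A, b = np.eye(k), np.zeros(k)
    onehot = np.eye(k)[labels]
    for _ in range(steps):
        q = softmax(probs @ A.T + b)
        grad_logits = (q - onehot) / n       # d(mean CE) / d(logits)
        A -= lr * grad_logits.T @ probs
        b -= lr * grad_logits.sum(axis=0)
    return A, b

def calibrate(probs, A, b):
    return softmax(probs @ A.T + b)

# Toy data (made up): a two-class model biased toward class 0.
raw = np.array([[0.55, 0.45], [0.60, 0.40], [0.60, 0.40],
                [0.90, 0.10], [0.95, 0.05]])
gold = np.array([1, 1, 1, 0, 0])             # five labeled samples
A, b = fit_affine_calibrator(raw, gold)
cal = calibrate(raw, A, b)
raw_acc = (raw.argmax(axis=1) == gold).mean()
cal_acc = (cal.argmax(axis=1) == gold).mean()
```

In this toy case the raw model always predicts class 0, while the fitted affine map shifts the decision boundary so the five calibration samples are separated. Real use would fit on a few held-out labeled samples and evaluate on separate test queries.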
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Natural Language Processing Techniques
Methods: Attention Is All You Need · Absolute Position Encodings · Label Smoothing · Residual Connection · Dropout · Linear Layer · Byte Pair Encoding · Adam · Multi-Head Attention · Softmax
