Token-based Decision Criteria Are Suboptimal in In-context Learning

Hakaze Cho; Yoshihiro Sakai; Mariko Kato; Kenshiro Tanaka; Akira Ishii; Naoya Inoue

arXiv:2406.16535 · cs.CL · February 6, 2025

TL;DR

This paper introduces Hidden Calibration, a classification method for in-context learning that replaces token-probability criteria with a nearest centroid classifier on the LM's last hidden states. Across 6 models and 10 classification datasets, it outperforms token-based baselines by about 20% to 50%.

Contribution

It proposes Hidden Calibration, which classifies each test sample by the nearest class centroid of last hidden states, estimated from a calibration set, instead of by label-token probabilities, and reports state-of-the-art results on ICL classification tasks.

Findings

01 Hidden Calibration outperforms token-based baselines by about 20% to 50%.

02 LM hidden states form linearly separable intra-class clusters.

03 Better classification criteria reduce inter-class overlap.

Abstract

In-Context Learning (ICL) typically utilizes classification criteria from output probabilities of manually selected label tokens. However, we argue that such token-based classification criteria lead to suboptimal decision boundaries, even when delicate calibrations through translation and constrained rotation are applied. To address this problem, we propose Hidden Calibration, which renounces token probabilities and uses the nearest centroid classifier on the LM's last hidden states. In detail, we estimate one centroid per label from a calibration set and assign each test sample the label of its nearest centroid. Our experiments on 6 models and 10 classification datasets indicate that Hidden Calibration consistently outperforms current token-based baselines by about 20% to 50%, achieving a strong state-of-the-art in ICL. Our further analysis demonstrates that Hidden Calibration finds…
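The nearest centroid step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_centroids` and `predict`, the toy 2-D vectors standing in for last hidden states, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def fit_centroids(hidden_states, labels):
    """Average the calibration vectors of each class into one centroid."""
    grouped = defaultdict(list)
    for vector, label in zip(hidden_states, labels):
        grouped[label].append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def predict(centroids, hidden_state):
    """Return the label whose centroid is closest (Euclidean distance, an assumption)."""
    return min(centroids, key=lambda label: dist(centroids[label], hidden_state))


# Hypothetical toy data: in practice each vector would be the LM's last
# hidden state for one calibration prompt.
calibration_states = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
calibration_labels = ["negative", "negative", "positive", "positive"]

centroids = fit_centroids(calibration_states, calibration_labels)
prediction = predict(centroids, [4.5, 3.0])
```

In this sketch the decision depends only on distances in hidden-state space, so no label token or output probability is involved.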

Code & Models

Repositories

hc495/Hidden_Calibration
PyTorch · Official

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Intelligent Tutoring Systems and Adaptive Learning · Reinforcement Learning in Robotics

Methods: Sparse Evolutionary Training