Correlation and Navigation in the Vocabulary Key Representation Space of Language Models
Letian Peng, Chenyang An, Jingbo Shang

TL;DR
This paper investigates how the similarity of vocabulary keys in language models affects token prediction, revealing biases that hinder diversity and proposing an in-context method to improve navigation and generation quality.
Contribution
It introduces a novel in-context navigation method that reduces bias from key similarity, enhancing diversity and accuracy in language model decoding.
Findings
The few top-ranked tokens are typically accurate, but middle-ranked tokens are biased towards tokens that are distributionally (not necessarily semantically) similar to the top ones.
The proposed method improves decoding diversity and reasoning performance.
Navigating the query away from already-explored keys improves generation quality.
Abstract
Language model (LM) decoding is based on the next-token prediction (NTP) probability distribution. For neural LMs (e.g., Transformer-based), the NTP distribution is essentially a softmax-regularized dot product between an encoded input context (query) and fixed vocabulary representations (keys). In this paper, we study the effect of the key distribution on the NTP distribution, with a focus on whether the similarity between keys will trigger spurious correlations in NTP. Through knowledge-probing tasks, we show that in the NTP distribution, the few top-ranked tokens are typically accurate. However, the middle-ranked predictions are highly biased towards the tokens that are distributionally (not necessarily semantically) similar to these top ones. For instance, if "P" is predicted as the top-1 token, "A"-"Z" will all be ranked high in NTP, regardless of whether they can lead to correct decoding…
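The query-key view of NTP described in the abstract can be illustrated with a small toy sketch. Everything below is an assumption made for illustration (random keys, a hand-built cluster of six similar keys, the helper name `ntp_distribution`); it is not the paper's experimental setup or code. It only shows why tokens whose keys lie close to the top token's key tend to be ranked high regardless of their meaning.

```python
import numpy as np

def ntp_distribution(query, keys):
    """Softmax over the dot products between one query and all vocabulary keys."""
    logits = keys @ query            # one logit per vocabulary entry
    logits = logits - logits.max()   # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy vocabulary: 50 random keys in a 64-dimensional space (hypothetical sizes).
rng = np.random.default_rng(0)
dim, vocab_size = 64, 50
keys = rng.normal(size=(vocab_size, dim))

# Hypothetical cluster: tokens 1..5 get keys that are small perturbations of
# token 0's key, mimicking distributionally similar tokens such as "A"-"Z".
keys[1:6] = keys[0] + 0.1 * rng.normal(size=(5, dim))

# A query aligned with token 0's key, i.e. the context "points at" token 0.
query = 3.0 * keys[0]

probs = ntp_distribution(query, keys)
ranking = np.argsort(-probs)         # token ids sorted by descending probability
top_six = sorted(ranking[:6].tolist())
print("top-6 token ids:", top_six)
```

In this toy setting the six clustered tokens occupy the top six ranks simply because their keys are near the key the query is aligned with, which is the kind of spurious correlation the abstract describes.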
