LLM Probing with Contrastive Eigenproblems: Improving Understanding and Applicability of CCS
Stefan F. Schouten, Peter Bloem

TL;DR
This paper enhances the understanding of Contrast-Consistent Search (CCS) by reformulating it as an eigenproblem, leading to more interpretable solutions and broader applicability in probing large language models.
Contribution
It introduces a reformulation of CCS as an eigenproblem based on relative contrast consistency, improving interpretability and robustness of probing methods.
Findings
Eigenproblem formulation yields similar performance to original CCS.
Reformulation reduces sensitivity to random initialization.
Extends applicability of CCS to multiple variables.
Abstract
Contrast-Consistent Search (CCS) is an unsupervised probing method able to test whether large language models represent binary features, such as sentence truth, in their internal activations. While CCS has shown promise, its two-term objective has been only partially understood. In this work, we revisit CCS with the aim of clarifying its mechanisms and extending its applicability. We argue that what should be optimized is relative contrast consistency. Building on this insight, we reformulate CCS as an eigenproblem, yielding closed-form solutions with interpretable eigenvalues and natural extensions to multiple variables. We evaluate these approaches across a range of datasets, finding that they recover similar performance to CCS, while avoiding problems around sensitivity to random initialization. Our results suggest that relativizing contrast consistency not only improves our…
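To make the idea of a closed-form, eigenvalue-based probe concrete, here is a minimal sketch. It is not the authors' relative-contrast-consistency objective, which the abstract does not spell out. It swaps in a simpler, generic construction: take the top eigenvector of the covariance of contrast-pair differences. The function name `top_contrast_direction`, the synthetic data, and the per-half mean centering are all assumptions made for illustration. Because the solution comes from a single eigendecomposition, there is no random initialization to be sensitive to.

```python
import numpy as np

def top_contrast_direction(pos, neg):
    """Illustrative only: direction of largest variance in contrast-pair differences.

    pos, neg: (n, d) arrays holding activations for the two halves of each
    contrast pair (hypothetical inputs, e.g. a statement answered "yes" vs "no").
    Returns (direction, eigenvalue, scores).
    """
    # Center each half separately so a constant offset between halves is removed.
    pos = pos - pos.mean(axis=0)
    neg = neg - neg.mean(axis=0)
    diffs = pos - neg
    cov = diffs.T @ diffs / len(diffs)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs[:, -1]
    return w, eigvals[-1], diffs @ w

# Synthetic data (an assumption): a binary feature encoded along axis 0.
rng = np.random.default_rng(0)
n, d = 200, 16
truth = rng.integers(0, 2, size=n) * 2 - 1          # labels in {-1, +1}
feature = np.zeros(d)
feature[0] = 1.0
pos = truth[:, None] * feature + 0.1 * rng.normal(size=(n, d))
neg = -truth[:, None] * feature + 0.1 * rng.normal(size=(n, d))

w, lam, scores = top_contrast_direction(pos, neg)
pred = np.sign(scores)
# The probe is unsupervised, so the direction is only identified up to sign.
acc = max((pred == truth).mean(), (pred == -truth).mean())
print(f"top eigenvalue: {lam:.3f}, accuracy up to sign: {acc:.3f}")
```

In this sketch the eigenvalue measures how much variance in the pair differences the chosen direction captures, which gives one rough sense in which eigenvalues can be read as interpretable. The labels `truth` are used only to score the result, never to fit the direction.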
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques
