Select or Project? Evaluating Lower-dimensional Vectors for LLM Training Data Explanations
Lukas Hinterleitner, Loris Schoenegger, Benjamin Roth

TL;DR
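This paper compares two ways of building low-dimensional gradient representations for explaining LLM behavior through training data: selecting a small, architecturally informed subset of model components versus projecting the full gradient into a lower-dimensional space. It finds that a greedily selected subset is more effective on a retrieval task and cheaper to compute than random projection.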
Contribution
It introduces a systematic evaluation, on a novel benchmark, of component selection versus projection for influence estimation in LLMs, and shows that a greedily selected subset of components outperforms both the full gradient and random projection.
Findings
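A greedily selected subset of components captures training data influence for a retrieval task more effectively than either the full gradient or random projection.
Component selection is more computationally efficient than random projection.
The comparison is run as a retrieval task on a novel benchmark, which measures how well each representation preserves information about training data influence.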
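The two strategies compared above can be illustrated with a minimal sketch. This is not the authors' code: the component names, slice boundaries, toy gradients, and the use of cosine similarity as the influence score are all assumptions made for illustration.

```python
import numpy as np

def cosine_scores(query, train):
    """Cosine similarity between one query vector and each row of train."""
    q = query / np.linalg.norm(query)
    t = train / np.linalg.norm(train, axis=1, keepdims=True)
    return t @ q

def select_components(grads, component_slices, chosen):
    """'Select': keep only the gradient coordinates of the chosen components."""
    idx = np.concatenate([np.arange(*component_slices[name]) for name in chosen])
    return grads[:, idx]

def random_project(grads, out_dim, seed=0):
    """'Project': map full gradients to out_dim with a Gaussian random matrix."""
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((grads.shape[1], out_dim)) / np.sqrt(out_dim)
    return grads @ proj

# Toy setup (hypothetical): 200-dimensional flattened gradients, three components.
rng = np.random.default_rng(0)
component_slices = {"attn.q": (0, 40), "attn.v": (40, 80), "mlp.up": (80, 200)}
train_grads = rng.standard_normal((5, 200))
# The query gradient is a slightly perturbed copy of training example 3.
query_grad = train_grads[3] + 0.01 * rng.standard_normal(200)

all_grads = np.vstack([query_grad, train_grads])
selected = select_components(all_grads, component_slices, ["attn.v"])
projected = random_project(all_grads, out_dim=40)

# Retrieve the most influential training example under each representation.
top_selected = int(np.argmax(cosine_scores(selected[0], selected[1:])))
top_projected = int(np.argmax(cosine_scores(projected[0], projected[1:])))
```

Both representations have 40 dimensions here, but selection only slices the gradient, while projection requires a matrix product over all 200 coordinates. That difference is one plausible reading of the efficiency finding, not a claim about how the paper measured it.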
Abstract
Gradient-based methods for instance-based explanation for large language models (LLMs) are hindered by the immense dimensionality of model gradients. In practice, influence estimation is restricted to a subset of model parameters to make computation tractable, but this subset is often chosen ad hoc and rarely justified by systematic evaluation. This paper investigates if it is better to create low-dimensional representations by selecting a small, architecturally informed subset of model components or by projecting the full gradients into a lower-dimensional space. Using a novel benchmark, we show that a greedily selected subset of components captures the information about training data influence needed for a retrieval task more effectively than either the full gradient or random projection. We further find that this approach is more computationally efficient than random projection,…
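The abstract mentions a greedily selected subset of components but this excerpt does not describe the procedure. One common form of greedy forward selection is sketched below as an assumption: it adds, one at a time, the component whose scores most improve retrieval accuracy and stops when nothing improves. The function names, the score matrices, and the stopping rule are all hypothetical.

```python
import numpy as np

def retrieval_accuracy(scores, targets):
    """Fraction of queries whose top-scoring training example is the target."""
    return float(np.mean(np.argmax(scores, axis=1) == targets))

def greedy_select(component_scores, targets, budget):
    """Greedy forward selection over per-component score matrices.

    component_scores maps a component name to a (queries x train) matrix of
    gradient dot products. Summing these matrices equals the dot product on
    the concatenated gradients of the chosen components.
    """
    chosen = []
    total = np.zeros_like(next(iter(component_scores.values())))
    best_acc = 0.0
    for _ in range(budget):
        candidates = [c for c in component_scores if c not in chosen]
        if not candidates:
            break
        accs = {c: retrieval_accuracy(total + component_scores[c], targets)
                for c in candidates}
        best = max(accs, key=accs.get)
        # Stop once adding another component no longer improves accuracy.
        if chosen and accs[best] <= best_acc:
            break
        chosen.append(best)
        total = total + component_scores[best]
        best_acc = accs[best]
    return chosen, best_acc

# Toy scores (hypothetical): 3 queries, 3 training examples, target i for query i.
targets = np.array([0, 1, 2])
component_scores = {
    "informative": np.eye(3),                       # always ranks the target first
    "noisy": 2.0 * np.roll(np.eye(3), 1, axis=1),   # always ranks a wrong example first
    "partial": np.array([[1.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0]]),         # only helps the first query
}

chosen, acc = greedy_select(component_scores, targets, budget=2)
```

In this toy case the search picks the single informative component and stops, because no second component raises accuracy further.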
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Topic Modeling
