Crafting Interpretable Embeddings by Asking LLMs Questions

Vinamra Benara; Chandan Singh; John X. Morris; Richard Antonello; Ion Stoica; Alexander G. Huth; Jianfeng Gao

arXiv:2405.16714·cs.CL·May 28, 2024·2 cites



TL;DR

This paper introduces question-answering embeddings (QA-Emb), a method for creating interpretable text embeddings in which each feature is an LLM's answer to a yes/no question. The authors apply it to predicting fMRI responses to language and to NLP tasks.

Contribution

The paper proposes QA-Emb, an interpretable embedding method built from question-answering prompts to LLMs. Each feature is the answer to one yes/no question, so fitting the embedding reduces to selecting a set of questions rather than learning model weights.
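To make the construction concrete, here is a minimal sketch of a QA-Emb-style embedding. It is not the authors' implementation: `toy_answer` is a hypothetical keyword-matching stand-in for the LLM call, and the three questions and the name `qa_embed` are invented for illustration.

```python
from typing import Callable, List

import numpy as np

# Hypothetical example questions; the paper's actual question set is not given here.
QUESTIONS = [
    "Does the input mention a number?",
    "Does the input mention a person?",
    "Does the input describe movement?",
]

# Toy keyword lists used only so this sketch runs without an LLM.
_KEYWORDS = {
    QUESTIONS[0]: ["one", "two", "three"],
    QUESTIONS[1]: ["she", "he", "mother", "friend"],
    QUESTIONS[2]: ["ran", "walked", "drove"],
}


def toy_answer(question: str, text: str) -> bool:
    """Stand-in for an LLM: a real system would prompt a model with the
    question and the text, then parse a yes/no reply."""
    words = text.lower().split()
    return any(k in words for k in _KEYWORDS.get(question, []))


def qa_embed(
    texts: List[str],
    questions: List[str],
    answer: Callable[[str, str], bool] = toy_answer,
) -> np.ndarray:
    """Return a (len(texts), len(questions)) matrix of 0/1 answers.
    Column j is directly interpretable as the answer to questions[j]."""
    return np.array(
        [[1.0 if answer(q, t) else 0.0 for q in questions] for t in texts]
    )


emb = qa_embed(["She walked two miles", "The sky is blue"], QUESTIONS)
print(emb)
```

Swapping `toy_answer` for a function that queries an actual LLM leaves the rest unchanged, which is the point of the design: the only thing to choose is the list of questions.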

Findings

01 QA-Emb significantly outperforms an established interpretable baseline in predicting fMRI responses.

02 QA-Emb requires very few questions to generate effective embeddings.

03 QA-Emb can be approximated with an efficient model for broader applications.

Abstract

Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks. However, their opaqueness and proliferation into scientific domains such as neuroscience have created a growing need for interpretability. Here, we ask whether we can obtain interpretable embeddings through LLM prompting. We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM. Training QA-Emb reduces to selecting a set of underlying questions rather than learning model weights. We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli. QA-Emb significantly outperforms an established interpretable baseline, and does so while requiring very few questions. This paves the way towards building flexible feature spaces that can…
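As an illustration of how such yes/no features could feed an interpretable predictor, the sketch below fits a linear model on binary answers and ranks questions by weight magnitude. Everything here is an assumption made for the example: the data are synthetic, the closed-form ridge regression is a generic choice, and the abstract does not say which regression or question-selection procedure the authors use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: rows are text stimuli, columns are yes/no answers.
n_samples, n_questions = 200, 6
X = rng.integers(0, 2, size=(n_samples, n_questions)).astype(float)

# Assume (for illustration) that only questions 0 and 2 drive the response.
true_w = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 0.0])
y = X @ true_w + 0.1 * rng.standard_normal(n_samples)


def fit_ridge(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression on mean-centered data (no intercept returned)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_features = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)


w = fit_ridge(X, y)

# Each weight belongs to one question, so the model can be read off directly:
# a large |w[j]| means the answer to question j strongly shifts the prediction.
top = np.argsort(-np.abs(w))[:2]
print("weights:", np.round(w, 2))
print("most influential questions:", sorted(top.tolist()))
```

Keeping only the highest-weight columns is one simple way to arrive at a small question set; a sparsity-inducing regression would be another.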


Taxonomy

Topics: Artificial Intelligence in Law

Methods: Sparse Evolutionary Training