Concept Based Continuous Prompts for Interpretable Text Classification
Qian Chen, Dongyang Li, Xiaofeng He

TL;DR
This paper introduces a concept-based interpretability framework for continuous prompts in text classification. It decomposes each prompt into human-readable concepts, making the prompt easier to understand while maintaining classification performance.
Contribution
The paper proposes a method for interpreting continuous prompts: it decomposes them into concepts drawn from a generated concept pool, with candidate concepts chosen by submodular optimization, improving interpretability without sacrificing accuracy.
Findings
Achieves performance similar to the original P-tuning while using only a small number of concepts
Provides more plausible and human-understandable prompt interpretations
Demonstrates the feasibility of concept decomposition for continuous prompts
Abstract
Continuous prompts have become widely adopted for augmenting performance across a wide range of natural language tasks. However, the underlying mechanism of this enhancement remains obscure. Previous studies rely on individual words for interpreting continuous prompts, which lacks comprehensive semantic understanding. Drawing inspiration from Concept Bottleneck Models, we propose a framework for interpreting continuous prompts by decomposing them into human-readable concepts. Specifically, to ensure the feasibility of the decomposition, we demonstrate that a corresponding concept embedding matrix and a coefficient matrix can always be found to replace the prompt embedding matrix. Then, we employ GPT-4o to generate a concept pool and choose potential candidate concepts that are discriminative and representative using a novel submodular optimization algorithm. Experiments demonstrate that…
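The candidate-selection step described in the abstract can be illustrated with a generic greedy routine for submodular maximization. This is a minimal sketch, not the authors' algorithm: the objective (a facility-location coverage term for "representative" plus a per-concept score for "discriminative"), the function name `greedy_select`, the weight `lam`, and the toy embeddings and scores are all assumptions made for illustration. The scores are assumed to be non-negative so that the objective stays monotone.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def greedy_select(embeddings, scores, k, lam=1.0):
    """Greedily pick up to k concept indices.

    Assumed objective (hypothetical, not taken from the paper):
        f(S) = sum_i max_{j in S} sim(i, j) + lam * sum_{j in S} scores[j]
    The first term rewards covering the whole pool (representative),
    the second rewards high per-concept scores (discriminative).
    """
    n = len(embeddings)
    # Clip similarities at zero so the coverage term is monotone.
    sim = [[max(0.0, cosine(embeddings[i], embeddings[j])) for j in range(n)]
           for i in range(n)]
    selected = []
    best_cover = [0.0] * n  # best similarity of each concept to the selected set
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding concept j to the current selection.
            coverage_gain = sum(max(sim[i][j] - best_cover[i], 0.0)
                                for i in range(n))
            gain = coverage_gain + lam * scores[j]
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], sim[i][best_j]) for i in range(n)]
    return selected


# Toy pool: two near-duplicate pairs of concept embeddings (made-up values).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_scores = [0.5, 0.4, 0.3, 0.6]
print(greedy_select(toy_embeddings, toy_scores, k=2))
```

On the toy pool, the routine picks one concept from each near-duplicate pair rather than two similar ones, which is the behaviour a coverage term is meant to encourage. For monotone submodular objectives under a cardinality constraint, this kind of greedy selection is a standard choice because it comes with a known constant-factor approximation guarantee.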
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
