TL;DR
This paper introduces CORTEX, an interpretability framework for Vector-Quantized Generative Models (VQGMs) that identifies concept-specific token combinations, supporting both model understanding and applications such as image editing.
Contribution
CORTEX provides a new method for interpreting VQGMs by analyzing token importance at both the sample level and the codebook level, making the models' token usage more transparent.
Findings
CORTEX outperforms baselines at explaining token usage across multiple pretrained VQGMs.
It enables targeted image editing and shortcut detection.
The framework offers both local and global explanations.
Abstract
Vector-Quantized Generative Models (VQGMs) have emerged as powerful tools for image generation. However, the key component of VQGMs -- the codebook of discrete tokens -- is still not well understood, e.g., which tokens are critical to generate an image of a certain concept? This paper introduces Concept-Oriented Token Explanation (CORTEX), a novel approach for interpreting VQGMs by identifying concept-specific token combinations. Our framework employs two methods: (1) a sample-level explanation method that analyzes token importance scores in individual images, and (2) a codebook-level explanation method that explores the entire codebook to find globally relevant tokens. Experimental results demonstrate CORTEX's efficacy in providing clear explanations of token usage in the generative process, outperforming baselines across multiple pretrained VQGMs. Besides enhancing VQGMs transparency,…
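The two levels of explanation described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a hypothetical linear concept scorer (`concept_weights`, one weight per codebook entry) and tiny hand-made token grids, and every function name here is invented for illustration. Per-position scores stand in for a sample-level explanation, and scores summed over several images stand in for a ranking of codebook tokens.

```python
import numpy as np


def token_histogram(token_grid, codebook_size):
    """Count how often each codebook index appears in one image's token grid."""
    return np.bincount(token_grid.ravel(), minlength=codebook_size)


def sample_level_scores(token_grid, concept_weights):
    """Give each grid position the (hypothetical) concept weight of its token."""
    return concept_weights[token_grid]


def top_tokens_for_concept(token_grids, concept_weights, k):
    """Sum weighted token counts over images and return the k best token ids."""
    codebook_size = concept_weights.shape[0]
    totals = np.zeros(codebook_size)
    for grid in token_grids:
        totals += token_histogram(grid, codebook_size) * concept_weights
    # Stable sort on the negated totals gives a descending ranking.
    order = np.argsort(-totals, kind="stable")
    return order[:k].tolist()


# Toy data (assumed, not from the paper): a 4-entry codebook and two 2x2 grids.
weights = np.array([0.0, 2.0, -1.0, 0.5])
grids = [np.array([[1, 1], [3, 0]]), np.array([[2, 3], [3, 1]])]

per_position = sample_level_scores(grids[0], weights)
ranked = top_tokens_for_concept(grids, weights, k=2)
print(per_position)
print(ranked)
```

In a real VQGM the weights would come from a model trained to predict the concept from tokens rather than being fixed by hand, and the grids would be the quantizer's output for actual images.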
