Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors
Zeyu Yun, Yubei Chen, Bruno A Olshausen, Yann LeCun

TL;DR
This paper introduces a dictionary learning-based visualization method for transformer models. The method reveals hierarchical semantic structures in their internal representations, from word-level polysemy disambiguation to long-range dependency.
Contribution
It presents a novel visualization approach that models transformer embeddings as linear superpositions of factors, enhancing interpretability of transformer representations.
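To make "linear superposition of factors" concrete, here is a minimal sketch in plain Python. It assumes a non-negative, L1-penalised sparse coding objective, which this page does not state, and it keeps the dictionary fixed rather than learning it. The function name `sparse_code`, the toy `factors`, and the values of `lam` and `n_iter` are all hypothetical choices for illustration, not the authors' implementation.

```python
def sparse_code(x, factors, lam=0.05, n_iter=500):
    """Non-negative ISTA for one embedding vector.

    Approximately minimises
        0.5 * ||x - sum_k a[k] * factors[k]||^2 + lam * sum(a),  a >= 0,
    with the dictionary `factors` held fixed (an assumption of this sketch;
    full dictionary learning would also update the factors).
    """
    dim = len(x)
    n_factors = len(factors)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz constant,
    # so 1 / that value is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for f in factors for v in f)
    coeffs = [0.0] * n_factors
    for _ in range(n_iter):
        # Reconstruction from the current coefficients.
        recon = [sum(coeffs[k] * factors[k][d] for k in range(n_factors))
                 for d in range(dim)]
        residual = [recon[d] - x[d] for d in range(dim)]
        # Gradient of the quadratic term with respect to each coefficient.
        grads = [sum(factors[k][d] * residual[d] for d in range(dim))
                 for k in range(n_factors)]
        # Gradient step plus L1 shrinkage, then projection onto a >= 0.
        coeffs = [max(0.0, coeffs[k] - step * (grads[k] + lam))
                  for k in range(n_factors)]
    return coeffs


# Toy overcomplete dictionary: 4 unit-norm factors in a 3-dimensional space.
factors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.6, 0.8, 0.0],
]
# A toy "embedding" built as 2 * factor 0 + 1 * factor 2.
x = [2.0, 0.0, 1.0]

coeffs = sparse_code(x, factors)
print([round(c, 3) for c in coeffs])  # approximately [1.95, 0.0, 0.95, 0.0]
```

Only the two factors that actually make up `x` receive non-zero coefficients, each shrunk by `lam`. That sparsity is what makes per-factor visualization possible: each active coefficient can be inspected on its own.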
Findings
Hierarchical semantic structures are captured by transformer factors
Word-level polysemy disambiguation is visualized
Long-range dependencies are identified and analyzed
Abstract
Transformer networks have revolutionized NLP representation learning since they were introduced. Though a great effort has been made to explain the representation in transformers, it is widely recognized that our understanding is not sufficient. One important reason is that there are not enough visualization tools for detailed analysis. In this paper, we propose to use dictionary learning to open up these "black boxes" as linear superpositions of transformer factors. Through visualization, we demonstrate the hierarchical semantic structures captured by the transformer factors, e.g., word-level polysemy disambiguation, sentence-level pattern formation, and long-range dependency. While some of these patterns confirm conventional linguistic knowledge, the rest are relatively unexpected, which may provide new insights. We hope this visualization tool can bring further knowledge and a…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
