Interpretable Textual Neuron Representations for NLP
Nina Poerner, Benjamin Roth, Hinrich Schütze

TL;DR
This paper adapts input optimization techniques from computer vision to NLP, producing interpretable textual neuron representations that expose differences in syntax awareness between the language and visual models of the Imaginet architecture.
Contribution
It introduces a method that uses gradient ascent through a Gumbel softmax layer to generate n-gram representations, which activate target neurons more strongly than n-grams found by naive corpus search.
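For orientation, the Gumbel softmax relaxation is commonly written as follows. This is the standard textbook formulation, not necessarily the exact parameterization used in the paper; the symbols $\ell_i$ (logit of vocabulary item $i$), $\tau$ (temperature), and $V$ (vocabulary size) are notation assumed here.

```latex
% Standard Gumbel softmax relaxation (notation assumed, not taken from the paper)
\[
  y_i = \frac{\exp\big((\ell_i + g_i)/\tau\big)}
             {\sum_{j=1}^{V} \exp\big((\ell_j + g_j)/\tau\big)},
  \qquad
  g_i = -\log(-\log u_i),
  \quad
  u_i \sim \mathrm{Uniform}(0, 1)
\]
```

As $\tau$ approaches 0, the vector $y$ approaches a one-hot vector, so each position of the relaxed n-gram approaches a single discrete word while the relaxation stays differentiable with respect to the logits.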
Findings
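Gradient ascent with a Gumbel softmax layer outperforms naive corpus search in terms of target neuron activation.
The resulting representations reveal differences in syntax awareness between the language and visual models of the Imaginet architecture.
The method provides interpretable n-gram representations of neurons in NLP models.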
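The corpus search baseline mentioned in the findings can be pictured with a minimal sketch: enumerate every n-gram in a corpus and keep the one with the highest activation. The `toy_neuron` function, its weights, and the three-sentence corpus are hypothetical stand-ins for illustration; they are not the paper's model or data.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def toy_neuron(ngram):
    # Hypothetical stand-in for a real neuron: a fixed per-word score,
    # summed over the n-gram. A real model would run a forward pass here.
    weights = {"dog": 1.5, "runs": 1.0, "cat": 0.8, "the": 0.1}
    return sum(weights.get(word, 0.0) for word in ngram)


def corpus_search(corpus, n, neuron):
    """Score every distinct n-gram in the corpus and return the best one."""
    candidates = {g for sentence in corpus for g in ngrams(sentence.split(), n)}
    # Sorting first makes tie-breaking deterministic.
    return max(sorted(candidates), key=neuron)


corpus = ["the dog runs fast", "the cat sleeps", "a dog barks"]  # toy corpus
best = corpus_search(corpus, 2, toy_neuron)
print(best, toy_neuron(best))
```

A search of this kind can only return n-grams that actually occur in the corpus, which is one reason an optimization-based method may find inputs with higher activation.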
Abstract
Input optimization methods, such as Google Deep Dream, create interpretable representations of neurons for computer vision DNNs. We propose and evaluate ways of transferring this technology to NLP. Our results suggest that gradient ascent with a Gumbel softmax layer produces n-gram representations that outperform naive corpus search in terms of target neuron activation. The representations highlight differences in syntax awareness between the language and visual models of the Imaginet architecture.
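The following is a rough sketch of the general idea of gradient ascent through a Gumbel softmax layer, under strong simplifying assumptions. The "neuron" here is a hypothetical linear function of the relaxed one-hot input, so its gradient can be written by hand; a real setup would backpropagate through the actual network with an autodiff library. All names, weights, and hyperparameters are illustrative and not taken from the paper.

```python
import math
import random


def relaxed_one_hot(logits, temperature, rng):
    """Gumbel softmax sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logs finite
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    m = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def optimize_ngram(weights, steps=200, lr=0.5, temperature=1.0, seed=0):
    """Gradient ascent on per-position logits for a toy linear neuron.

    weights[pos][i] is the hypothetical contribution of vocabulary item i
    at position pos to the neuron's activation.
    """
    rng = random.Random(seed)
    logits = [[0.0] * len(row) for row in weights]
    for _ in range(steps):
        for pos, row in enumerate(weights):
            p = relaxed_one_hot(logits[pos], temperature, rng)
            activation = sum(pi * wi for pi, wi in zip(p, row))
            for j, wj in enumerate(row):
                # d(activation)/d(logit_j) for a softmax-weighted linear neuron
                logits[pos][j] += lr * p[j] * (wj - activation) / temperature
    return logits


vocab = ["the", "dog", "runs"]               # hypothetical toy vocabulary
weights = [[0.1, 1.5, 0.2], [0.0, 0.3, 1.0]]  # hypothetical neuron, 2 positions
logits = optimize_ngram(weights)
# Read off a discrete n-gram by taking the highest logit at each position.
ngram = [vocab[max(range(len(row)), key=row.__getitem__)] for row in logits]
print(ngram)
```

Unlike a corpus search, the optimized logits are not tied to any n-gram observed in a corpus. The discrete n-gram is read off only at the end, by taking the highest-scoring word at each position.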
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Topic Modeling
Methods: Gumbel Softmax · Softmax
