Disentangling Representations of Text by Masking Transformers

Xiongyi Zhang; Jan-Willem van de Meent; Byron C. Wallace

arXiv:2104.07155 · cs.CL · September 14, 2021



TL;DR

This paper proposes a method to identify and extract disentangled, aspect-specific subnetworks within pretrained transformer models like BERT by learning binary masks, enabling targeted interpretability and improved task performance.

Contribution

The paper introduces a masking-based approach for discovering sparse subnetworks within BERT that encode distinct features, which avoids training a disentangled model from scratch for each task.

Findings

01 Subnetworks strongly encode specific aspects, such as toxicity or sentiment.

02 Masking combined with magnitude pruning identifies sparse, interpretable subnetworks.

03 Disentanglement via masking matches or exceeds prior methods in effectiveness.
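As a rough illustration of the pruning step named in the findings, the following is a minimal sketch of magnitude pruning on a toy weight matrix. The function name `magnitude_prune`, the `sparsity` parameter, and the example weights are hypothetical; the paper's actual pruning schedule and granularity are not given on this page.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude.

    Returns the pruned weights and the binary keep-mask (1 = kept, 0 = pruned).
    Illustrative sketch only, not the authors' implementation.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    # Rank every position by the absolute value of its weight.
    positions = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    positions.sort(key=lambda ij: abs(weights[ij[0]][ij[1]]))
    n_drop = int(len(positions) * sparsity)

    keep = [[1] * n_cols for _ in range(n_rows)]
    for i, j in positions[:n_drop]:
        keep[i][j] = 0

    pruned = [[w * m for w, m in zip(w_row, m_row)]
              for w_row, m_row in zip(weights, keep)]
    return pruned, keep


# Toy 2x3 weight matrix (made-up values).
toy_weights = [[0.9, -0.05, 0.3],
               [0.01, -0.7, 0.2]]
pruned, keep = magnitude_prune(toy_weights, sparsity=0.5)
print(keep)  # [[1, 0, 1], [0, 1, 0]]
```

Ranking positions rather than comparing against a threshold value keeps the number of pruned weights exact even when several weights share the same magnitude.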

Abstract

Representations from large pretrained models such as BERT encode a range of features into monolithic vectors, affording strong predictive accuracy across a multitude of downstream tasks. In this paper we explore whether it is possible to learn disentangled representations by identifying existing subnetworks within pretrained models that encode distinct, complementary aspect representations. Concretely, we learn binary masks over transformer weights or hidden units to uncover subsets of features that correlate with a specific factor of variation; this eliminates the need to train a disentangled model from scratch for a particular task. We evaluate this method with respect to its ability to disentangle representations of sentiment from genre in movie reviews, "toxicity" from dialect in Tweets, and syntax from semantics. By combining masking with magnitude pruning we find that we can…
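The core idea in the abstract, applying different binary masks to the same frozen weights to obtain aspect-specific representations, can be sketched as follows. This assumes masks are obtained by thresholding real-valued scores, which is an assumption made for illustration; the abstract does not say how the masks are parameterized or trained. The names `binarize` and `masked_linear`, the toy weights, and the score matrices are all hypothetical, and the training of the scores is omitted.

```python
def binarize(scores, threshold=0.0):
    """Turn real-valued mask scores into a binary mask (assumed scheme)."""
    return [[1 if s > threshold else 0 for s in row] for row in scores]


def masked_linear(x, weights, mask):
    """Compute y = (W * M) x, where W is frozen and M is a binary mask."""
    return [sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
            for w_row, m_row in zip(weights, mask)]


# Frozen "pretrained" weights of one toy linear layer (made-up values).
frozen_weights = [[1.0, 2.0],
                  [3.0, 4.0]]
x = [1.0, 1.0]

# Two hypothetical score matrices, one per aspect. In practice these would
# be learned; here they are fixed by hand to show the mechanics.
sentiment_scores = [[0.5, -0.5], [-0.5, 0.5]]
genre_scores = [[-0.5, 0.5], [0.5, -0.5]]

sentiment_repr = masked_linear(x, frozen_weights, binarize(sentiment_scores))
genre_repr = masked_linear(x, frozen_weights, binarize(genre_scores))
print(sentiment_repr)  # [1.0, 4.0]
print(genre_repr)      # [2.0, 3.0]
```

Both representations come from the same weights; only the mask differs, which is why no separate model has to be trained per aspect.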


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: Multi-Head Attention · Attention Is All You Need · Pruning · Linear Layer · Dense Connections · Softmax · Linear Warmup With Linear Decay · Weight Decay · WordPiece