Unsupervised Representation Disentanglement of Text: An Evaluation on Synthetic Datasets

Lan Zhang; Victor Prokhorov; Ehsan Shareghi

arXiv:2106.03631·cs.CL·June 8, 2021·1 cites

TL;DR

This paper examines the challenges of unsupervised representation disentanglement for text using synthetic datasets, adapting models from the image domain and highlighting key factors that affect disentanglement quality.

Contribution

It introduces the first framework and datasets for assessing unsupervised disentanglement in text, bridging a gap between image and text representation learning.

Findings

01

Disentanglement is more challenging in text than in images.

02

Representation sparsity influences disentanglement success.

03

Coupling with the decoder impacts disentanglement quality.

Abstract

To highlight the challenges of achieving representation disentanglement for the text domain in an unsupervised setting, in this paper we select a representative set of successfully applied models from the image domain. We evaluate these models on 6 disentanglement metrics, as well as on downstream classification tasks and homotopy. To facilitate the evaluation, we propose two synthetic datasets with known generative factors. Our experiments highlight the existing gap in the text domain and illustrate that certain elements, such as representation sparsity (as an inductive bias) or representation coupling with the decoder, could impact disentanglement. To the best of our knowledge, our work is the first attempt at the intersection of unsupervised representation disentanglement and text, and provides the experimental framework and datasets for examining future developments in this direction.
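The homotopy evaluation mentioned in the abstract is commonly understood as decoding points along a straight line between two latent codes. The snippet below is a minimal sketch of only the interpolation step, under that assumption; the function name `homotopy_path` and its parameters are hypothetical and are not taken from the paper or its repository, and the decoding step is left out.

```python
def homotopy_path(z_start, z_end, steps=5):
    """Return `steps` latent vectors evenly spaced on the line from z_start to z_end.

    Illustrative only: in a real evaluation each returned vector would be
    passed to a trained decoder to produce a sentence.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same dimensionality")

    path = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight, 0.0 at z_start and 1.0 at z_end
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


if __name__ == "__main__":
    # Toy 2-dimensional latent codes, chosen only for illustration.
    for z in homotopy_path([0.0, 2.0], [1.0, 0.0], steps=3):
        print(z)
```

If the decoded sentences change smoothly along such a path, that is usually read as a sign that the latent space is well structured.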

Code & Models

Repositories

lanzhang128/disentanglement (tf, official)

Taxonomy

Topics: Digital Media Forensic Detection · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning