A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations

Pierre Colombo; Chloe Clavel; Pablo Piantanida

arXiv:2105.02685·cs.AI·May 7, 2021

TL;DR

This paper proposes a new variational upper bound for mutual information to improve the learning of disentangled textual representations, offering better control over the degree of disentanglement compared to existing methods.

Contribution

The paper introduces a novel mutual information estimator based on the Rényi divergence, enabling more precise control of disentanglement in text representations and overcoming limitations of adversarial approaches.
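As background for the divergence named above, here is a minimal sketch of the textbook Rényi divergence between two discrete distributions, next to the KL divergence it approaches as the order tends to 1. This is not the paper's estimator, which this page does not reproduce. The function names `renyi_divergence` and `kl_divergence` are chosen here for illustration, and the code assumes `q` is strictly positive wherever `p` is.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as sequences."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) in nats, for alpha > 0 and alpha != 1.

    Assumes q_i > 0 wherever p_i > 0 (an assumption of this sketch).
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    # D_alpha(p || q) = log( sum_i p_i^alpha * q_i^(1 - alpha) ) / (alpha - 1)
    total = sum(pi ** alpha * qi ** (1 - alpha) for pi, qi in zip(p, q) if pi > 0)
    return math.log(total) / (alpha - 1)

if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [0.9, 0.1]
    for alpha in (0.5, 1.0001, 2.0):
        print(f"alpha={alpha}: {renyi_divergence(p, q, alpha):.4f}")
    print(f"KL: {kl_divergence(p, q):.4f}")
```

The divergence is non-decreasing in the order `alpha`, so orders below 1 give values no larger than the KL divergence and orders above 1 give values no smaller. That monotonicity is one reason the order can act as a tuning knob.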

Findings

01

Outperforms state-of-the-art methods in fair classification.

02

Achieves superior results in textual style transfer.

03

Provides insights into trade-offs between disentanglement and sentence quality.

Abstract

Learning disentangled representations of textual data is essential for many natural language tasks such as fair classification, style transfer and sentence generation, among others. The dominant existing approaches for text data either rely on training an adversary (discriminator) that aims to make attribute values difficult to infer from the latent code, or rely on minimising variational bounds of the mutual information between the latent code and the attribute value. However, the available methods cannot provide fine-grained control of the degree (or force) of disentanglement. Adversarial methods are remarkably simple, but although the adversary seems to perform perfectly well during training, a fair amount of information about the undesired attribute still remains after training is completed. This paper…
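For reference, the mutual information mentioned in the abstract can be written with standard identities. The notation is chosen here and does not come from this page: $Z$ is the latent code, $Y$ the attribute, and $q_\psi$ an adversary (classifier) that predicts $Y$ from $Z$. This is a sketch of one standard reading, not the paper's derivation.

```latex
\begin{align}
I(Z;Y) &= \mathrm{KL}\big(p_{ZY} \,\|\, p_Z \otimes p_Y\big) = H(Y) - H(Y \mid Z), \\
\mathrm{CE}(q_\psi) &= \mathbb{E}_{p_{ZY}}\big[-\log q_\psi(Y \mid Z)\big] \;\ge\; H(Y \mid Z), \\
I(Z;Y) &\ge H(Y) - \mathrm{CE}(q_\psi).
\end{align}
```

Under this reading, the adversary's cross-entropy only yields a lower bound on $I(Z;Y)$, so a poorly performing adversary does not by itself certify that little attribute information remains in $Z$. This is consistent with the abstract's interest in upper bounds on the mutual information.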
