Representation Learning with Conditional Information Flow Maximization

Dou Hu; Lingwei Wei; Wei Zhou; Songlin Hu

arXiv:2406.05510 · cs.LG · August 13, 2024

TL;DR

This paper introduces a novel information-theoretic framework called conditional information flow maximization that enhances language model representations by balancing information maximization and minimization to improve robustness, transferability, and task performance.

Contribution

It proposes a new representation learning method that maximizes mutual information with labels while minimizing redundant input features, addressing over-compression and feature redundancy issues.
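As a rough illustration of the two "maximize" terms, the sketch below pairs a cross-entropy loss (minimizing it maximizes a standard variational lower bound on the representation-label mutual information, up to the constant label entropy) with an InfoNCE contrastive loss between two views of the same inputs, a common surrogate for retaining input information. The estimator choice, the function names, and the `beta` and `temperature` values are assumptions made for illustration and are not taken from the paper or its repository; the redundancy-minimization part is not sketched.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true labels.
    # Lower values correspond to a higher variational bound on I(Z;Y).
    logp = log_softmax(logits)
    return -logp[np.arange(len(labels)), labels].mean()

def info_nce(z_a, z_b, temperature=0.1):
    # InfoNCE loss between two batches of representations of the same inputs.
    # Row i of z_a and row i of z_b form the positive pair; other rows are negatives.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    sim = z_a @ z_b.T / temperature
    return -np.diag(log_softmax(sim)).mean()

def ifm_surrogate_loss(logits, labels, z_a, z_b, beta=0.1):
    # Hypothetical combined objective: both information terms are encouraged,
    # so both losses are added (nothing is subtracted as in a bottleneck).
    return cross_entropy(logits, labels) + beta * info_nce(z_a, z_b)
```

In a training loop, `z_a` and `z_b` might come from two stochastic forward passes of the same encoder, though that too is an assumption rather than something the page states.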

Findings

01 Improves performance on 13 language understanding benchmarks.

02 Produces more sufficient, robust, and transferable representations.

03 Enhances generalization of pre-trained language models.

Abstract

This paper proposes an information-theoretic representation learning framework, named conditional information flow maximization, to extract noise-invariant sufficient representations for the input data and target task. It encourages the learned representations to have good feature uniformity and sufficient predictive ability, which can enhance the generalization of pre-trained language models (PLMs) for the target task. First, an information flow maximization principle is proposed to learn more sufficient representations for the input and target by simultaneously maximizing both input-representation and representation-label mutual information. Unlike the information bottleneck, we handle the input-representation information in the opposite way to avoid the over-compression issue of latent representations. Besides, to mitigate the negative effect of potential redundant features from the…
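The contrast with the information bottleneck can be written schematically as follows. Here X is the input, Y the label, and Z the learned representation; the trade-off weight \beta and the exact weighting of the two terms are assumptions for illustration, since the abstract only states that both mutual-information terms are maximized. The conditional term is left out because the abstract is cut off before describing it.

```latex
% Information bottleneck: keep label information, compress the input
\max_{Z} \; I(Z;Y) \;-\; \beta \, I(X;Z)

% Information flow maximization (schematic): the input-representation
% term is handled in the opposite way, so both terms are maximized
\max_{Z} \; I(Z;Y) \;+\; \beta \, I(X;Z)
```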

Code & Models

Repositories

zerohd4869/CIFM (PyTorch, official)

Videos

Representation Learning with Conditional Information Flow Maximization · underline

Taxonomy

Topics: Neural Networks and Applications · Data Stream Mining Techniques