Explicit Mutual Information Maximization for Self-Supervised Learning
Lele Chang, Peilin Liu, Qinghai Guo, Fei Wen

TL;DR
This paper proposes an explicit mutual information maximization approach for self-supervised learning, deriving a new loss function based on second-order statistics and demonstrating its effectiveness through extensive experiments.
Contribution
It introduces a novel explicit MI maximization method for SSL under a relaxed data distribution assumption, supported by theoretical analysis and empirical validation.
Findings
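The new loss function, derived from the MIM criterion using only second-order statistics, improves SSL performance.
Explicit MI maximization is feasible under a relaxed (generic) assumption on the data distribution, illustrated with the generalized Gaussian distribution.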
Experimental results validate the effectiveness of the proposed method.
Abstract
Recently, self-supervised learning (SSL) has been extensively studied. Theoretically, mutual information maximization (MIM) is an optimal criterion for SSL, with a strong theoretical foundation in information theory. However, it is difficult to directly apply MIM in SSL since the data distribution is not analytically available in applications. In practice, many existing methods can be viewed as approximate implementations of the MIM criterion. This work shows that, based on the invariance property of MI, explicit MI maximization can be applied to SSL under a generic distribution assumption, i.e., a relaxed condition of the data distribution. We further illustrate this by analyzing the generalized Gaussian distribution. Based on this result, we derive a loss function based on the MIM criterion using only second-order statistics. We implement the new loss for SSL and demonstrate its…
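As a rough illustration of how an MI objective can reduce to second-order statistics, the sketch below uses the standard closed-form MI between two jointly Gaussian vectors, I(Z1; Z2) = 1/2 [log det C11 + log det C22 - log det C], where C is the joint covariance and C11, C22 are its diagonal blocks. This is not the paper's loss: the joint-Gaussian assumption, the function name `gaussian_mi`, and the `eps` regularizer are assumptions made here for illustration only.

```python
import numpy as np

def gaussian_mi(z1, z2, eps=1e-6):
    """Estimate MI (in nats) between two views, ASSUMING they are jointly Gaussian.

    z1, z2: arrays of shape (n_samples, d) holding the embeddings of two views.
    eps:    small ridge term (hypothetical choice) to keep covariances well conditioned.
    """
    n, d = z1.shape
    z = np.concatenate([z1, z2], axis=1)
    z = z - z.mean(axis=0, keepdims=True)           # center the joint embedding
    cov = z.T @ z / (n - 1) + eps * np.eye(2 * d)   # joint covariance C
    c11 = cov[:d, :d]                               # covariance of view 1
    c22 = cov[d:, d:]                               # covariance of view 2
    # slogdet returns (sign, log|det|); the log-determinant is all we need here.
    _, logdet_11 = np.linalg.slogdet(c11)
    _, logdet_22 = np.linalg.slogdet(c22)
    _, logdet_joint = np.linalg.slogdet(cov)
    return 0.5 * (logdet_11 + logdet_22 - logdet_joint)

# Usage: strongly correlated views should score higher than independent ones.
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 4))
noise = rng.normal(size=(2000, 4))
mi_correlated = gaussian_mi(x, x + 0.1 * noise)
mi_independent = gaussian_mi(x, noise)
```

In an SSL setting, the negative of such a quantity could serve as a training loss over a batch of paired embeddings; whether and how the paper's loss differs from this Gaussian form is not stated in the abstract.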
Taxonomy
Topics: Neural Networks and Applications · Face and Expression Recognition
Methods: Mutual Information Machine/Mask Image Modeling
