A Method of Moments Embedding Constraint and its Application to Semi-Supervised Learning

Michael Majurski; Sumeet Menon; Parniyan Farvardin; David Chapman

arXiv:2404.17978·cs.CV·April 30, 2024

TL;DR

This paper introduces a Method of Moments embedding constraint combined with an Axis-Aligned Gaussian Mixture Model layer to improve semi-supervised learning by modeling joint distributions and reducing outlier sensitivity.

Contribution

It proposes a novel MoM-based embedding constraint and a GMM layer for semi-supervised learning, addressing outlier detection and joint distribution modeling.

Findings

01. MoM constraint matches FlexMatch accuracy

02. GMM layer models joint distribution effectively

03. Reduced outlier sensitivity in semi-supervised classification

Abstract

Discriminative deep learning models with a linear+softmax final layer have a problem: the latent space predicts only the conditional probabilities $p(Y \mid X)$, not the full joint distribution $p(Y, X)$; modeling the joint requires a generative approach. The conditional probability alone cannot detect outliers, which makes softmax networks sensitive to them. This sensitivity exacerbates model over-confidence, which affects many problems, such as hallucinations, confounding biases, and dependence on large datasets. To address this, we introduce a novel embedding constraint based on the Method of Moments (MoM). We investigate the use of polynomial moments ranging from 1st through 4th order hyper-covariance matrices. Furthermore, we use this embedding constraint to train an Axis-Aligned Gaussian Mixture Model (AAGMM) final layer, which learns not only the conditional but also the joint distribution of the latent space. We…
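The difference between the conditional and the joint distribution can be made concrete with a minimal NumPy sketch. It is not the authors' implementation: the function names, the choice of a standard normal $N(0, I)$ as the moment target, and the restriction to 1st and 2nd order moments (the abstract mentions up to 4th order) are all assumptions made for illustration. The sketch evaluates an axis-aligned (diagonal-covariance) Gaussian per class, derives both $p(Y \mid z)$ and $\log p(z)$ from it, and computes a simple moment-matching penalty on a batch of embeddings.

```python
import numpy as np

def aagmm_log_joint(z, means, log_vars, log_priors):
    """log p(z, y=k) with one diagonal-covariance Gaussian per class.

    z: (N, D) embeddings; means, log_vars: (K, D); log_priors: (K,).
    Returns an (N, K) array. All names here are hypothetical.
    """
    diff = z[:, None, :] - means[None, :, :]                  # (N, K, D)
    mahal = (diff ** 2 * np.exp(-log_vars)[None]).sum(axis=2)  # (N, K)
    log_norm = -0.5 * (z.shape[1] * np.log(2 * np.pi) + log_vars.sum(axis=1))
    return log_priors[None, :] + log_norm[None, :] - 0.5 * mahal

def posterior_and_density(z, means, log_vars, log_priors):
    """Return p(y | z) with shape (N, K) and log p(z) with shape (N,)."""
    log_joint = aagmm_log_joint(z, means, log_vars, log_priors)
    peak = log_joint.max(axis=1, keepdims=True)               # stable log-sum-exp
    log_pz = (peak + np.log(np.exp(log_joint - peak).sum(axis=1, keepdims=True)))[:, 0]
    return np.exp(log_joint - log_pz[:, None]), log_pz

def moment_penalty(z):
    """Squared deviation of the batch mean and covariance from N(0, I).

    Assumption: only 1st and 2nd order moments, with a standard normal target.
    """
    mu = z.mean(axis=0)
    centered = z - mu
    cov = centered.T @ centered / z.shape[0]
    return float((mu ** 2).sum() + ((cov - np.eye(z.shape[1])) ** 2).sum())

# Two classes centred at (-2, 0) and (2, 0), unit variance, equal priors.
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
log_vars = np.zeros((2, 2))
log_priors = np.log(np.array([0.5, 0.5]))

# First row is an inlier of class 1; second row is far from every class.
z = np.array([[2.0, 0.0], [50.0, 0.0]])
posterior, log_pz = posterior_and_density(z, means, log_vars, log_priors)

# Moment penalty on a batch that matches N(0, I) and on a shifted copy.
rng = np.random.default_rng(0)
batch = rng.standard_normal((20000, 3))
penalty_matched = moment_penalty(batch)
penalty_shifted = moment_penalty(batch + 3.0)
```

In this toy setup both points receive a class-1 posterior close to 1, so the conditional alone does not separate them, while `log_pz` for the distant point is lower by more than a thousand nats and flags it as an outlier. The shifted batch likewise incurs a much larger `moment_penalty` than the batch drawn from the target distribution.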

Taxonomy

Methods: Softmax