MMbeddings: Parameter-Efficient, Low-Overfitting Probabilistic Embeddings Inspired by Nonlinear Mixed Models

Giora Simchoni; Saharon Rosset

arXiv:2510.22198 · stat.ML · November 4, 2025

TL;DR

MMbeddings is a probabilistic embedding method inspired by nonlinear mixed models. It reduces both the parameter count and overfitting in high-cardinality settings, and the authors report that it outperforms traditional embeddings on collaborative filtering and tabular regression tasks.

Contribution

The paper proposes MMbeddings, a novel probabilistic embedding approach that significantly reduces parameters and overfitting by modeling embeddings as latent effects within a variational autoencoder framework.
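To make the "latent effects within a variational autoencoder" idea concrete, here is a minimal sketch of the two generic VAE ingredients involved: sampling a latent embedding with the reparameterization trick, and penalizing its divergence from a prior. This is not the authors' implementation. The standard normal prior, the diagonal Gaussian posterior, and the example `mean` and `log_var` values (standing in for the output of a hypothetical encoder) are all assumptions made for illustration.

```python
import math
import random


def sample_embedding(mean, log_var, rng):
    """Reparameterization trick: b = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def kl_to_standard_normal(mean, log_var):
    """Closed-form KL( N(mean, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mean, log_var))


rng = random.Random(0)

# Hypothetical encoder output for a single category (assumed values).
mean = [0.3, -1.2, 0.0]
log_var = [-0.5, 0.1, 0.0]

# The embedding is a random draw rather than a row in a lookup table.
embedding = sample_embedding(mean, log_var, rng)

# The KL term acts as a regularizer pulling embeddings toward the assumed prior.
kl_penalty = kl_to_standard_normal(mean, log_var)
print(embedding, kl_penalty)
```

In a sketch like this, the per-category embedding is never stored as a free parameter; only the encoder that produces `mean` and `log_var` would be trained.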

Findings

1. Outperforms traditional embeddings in diverse tasks
2. Reduces parameters from cardinality-dependent to architecture-dependent
3. Mitigates overfitting in high-cardinality settings

Abstract

We present MMbeddings, a probabilistic embedding approach that reinterprets categorical embeddings through the lens of nonlinear mixed models, effectively bridging classical statistical theory with modern deep learning. By treating embeddings as latent random effects within a variational autoencoder framework, our method substantially decreases the number of parameters -- from the conventional embedding approach of cardinality $\times$ embedding dimension, which quickly becomes infeasible with large cardinalities, to a significantly smaller, cardinality-independent number determined primarily by the encoder architecture. This reduction dramatically mitigates overfitting and computational burden in high-cardinality settings. Extensive experiments on simulated and real datasets, encompassing collaborative filtering and tabular regression tasks using varied architectures, demonstrate that…
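The parameter-count argument in the abstract can be illustrated with simple arithmetic. The sketch below compares a standard lookup table (cardinality times embedding dimension) with a small fully connected encoder whose size does not depend on cardinality. The encoder layer widths are hypothetical and chosen only for illustration; the paper's actual architectures may differ.

```python
def embedding_table_params(cardinality: int, embed_dim: int) -> int:
    """Parameters in a standard lookup table: one vector per category."""
    return cardinality * embed_dim


def mlp_encoder_params(layer_sizes: list[int]) -> int:
    """Weights plus biases of a fully connected network with the given widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


embed_dim = 10

# Hypothetical encoder: 32 inputs -> 64 hidden units -> mean and log-variance
# for each embedding dimension (2 * embed_dim outputs). Sizes are assumed.
encoder_layers = [32, 64, 2 * embed_dim]
encoder_count = mlp_encoder_params(encoder_layers)

for cardinality in (1_000, 100_000, 10_000_000):
    table_count = embedding_table_params(cardinality, embed_dim)
    print(f"cardinality={cardinality:>10,}  "
          f"table={table_count:>12,}  encoder={encoder_count:,}")
```

The table count grows linearly with cardinality, while the encoder count stays fixed at whatever the chosen architecture implies.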
