MMbeddings: Parameter-Efficient, Low-Overfitting Probabilistic Embeddings Inspired by Nonlinear Mixed Models
Giora Simchoni, Saharon Rosset

TL;DR
MMbeddings introduces a probabilistic embedding method inspired by nonlinear mixed models, reducing parameters and overfitting in high-cardinality scenarios, and outperforming traditional embeddings across various tasks.
Contribution
The paper proposes MMbeddings, a novel probabilistic embedding approach that significantly reduces parameters and overfitting by modeling embeddings as latent effects within a variational autoencoder framework.
Findings
Outperforms traditional embeddings in diverse tasks
Reduces parameters from cardinality-dependent to architecture-dependent
Mitigates overfitting in high-cardinality settings
Abstract
We present MMbeddings, a probabilistic embedding approach that reinterprets categorical embeddings through the lens of nonlinear mixed models, effectively bridging classical statistical theory with modern deep learning. By treating embeddings as latent random effects within a variational autoencoder framework, our method substantially decreases the number of parameters -- from the conventional embedding approach of cardinality × embedding dimension, which quickly becomes infeasible with large cardinalities, to a significantly smaller, cardinality-independent number determined primarily by the encoder architecture. This reduction dramatically mitigates overfitting and computational burden in high-cardinality settings. Extensive experiments on simulated and real datasets, encompassing collaborative filtering and tabular regression tasks using varied architectures, demonstrate that…
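To make the parameter-count claim concrete, the following is a minimal sketch that compares a conventional embedding table (cardinality × embedding dimension) with a small encoder whose size does not depend on cardinality. The encoder here is a hypothetical fully connected network with mean and log-variance heads, as is common in variational autoencoders; its layer sizes, its input width, and the function names are assumptions for illustration and are not taken from the paper.

```python
def embedding_table_params(cardinality: int, dim: int) -> int:
    """Parameters in a conventional lookup table: one dim-sized vector per category."""
    return cardinality * dim


def mlp_encoder_params(input_dim: int, hidden_dims: list[int], latent_dim: int) -> int:
    """Parameters in a hypothetical MLP encoder (weights plus biases).

    Assumed layout: dense hidden layers followed by two output heads,
    one for the latent mean and one for the latent log-variance.
    """
    sizes = [input_dim, *hidden_dims]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    total += 2 * (sizes[-1] * latent_dim + latent_dim)  # mean and log-variance heads
    return total


if __name__ == "__main__":
    dim = 32
    encoder = mlp_encoder_params(input_dim=64, hidden_dims=[128, 64], latent_dim=dim)
    for cardinality in (1_000, 100_000, 10_000_000):
        table = embedding_table_params(cardinality, dim)
        print(f"cardinality={cardinality:>10,}  table={table:>12,}  encoder={encoder:,}")
```

Under these assumed sizes the encoder count stays fixed while the table count grows linearly with the number of categories, which is the scaling behavior the abstract describes.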
