Understanding and Scaling Collaborative Filtering Optimization from the Perspective of Matrix Rank
Donald Loveland, Xinyi Wu, Tong Zhao, Danai Koutra, Neil Shah, Mingxuan Ju

TL;DR
This paper analyzes collaborative filtering embeddings through matrix rank properties, revealing how stable rank influences training efficiency and performance, and proposes a regularization strategy to improve scalability and effectiveness.
Contribution
The paper establishes a theoretical link between CF loss functions and the singular values of the embedding matrices, and proposes a stable rank regularization method that improves training speed and recommendation quality.
Findings
Encouraging a higher stable rank through regularization improves training speed by up to 66%.
Stable rank regularization can act as a substitute for negative sampling, boosting performance by up to 21%.
Theoretical analysis connects CF loss functions with matrix singular values.
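To make the central quantity concrete, here is a minimal sketch of computing the stable rank of an embedding table. It assumes the standard definition, the squared Frobenius norm divided by the squared spectral norm, which may differ in detail from the exact formulation used in the paper. The function name `stable_rank` and the toy matrices are illustrative, not taken from the source.

```python
import numpy as np


def stable_rank(embeddings: np.ndarray) -> float:
    """Stable rank under the standard definition (an assumption here):
    sum of squared singular values divided by the largest one squared."""
    # Singular values are returned in descending order.
    singular_values = np.linalg.svd(embeddings, compute_uv=False)
    return float(np.sum(singular_values ** 2) / singular_values[0] ** 2)


rng = np.random.default_rng(0)

# Toy "well-spread" table: rows point in many different directions.
spread = rng.normal(size=(1000, 16))

# Toy "collapsed" table: every row lies almost along one shared direction.
direction = rng.normal(size=(1, 16))
collapsed = rng.normal(size=(1000, 1)) * direction + 0.01 * rng.normal(size=(1000, 16))

print("spread:   ", stable_rank(spread))
print("collapsed:", stable_rank(collapsed))
```

Under this definition the value lies between 1 and the matrix rank, so a table whose rows collapse onto one direction scores close to 1, while a table that uses all of its dimensions evenly scores close to the embedding dimension.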
Abstract
Collaborative Filtering (CF) methods dominate real-world recommender systems given their ability to learn high-quality, sparse ID-embedding tables that effectively capture user preferences. These tables scale linearly with the number of users and items, and are trained to ensure high similarity between embeddings of interacted user-item pairs, while maintaining low similarity for non-interacted pairs. Despite their high performance, encouraging dispersion for non-interacted pairs necessitates expensive regularization (e.g., negative sampling), hurting runtime and scalability. Existing research tends to address these challenges by simplifying the learning process, either by reducing model complexity or sampling data, trading performance for runtime. In this work, we move beyond model-level modifications and study the properties of the embedding tables under different learning strategies.…
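The training objective described in the abstract, high similarity for interacted pairs and low similarity for non-interacted pairs, can be illustrated with a BPR-style pairwise loss over sampled negatives. This is a common choice in CF and is used here only as an example. It is not necessarily one of the losses studied in the paper, and all names (`bpr_loss`, `num_negatives`, the toy sizes) are hypothetical.

```python
import numpy as np


def bpr_loss(user_emb, item_emb, users, pos_items, neg_items):
    """BPR-style loss: mean of -log(sigmoid(pos_score - neg_score)).

    users:     (B,)   user indices
    pos_items: (B,)   one interacted item per user
    neg_items: (B, K) K sampled non-interacted items per user
    """
    u = user_emb[users]            # (B, d)
    pos = item_emb[pos_items]      # (B, d)
    neg = item_emb[neg_items]      # (B, K, d)
    pos_scores = np.sum(u * pos, axis=-1, keepdims=True)  # (B, 1)
    neg_scores = np.einsum("bd,bkd->bk", u, neg)           # (B, K)
    diff = pos_scores - neg_scores
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed stably with logaddexp.
    return float(np.mean(np.logaddexp(0.0, -diff)))


rng = np.random.default_rng(0)
num_users, num_items, dim = 50, 200, 8   # toy sizes, chosen arbitrarily
batch_size, num_negatives = 4, 16

user_emb = 0.1 * rng.normal(size=(num_users, dim))
item_emb = 0.1 * rng.normal(size=(num_items, dim))

users = rng.integers(0, num_users, size=batch_size)
pos_items = rng.integers(0, num_items, size=batch_size)
# Uniform sampling as a simplification; it does not exclude true positives.
neg_items = rng.integers(0, num_items, size=(batch_size, num_negatives))

print(bpr_loss(user_emb, item_emb, users, pos_items, neg_items))
```

Each training example touches `num_negatives` extra item embeddings, so the per-batch cost grows roughly linearly with the negative sampling ratio. That is the runtime overhead the abstract refers to.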
Taxonomy
Topics: E-commerce and Technology Innovations · Educational Reforms and Innovations · Medical Research and Treatments
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
