Latent Geometry of Taste: Scalable Low-Rank Matrix Factorization for Recommender Systems
Joshua Salako

TL;DR
This paper presents a scalable low-rank matrix factorization approach for recommender systems that captures latent semantic structures and improves generalization, especially in cold-start scenarios, demonstrated on the MovieLens 32M dataset.
Contribution
It introduces a high-performance parallelized ALS framework with hyperparameter optimization, revealing semantic genre clusters and balancing bias and personalization in recommendations.
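As a rough illustration of the ALS idea named above, the sketch below factorizes a small ratings matrix into user and item factors by alternating ridge-regression solves. It is not the paper's implementation: the function names (`als_factorize`, `observed_rmse`), the dense boolean mask, the toy data, and the values of `rank`, `reg`, and `n_iters` are all assumptions chosen for readability. Each per-user (or per-item) solve is independent of the others, which is the property a parallel implementation can exploit.

```python
import numpy as np


def als_factorize(R, mask, rank=2, reg=0.1, n_iters=20, seed=0):
    """Approximate R by U @ V.T, fitting only entries where mask is True.

    Hypothetical helper for illustration; hyperparameters are placeholders.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_items, rank))
    ridge = reg * np.eye(rank)

    for _ in range(n_iters):
        # Hold V fixed: each user's factor vector is a small ridge regression.
        for u in range(n_users):
            seen = mask[u]
            if seen.any():
                Vu = V[seen]
                U[u] = np.linalg.solve(Vu.T @ Vu + ridge, Vu.T @ R[u, seen])
        # Hold U fixed: solve the symmetric problem for each item.
        for i in range(n_items):
            seen = mask[:, i]
            if seen.any():
                Ui = U[seen]
                V[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ R[seen, i])
    return U, V


def observed_rmse(R, mask, U, V):
    """Root mean square error over the observed entries only."""
    err = (R - U @ V.T)[mask]
    return float(np.sqrt(np.mean(err ** 2)))


# Toy data (assumed): a rank-2 matrix with roughly 70% of entries observed.
rng = np.random.default_rng(1)
R = rng.uniform(0.5, 1.5, (6, 2)) @ rng.uniform(0.5, 1.5, (5, 2)).T
mask = rng.random(R.shape) < 0.7

U, V = als_factorize(R, mask)
print("observed RMSE:", observed_rmse(R, mask, U, V))
```

A production system would store ratings sparsely and distribute the per-user and per-item solves across workers; the dense loops here only show the alternating structure.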
Findings
Constrained low-rank models generalize better than higher-dimensional ones, balancing RMSE and ranking precision
Semantic genre clusters emerge from interaction data
A tunable scoring parameter improves cold-start recommendation quality
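The summary does not give the exact form of the tunable scoring parameter, so the following is only a guess at how such a knob could trade item bias against personalization. The scoring rule, the name `beta`, and the helpers `cold_start_scores` and `top_k` are all hypothetical and are not taken from the paper.

```python
import numpy as np


def cold_start_scores(user_vec, item_factors, item_bias, beta=1.0):
    """Hypothetical rule: personalization term plus beta-weighted item bias.

    beta = 0 ranks purely by the user-item dot product; larger beta
    pushes generally well-rated (high-bias) items up the list.
    """
    return item_factors @ user_vec + beta * item_bias


def top_k(scores, k):
    """Indices of the k highest-scoring items, best first."""
    return np.argsort(-scores)[:k]


# Assumed toy values: item 2 has a high bias but only a moderate match.
item_factors = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
item_bias = np.array([0.0, 0.0, 2.0])
user_vec = np.array([1.0, 0.0])

personalized = top_k(cold_start_scores(user_vec, item_factors, item_bias, beta=0.0), 1)
bias_heavy = top_k(cold_start_scores(user_vec, item_factors, item_bias, beta=1.0), 1)
print(personalized, bias_heavy)
```

With these toy values, setting `beta` to zero favors the item closest to the user's factor vector, while a larger `beta` lets the high-bias item take the top slot. For a new user with few ratings, a larger value leans on item-level signal that does not depend on a well-estimated user vector.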
Abstract
Scalability and data sparsity remain critical bottlenecks for collaborative filtering on massive interaction datasets. This work investigates the latent geometry of user preferences using the MovieLens 32M dataset, implementing a high-performance, parallelized Alternating Least Squares (ALS) framework. Through extensive hyperparameter optimization, we demonstrate that constrained low-rank models significantly outperform higher-dimensional counterparts in generalization, achieving an optimal balance between Root Mean Square Error (RMSE) and ranking precision. We visualize the learned embedding space to reveal the unsupervised emergence of semantic genre clusters, confirming that the model captures deep structural relationships solely from interaction data. Finally, we validate the system's practical utility in a cold-start scenario, introducing a tunable scoring parameter to manage the…
Taxonomy
Topics: Recommender Systems and Techniques · Mobile Crowdsensing and Crowdsourcing · Sentiment Analysis and Opinion Mining
