Latent Geometry of Taste: Scalable Low-Rank Matrix Factorization
By: Joshua Salako
Potential Business Impact:
Finds movies you'll like, even if new.
Scalability and data sparsity remain critical bottlenecks for collaborative filtering on massive interaction datasets. This work investigates the latent geometry of user preferences using the MovieLens 32M dataset, implementing a high-performance, parallelized Alternating Least Squares (ALS) framework. Through extensive hyperparameter optimization, we demonstrate that constrained low-rank models generalize significantly better than their higher-dimensional counterparts, achieving an optimal balance between Root Mean Square Error (RMSE) and ranking precision. We visualize the learned embedding space to reveal the unsupervised emergence of semantic genre clusters, confirming that the model captures deep structural relationships solely from interaction data. Finally, we validate the system's practical utility in a cold-start scenario, introducing a tunable scoring parameter that manages the trade-off between popularity bias and personalized affinity. The codebase for this research is available at: https://github.com/joshsalako/recommender.git
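For readers unfamiliar with ALS, the following is a minimal, single-threaded NumPy sketch of regularized low-rank factorization on explicit ratings. It is not taken from the linked repository and does not reproduce the paper's parallelized implementation. The function names (`als`, `rmse`), the dense boolean `mask` of observed ratings, and the default values of `rank`, `reg`, and `iters` are all illustrative assumptions.

```python
import numpy as np


def als(ratings, mask, rank=8, reg=0.1, iters=20, seed=0):
    """Illustrative ALS: alternate ridge-regression solves for users and items.

    ratings: (n_users, n_items) array; only entries where mask is True are used.
    mask:    boolean array of the same shape marking observed ratings.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = ratings.shape
    U = 0.1 * rng.standard_normal((n_users, rank))  # user factors
    V = 0.1 * rng.standard_normal((n_items, rank))  # item factors
    ridge = reg * np.eye(rank)

    for _ in range(iters):
        # Fix item factors, solve a small ridge system for each user.
        for u in range(n_users):
            idx = mask[u]
            if idx.any():
                Vu = V[idx]
                U[u] = np.linalg.solve(Vu.T @ Vu + ridge, Vu.T @ ratings[u, idx])
        # Fix user factors, solve a small ridge system for each item.
        for i in range(n_items):
            idx = mask[:, i]
            if idx.any():
                Ui = U[idx]
                V[i] = np.linalg.solve(Ui.T @ Ui + ridge, Ui.T @ ratings[idx, i])
    return U, V


def rmse(ratings, mask, U, V):
    """Root Mean Square Error over the observed entries only."""
    err = (U @ V.T - ratings)[mask]
    return float(np.sqrt(np.mean(err ** 2)))
```

Because each user update depends only on the fixed item factors (and vice versa), the inner loops are independent and can be distributed across workers, which is the property a parallelized ALS implementation typically exploits. The `rank` argument corresponds to the latent dimensionality that the abstract reports tuning.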
Similar Papers
Core-elements Subsampling for Alternating Least Squares
Methodology
Makes movie suggestions faster for everyone.
On the Mechanisms of Collaborative Learning in VAE Recommenders
Machine Learning (CS)
Helps streaming sites recommend better movies.
Recommendations with Sparse Comparison Data: Provably Fast Convergence for Nonconvex Matrix Factorization
Machine Learning (CS)
Learns what you like from item comparisons.