Score: 2

Generalization Bounds for Semi-supervised Matrix Completion with Distributional Side Information

Published: November 17, 2025 | arXiv ID: 2511.13049v1

By: Antoine Ledent, Mun Chong Soo, Nong Minh Hieu

BigTech Affiliations: Binance

Potential Business Impact:

Helps movie apps guess what you'll like.

Business Areas:

Crowdsourcing Collaboration

We study a matrix completion problem where both the ground truth $R$ matrix and the unknown sampling distribution $P$ over observed entries are low-rank matrices, and \textit{share a common subspace}. We assume that a large amount $M$ of \textit{unlabeled} data drawn from the sampling distribution $P$ is available, together with a small amount $N$ of labeled data drawn from the same distribution and noisy estimates of the corresponding ground truth entries. This setting is inspired by recommender systems scenarios where the unlabeled data corresponds to `implicit feedback' (consisting in interactions such as purchase, click, etc. ) and the labeled data corresponds to the `explicit feedback', consisting of interactions where the user has given an explicit rating to the item. Leveraging powerful results from the theory of low-rank subspace recovery, together with classic generalization bounds for matrix completion models, we show error bounds consisting of a sum of two error terms scaling as $\widetilde{O}\left(\sqrt{\frac{nd}{M}}\right)$ and $\widetilde{O}\left(\sqrt{\frac{dr}{N}}\right)$ respectively, where $d$ is the rank of $P$ and $r$ is the rank of $M$. In synthetic experiments, we confirm that the true generalization error naturally splits into independent error terms corresponding to the estimations of $P$ and and the ground truth matrix $\ground$ respectively. In real-life experiments on Douban and MovieLens with most explicit ratings removed, we demonstrate that the method can outperform baselines relying only on the explicit ratings, demonstrating that our assumptions provide a valid toy theoretical setting to study the interaction between explicit and implicit feedbacks in recommender systems.

Generalization Bounds for Semi-supervised Matrix Completion with Distributional Side Information

Machine Learning (CS)

Helps movie sites guess what you'll like.

17 Nov 2025 2

87%

Low rank matrix completion and realization of graphs: results and problems

History and Overview

Helps Netflix guess movies you'll like.

10 Jan 2025 0

87%

Match Made with Matrix Completion: Efficient Learning under Matching Interference

Machine Learning (Stat)

Helps match people to jobs faster with less data.

11 Jan 2026 1

View PDF Login to Bookmark

Country of Origin

🇸🇬 🇰🇾 Cayman Islands, Singapore

Page Count

21 pages

Generalization Bounds for Semi-supervised Matrix Completion with Distributional Side Information

Helps movie apps guess what you'll like.

Technical Abstract

Generalization Bounds for Semi-supervised Matrix Completion with Distributional Side Information

Low rank matrix completion and realization of graphs: results and problems

Match Made with Matrix Completion: Efficient Learning under Matching Interference