Generalization Bounds for Semi-supervised Matrix Completion with Distributional Side Information

Antoine Ledent; Mun Chong Soo; Nong Minh Hieu

arXiv:2511.13049·cs.LG·November 24, 2025

TL;DR

This paper develops theoretical error bounds for semi-supervised matrix completion leveraging both labeled and unlabeled data, with applications to recommender systems using explicit and implicit feedback.

Contribution

It introduces a novel analysis combining low-rank subspace recovery with generalization bounds for semi-supervised matrix completion, accounting for distributional side information.

Findings

01

Error bounds comprise two terms, one governed by $\frac{nd}{M}$ and the other by $\frac{dr}{N}$

02

Synthetic experiments confirm independent error components for P and R estimation

03

Real-world experiments show improved performance over explicit-only baselines

Abstract

We study a matrix completion problem where both the ground truth matrix $R$ and the unknown sampling distribution $P$ over observed entries are low-rank matrices, and share a common subspace. We assume that a large amount $M$ of unlabeled data drawn from the sampling distribution $P$ is available, together with a small amount $N$ of labeled data drawn from the same distribution and noisy estimates of the corresponding ground truth entries. This setting is inspired by recommender system scenarios where the unlabeled data corresponds to 'implicit feedback' (consisting of interactions such as purchases, clicks, etc.) and the labeled data corresponds to 'explicit feedback', consisting of interactions where the user has given an explicit rating to the item. Leveraging powerful results from the theory of low-rank subspace recovery, together with classic generalization…
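The data model described in the abstract can be sketched with a small synthetic generator. This is only an illustration under stated assumptions, not the authors' experimental setup: the function name `make_shared_subspace_problem`, the matrix sizes, the Gaussian noise model, and the specific reading of "share a common subspace" (the column space of $R$ is contained in the column space of $P$) are all hypothetical choices made here for concreteness.

```python
import numpy as np


def make_shared_subspace_problem(m=30, n=40, d=3, r=2, M=5000, N=200,
                                 noise=0.1, seed=0):
    """Draw a toy instance: low-rank P and R with a shared column space.

    Assumption (not from the paper): R's column space lies inside P's.
    """
    rng = np.random.default_rng(seed)

    # Nonnegative factors give a nonnegative rank-d matrix, normalized
    # so that P is a probability distribution over the m*n entries.
    A = rng.random((m, d))
    B = rng.random((n, d))
    P = A @ B.T
    P /= P.sum()

    # R reuses the row factor A, so col(R) is a subspace of col(P).
    W = rng.standard_normal((d, r))
    V = rng.standard_normal((n, r))
    R = A @ W @ V.T

    # M unlabeled samples: entry indices only ("implicit feedback").
    flat_u = rng.choice(m * n, size=M, p=P.ravel())
    rows_u, cols_u = np.unravel_index(flat_u, (m, n))

    # N labeled samples: indices plus noisy ratings ("explicit feedback").
    flat_l = rng.choice(m * n, size=N, p=P.ravel())
    rows_l, cols_l = np.unravel_index(flat_l, (m, n))
    ratings = R[rows_l, cols_l] + noise * rng.standard_normal(N)

    return P, R, (rows_u, cols_u), (rows_l, cols_l, ratings)


def empirical_distribution(rows, cols, shape):
    """Estimate P by the observed entry frequencies of unlabeled samples."""
    counts = np.zeros(shape)
    np.add.at(counts, (rows, cols), 1)  # handles repeated indices correctly
    return counts / len(rows)


P, R, unlabeled, labeled = make_shared_subspace_problem()
P_hat = empirical_distribution(*unlabeled, P.shape)
```

In this toy setup the unlabeled samples carry information about $P$ only, while the labeled samples are the only source of information about the entries of $R$, which mirrors the split between the $M$-dependent and $N$-dependent quantities in the summary above.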

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Tensor decomposition and applications