Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption
Wei Ma, George H. Chen

TL;DR
This paper estimates missingness probabilities for matrix completion with entries missing not at random (MNAR) by assuming a low nuclear norm structure. The estimated probabilities are used to correct bias and improve prediction accuracy, without the strong modeling assumptions of logistic regression or naive Bayes.
Contribution
The paper proposes a simple approach to estimating missingness probabilities under a nuclear norm constraint. It avoids the assumptions made by logistic regression and naive Bayes models and comes with theoretical error bounds.
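As a rough illustration of estimating missingness probabilities under a nuclear norm constraint, the sketch below fits a matrix of logits to the 0/1 observation mask by projected gradient descent, projecting onto a nuclear norm ball after each step. The logistic link, the function names `project_nuclear_ball` and `estimate_propensities`, and the parameters `radius`, `step`, and `iters` are all assumptions made for this example. It is not the authors' implementation, and it omits any additional constraints the paper may impose.

```python
import numpy as np

def project_simplex_l1(s, radius):
    """Project a nonnegative vector onto {x >= 0, sum(x) <= radius}."""
    if s.sum() <= radius:
        return s
    u = np.sort(s)[::-1]                      # sort in descending order
    css = np.cumsum(u)
    k = np.arange(1, len(u) + 1)
    rho = np.nonzero(u - (css - radius) / k > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)   # shrinkage threshold
    return np.maximum(s - theta, 0.0)

def project_nuclear_ball(A, radius):
    """Euclidean projection of A onto {B : ||B||_* <= radius}."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * project_simplex_l1(s, radius)) @ Vt

def estimate_propensities(mask, radius, step=1.0, iters=200):
    """Estimate P[i, j] = Pr(entry (i, j) is observed) from a 0/1 mask.

    Assumption of this sketch: P = sigmoid(A) with ||A||_* <= radius.
    """
    A = np.zeros(mask.shape)
    for _ in range(iters):
        P = 1.0 / (1.0 + np.exp(-A))
        grad = P - mask                       # gradient of the Bernoulli negative log-likelihood
        A = project_nuclear_ball(A - step * grad, radius)
    return 1.0 / (1.0 + np.exp(-A))
```

A call such as `estimate_propensities(mask, radius=5.0)` returns a matrix of the same shape as `mask` with entries strictly between 0 and 1. The value of `radius` is a tuning parameter that would need to be chosen, for example by validation.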
Findings
Improved matrix completion accuracy over baseline methods.
Finite-sample error bounds for probability estimates.
Debiasing enhances various existing algorithms.
Abstract
Matrix completion is often applied to data with entries missing not at random (MNAR). For example, consider a recommendation system where users tend to only reveal ratings for items they like. In this case, a matrix completion method that relies on entries being revealed at uniformly sampled row and column indices can yield overly optimistic predictions of unseen user ratings. Recently, various papers have shown that we can reduce this bias in MNAR matrix completion if we know the probabilities of different matrix entries being missing. These probabilities are typically modeled using logistic regression or naive Bayes, which make strong assumptions and lack guarantees on the accuracy of the estimated probabilities. In this paper, we suggest a simple approach to estimating these probabilities that avoids these shortcomings. Our approach follows from the observation that missingness…
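The abstract notes that bias can be reduced when the probabilities of entries being missing are known. One common way to use such probabilities is inverse propensity weighting, sketched below for a squared-error loss. The function name `ipw_squared_error` and the clipping threshold `min_prob` are hypothetical choices for this example, and the summary does not state which debiased loss the paper uses.

```python
import numpy as np

def ipw_squared_error(X, X_hat, mask, P_hat, min_prob=0.05):
    """Inverse-propensity-weighted squared error over observed entries.

    X      : ratings matrix (values at unobserved entries are ignored)
    X_hat  : predicted ratings
    mask   : 0/1 matrix, 1 where an entry of X is observed
    P_hat  : estimated probability that each entry is observed
    """
    # Clip small probabilities so that single entries cannot dominate the loss.
    weights = mask / np.clip(P_hat, min_prob, 1.0)
    # Zero out residuals at unobserved entries (X may hold NaN there).
    resid = np.where(mask == 1, X - X_hat, 0.0)
    return np.sum(weights * resid ** 2) / mask.size

# Two of four entries observed, each with estimated probability 0.5.
X = np.array([[1.0, np.nan], [np.nan, 4.0]])
mask = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = ipw_squared_error(X, np.zeros((2, 2)), mask, np.full((2, 2), 0.5))
print(loss)  # 8.5
```

Each observed entry is weighted by the reciprocal of its estimated probability of being observed, so entries that are rarely revealed count for more. When the probabilities are accurate, the weighted loss is an unbiased estimate of the loss over the full matrix.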
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Bayesian Modeling and Causal Inference · Statistical Methods and Bayesian Inference
Methods: Logistic Regression
