Score Matching With Missing Data

Josh Givens; Song Liu; Henry W J Reeve

arXiv:2506.00557·stat.ML·June 3, 2025

Open Access

TL;DR

This paper extends score matching to incomplete data, introducing an importance weighting (IW) approach that performs best in small-sample, low-dimensional settings and a variational approach that performs best in complex, high-dimensional settings.

Contribution

The paper proposes two score matching adaptations for missing data, an importance weighting (IW) approach and a variational approach, and provides finite sample bounds for the IW approach in finite domain settings.

Findings

01

The IW approach excels in small-sample, low-dimensional settings.

02

Variational approach performs best in high-dimensional, complex data.

03

Both methods improve score matching applicability to incomplete datasets.

Abstract

Score matching is a vital tool for learning the distribution of data with applications across many areas including diffusion processes, energy-based modelling, and graphical model estimation. Despite all these applications, little work explores its use when data is incomplete. We address this by adapting score matching (and its major extensions) to work with missing data in a flexible setting where data can be partially missing over any subset of the coordinates. We provide two separate score matching variations for general use: an importance weighting (IW) approach and a variational approach. We provide finite sample bounds for our IW approach in finite domain settings and show it to have especially strong performance in small-sample, lower-dimensional cases. Complementing this, we show our variational approach to be strongest in more complex high-dimensional settings which we…
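For readers unfamiliar with the base technique, the following is a minimal sketch of classical score matching on complete data, not the paper's IW or variational method. It assumes a 1-D Gaussian model, for which the Hyvärinen objective (the mean of the derivative of the model score plus half the squared score) has a closed-form minimiser. The function names and the simulated data are hypothetical and chosen only for illustration.

```python
import random
import statistics


def gaussian_score_matching_loss(xs, mu, var):
    """Empirical Hyvarinen score matching objective for a 1-D Gaussian model.

    Model score: s(x) = -(x - mu) / var, so ds/dx = -1 / var.
    Objective:   mean( ds/dx + 0.5 * s(x)**2 ).
    """
    total = 0.0
    for x in xs:
        score = -(x - mu) / var
        total += -1.0 / var + 0.5 * score ** 2
    return total / len(xs)


def fit_gaussian_by_score_matching(xs):
    """Closed-form minimiser of the objective above.

    Setting the gradient to zero gives the sample mean and the
    (biased) sample variance.
    """
    mu_hat = statistics.fmean(xs)
    var_hat = statistics.pvariance(xs, mu_hat)
    return mu_hat, var_hat


# Simulated complete data (assumption: true mean 2.0, true std 1.5).
random.seed(0)
data = [random.gauss(2.0, 1.5) for _ in range(5000)]
mu_hat, var_hat = fit_gaussian_by_score_matching(data)
best_loss = gaussian_score_matching_loss(data, mu_hat, var_hat)
```

The objective never requires the normalising constant of the model density, which is the main appeal of score matching. Every term does require a fully observed sample, though, which is the limitation the paper addresses.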

Taxonomy

Topics: Time Series Analysis and Forecasting · Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications

Methods: Diffusion