CountsDiff: A Diffusion Model on the Natural Numbers for Generation and Imputation of Count-Based Data
Renzo G. Soatto, Anders Hoel, Greycen Ren, Shorna Alam, Stephen Bates, Nikolaos P. Daskalakis, Caroline Uhler, Maria Skoularidou

TL;DR
CountsDiff is a diffusion model for count-based data that extends the Blackout diffusion framework and reports strong performance on natural-image and RNA-seq datasets.
Contribution
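It reparameterizes Blackout diffusion directly in terms of a survival probability schedule and an explicit loss weighting, and brings continuous-time training, classifier-free guidance, and non-monotone (churn/remasking) reverse dynamics to count data modeling.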
Findings
CountsDiff matches or surpasses state-of-the-art models in RNA-seq imputation.
The model performs well on natural image datasets like CIFAR-10 and CelebA.
The design parameters exposed by the reparameterization significantly influence model performance and flexibility.
Abstract
Diffusion models have excelled at generative tasks for both continuous and token-based domains, but their application to discrete ordinal data remains underdeveloped. We present CountsDiff, a diffusion framework designed to natively model distributions on the natural numbers. CountsDiff extends the Blackout diffusion framework by simplifying its formulation through a direct parameterization in terms of a survival probability schedule and an explicit loss weighting. This introduces flexibility through design parameters with direct analogues in existing diffusion modeling frameworks. Beyond this reparameterization, CountsDiff introduces features from modern diffusion models, previously absent in counts-based domains, including continuous-time training, classifier-free guidance, and churn/remasking reverse dynamics that allow non-monotone reverse trajectories. We propose an initial…
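To make the "survival probability schedule" concrete, here is a minimal NumPy sketch of binomial thinning, the forward corruption used by Blackout-style pure-death diffusion: each unit of a count survives to time t independently with probability alpha(t). This is an illustration under stated assumptions, not the authors' implementation. The schedule alpha(t) = exp(-t) and the function names `survival_prob`, `forward_thin`, and `bridge_step` are hypothetical choices made for this example. The bridge step conditions on a known clean count x_0; a trained model would presumably supply a prediction in its place.

```python
import numpy as np

def survival_prob(t):
    """Hypothetical schedule (an assumption): alpha(t) = exp(-t), so alpha(0) = 1."""
    return np.exp(-np.asarray(t, dtype=float))

def forward_thin(x0, t, rng):
    """Sample x_t | x_0 ~ Binomial(x_0, alpha(t)) elementwise (binomial thinning)."""
    x0 = np.asarray(x0, dtype=np.int64)
    return rng.binomial(x0, survival_prob(t))

def bridge_step(x0, xt, s, t, rng):
    """Sample x_s | x_t, x_0 for 0 <= s < t with t > 0.

    Each of the x_0 - x_t units lost by time t was still alive at time s
    with probability (alpha(s) - alpha(t)) / (1 - alpha(t)).
    """
    a_s, a_t = survival_prob(s), survival_prob(t)
    p = (a_s - a_t) / (1.0 - a_t)
    return xt + rng.binomial(x0 - xt, p)

rng = np.random.default_rng(0)
x0 = np.array([0, 3, 50, 1000])          # toy counts
xt = forward_thin(x0, 1.0, rng)          # corrupted counts, 0 <= xt <= x0
xs = bridge_step(x0, xt, 0.5, 1.0, rng)  # partially restored, xt <= xs <= x0
```

In this sketch counts can only shrink in the forward direction and only grow in the plain bridge step, so the resulting reverse trajectory is monotone. The churn/remasking dynamics mentioned in the abstract relax that property and are not modeled here.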
