Sparse Data Diffusion for Scientific Simulations in Biology and Physics
Phil Ostheimer, Mayank Nagda, Andriy Balinskyy, Jean Radig, Carl Herrmann, Stephan Mandt, Marius Kloft, Sophie Fellenz

TL;DR
This paper introduces Sparse Data Diffusion (SDD), a generative model that explicitly handles exact zeros in sparse scientific data, improving fidelity in simulations for biology and physics.
Contribution
The work presents SDD, a novel diffusion approach that models physical sparsity using Sparsity Bits, unifying machine learning generation with physical accuracy.
Findings
SDD achieves higher fidelity than baseline methods on sparse data
Empirical validation in particle physics and single-cell biology shows that SDD captures sparse patterns more faithfully than baselines
The approach advances scalable, physically faithful scientific simulation
Abstract
Sparse data is fundamental to scientific simulations in biology and physics, from single-cell gene expression to particle calorimetry, where exact zeros encode physical absence rather than weak signal. However, existing diffusion models lack the physical rigor to faithfully represent this sparsity. This work introduces Sparse Data Diffusion (SDD), a generative method that explicitly models exact zeros via Sparsity Bits, unifying efficient ML generation with physically grounded sparsity handling. Empirical validation in particle physics and single-cell biology demonstrates that SDD achieves higher fidelity than baseline methods in capturing sparse patterns critical for scientific analysis, advancing scalable and physically faithful simulation.
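The abstract does not spell out how Sparsity Bits are encoded, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each entry carries an extra channel set to +1 for a non-zero value and -1 for an exact zero, and that a generated sample is decoded by thresholding that channel. The function names `encode_with_sparsity_bits` and `decode_with_sparsity_bits`, the ±1 convention, the threshold of 0, and the Gaussian perturbation standing in for a generative model's output are all illustrative assumptions.

```python
import numpy as np


def encode_with_sparsity_bits(x):
    """Stack the data with a +/-1 sparsity-bit channel.

    Assumption (not from the paper): +1 marks a non-zero entry,
    -1 marks an exact zero.
    """
    x = np.asarray(x, dtype=np.float64)
    bits = np.where(x != 0, 1.0, -1.0)
    return np.stack([x, bits], axis=0)


def decode_with_sparsity_bits(sample, threshold=0.0):
    """Map a two-channel sample back to sparse data.

    Entries whose sparsity bit is at or below the threshold are set
    to exactly zero; all other entries keep their value.
    """
    values, bits = sample
    return np.where(bits > threshold, values, 0.0)


x = np.array([0.0, 3.2, 0.0, 0.0, 1.5])
encoded = encode_with_sparsity_bits(x)

# Hypothetical stand-in for a generative model's output:
# both channels are perturbed with small Gaussian noise.
rng = np.random.default_rng(0)
noisy = encoded + 0.1 * rng.standard_normal(encoded.shape)

decoded = decode_with_sparsity_bits(noisy)
print(decoded == 0)  # zero pattern of the decoded sample
```

In this sketch the value channel alone would turn the zeros into small non-zero numbers after perturbation, while thresholding the bit channel restores them as exact zeros. That is the property the abstract attributes to modeling sparsity explicitly.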
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Simulation Techniques and Applications · Traffic Prediction and Management Techniques
Methods: Diffusion
