Count Bridges enable Modeling and Deconvolving Transcriptomic Data
Nic Fishman, Gokul Gowri, Tanush Kumar, Jiaqi Lu, Valentin de Bortoli, Jonathan S. Gootenberg, Omar Abudayyeh

TL;DR
Count Bridges introduces a novel stochastic process for modeling and deconvolving integer-valued biological count data, enabling accurate single-cell resolution analysis from aggregated measurements with state-of-the-art performance.
Contribution
The paper presents Count Bridges, a new tractable stochastic bridge process for count data, extending diffusion models to integer-valued data and enabling direct training from aggregated measurements.
Findings
State-of-the-art performance on integer distribution benchmarks
Effective deconvolution of bulk RNA-seq data
Successful resolution of spatial transcriptomic spots into single-cell profiles
Abstract
Many modern biological assays, including RNA sequencing, yield integer-valued counts that reflect the number of molecules detected. These measurements are often not at the desired resolution: while the unit of interest is typically a single cell, many measurement technologies produce counts aggregated over sets of cells. Although recent generative frameworks such as diffusion and flow matching have been extended to non-Euclidean and discrete settings, it remains unclear how best to model integer-valued data or how to systematically deconvolve aggregated observations. We introduce Count Bridges, a stochastic bridge process on the integers that provides an exact, tractable analogue of diffusion-style models for count data, with closed-form conditionals for efficient training and sampling. We extend this framework to enable direct training from aggregated measurements via an…
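To make the aggregation setting concrete, here is a minimal sketch that is not taken from the paper. It assumes a toy model in which each cell's count for a single gene is an independent Poisson draw with a hypothetical rate, and the assay reports only the sum over cells. Under that assumption, the classical Poisson-multinomial identity gives the distribution of per-cell counts given the aggregate. The names `aggregate`, `sample_cells_given_total`, and `rates` are illustrative.

```python
import numpy as np

def aggregate(cell_counts):
    """Sum per-cell counts (cells x genes) into one aggregated profile."""
    return np.asarray(cell_counts).sum(axis=0)

def sample_cells_given_total(total, rates, rng):
    """Sample per-cell counts for one gene given their sum.

    Toy assumption: cells are independent Poisson(rates[i]). Conditional on
    the sum, the counts are Multinomial(total, rates / rates.sum()).
    """
    rates = np.asarray(rates, dtype=float)
    return rng.multinomial(total, rates / rates.sum())

rng = np.random.default_rng(0)
rates = np.array([1.0, 4.0, 10.0])   # hypothetical per-cell rates
cells = rng.poisson(rates)           # unobserved single-cell counts
total = int(cells.sum())             # what the aggregated assay reports
resampled = sample_cells_given_total(total, rates, rng)
```

Any resampled vector is a non-negative integer split of `total`, which is the constraint a deconvolution method has to respect. A learned model would replace the fixed `rates` with something far richer.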
Peer Reviews
Decision·ICLR 2026 Poster
This is one of the best papers I have read this year. The only reason I did not give the submission the highest presentation score is that some concepts were left unexplained or taken for granted (I discuss this in Weaknesses). The originality and quality of the paper are top-class, especially considering the very clever use of the EM algorithm to solve the deconvolution problem. Furthermore, modeling probability bridges directly on the space of count data is of great importance in genomics
### Distinction to SBs
Why should Count Bridges not be considered as an instance of Schrödinger Bridges? SBs are not constrained to continuous measures, and there is no statement about why CBs are not SBs (there is instead a note on the connection between SBs and CBs in line 172). I think this requires some clarity to avoid the risk of artificially distancing CBs from SBs in order to promote novelty.
### Clarity
Surprisingly, I was mainly concerned with the clarity in Sec. 3. I found that key
1. Introduces an integer-native diffusion process (birth–death Poisson kernel) that naturally preserves counts.
2. Derives analytic intermediate conditionals (Binomial / Hypergeometric / Bessel), which support principled local-bridge sampling.
3. Practical deconvolution pipeline: Gives a workable EM-style approach (projection-guided diffusion + aggregate-level loss) to infer unit-level counts from aggregate observations.
4. Broad evaluation: Tests on synthetic and multiple real-world deconvol
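As an illustration of how Binomial and Bessel conditionals can arise for a birth-death bridge, the sketch below uses a textbook construction rather than the paper's exact process. It assumes births and deaths are independent Poisson processes with a common hypothetical rate `lam` on the time interval [0, 1], and conditions on both endpoints. The function names and the truncation bound `m_max` are assumptions made for this example. This toy process lives on all integers and does not enforce non-negativity.

```python
import math
import numpy as np

def sample_pair_count(d, lam, rng, m_max=200):
    """Sample M = min(births, deaths) on [0, 1] given net change d.

    P(M = m) is proportional to lam**(2m + |d|) / (m! (m + |d|)!), a Bessel
    distribution. The support is truncated at m_max (an approximation).
    Requires lam > 0.
    """
    m = np.arange(m_max + 1)
    log_w = np.array([(2 * k + abs(d)) * math.log(lam)
                      - math.lgamma(k + 1) - math.lgamma(k + abs(d) + 1)
                      for k in m])
    w = np.exp(log_w - log_w.max())   # stabilise before normalising
    return int(rng.choice(m, p=w / w.sum()))

def sample_bridge_point(x0, x1, t, lam, rng):
    """Sample X_t for 0 <= t <= 1 given X_0 = x0 and X_1 = x1."""
    d = x1 - x0
    pairs = sample_pair_count(d, lam, rng)
    births = pairs + max(d, 0)    # total births on [0, 1]
    deaths = pairs + max(-d, 0)   # total deaths on [0, 1]
    # Given the totals, jump times are i.i.d. uniform on [0, 1], so the
    # counts up to time t are binomial thinnings of the totals.
    return x0 + rng.binomial(births, t) - rng.binomial(deaths, t)

rng = np.random.default_rng(0)
midpoint = sample_bridge_point(3, 7, 0.5, 2.0, rng)
```

By construction the sample equals `x0` at `t = 0` and `x1` at `t = 1`, and every intermediate value is an integer, with no rounding or continuous relaxation involved.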
1. Identifiability and aggregation scale. Pure aggregate supervision is intrinsically ill-posed; performance can degrade as group size increases or between-unit heterogeneity decreases. Please provide quantitative sensitivity analyses showing performance vs. group size and vs. within-group heterogeneity (e.g., varying variance of unit distributions), and explain the practical limits (aggregation scales) where the proposed EM is reliable.
2. For the nucleotide-level gene expression modeling tas
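The ill-posedness raised in point 1 can be illustrated with a simple counting argument. It is offered here as a sketch and is not an analysis from the paper or the review. For a single gene, the number of ordered unit-level count vectors that sum to a given aggregate follows from stars and bars, and it grows quickly with group size. The function name `num_unit_configurations` is hypothetical.

```python
from math import comb

def num_unit_configurations(total, group_size):
    """Count ordered non-negative integer vectors of length `group_size`
    that sum to `total` (stars and bars)."""
    return comb(total + group_size - 1, group_size - 1)

# Number of unit-level explanations of an aggregate count of 20
# for a few hypothetical group sizes.
table = {k: num_unit_configurations(20, k) for k in (1, 2, 5)}
```

With one unit the aggregate determines the count exactly. With more units, the aggregate alone leaves many configurations open, so any method has to rely on structure learned across genes and groups to choose among them.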
Count Bridges constructs a birth–death bridge with a closed-form conditional distribution, enabling precise likelihood estimation for integer data. The EM-style extension for transcriptomics holds practical value: it provides an actionable workflow for deconvoluting single-cell distributions from aggregated observations derived from bulk or spatial sequencing. Experimentally, the method demonstrates robust improvements over relevant discrete baselines across synthetic and real-world data tasks p
- While being compared against CFM, DFM, and some biological baselines, a direct comparison with other recent count-specific or general discrete diffusion models (beyond Blackout Diffusion) on the proposed tasks could provide a more complete picture.
- The computational complexity of Count Bridges versus discrete diffusion or flow models is not reported; scalability to large transcriptomic datasets is unclear.
- While deconvolution results are promising, the paper provides limited biological c
Taxonomy
Topics: Single-cell and spatial transcriptomics · Gene expression and cancer classification · Machine Learning and Algorithms
