Non-ignorable fuzziness in granular counts: the case of RNA-seq data

Antonio Calcagnì; Arianna Consiglio; Przemyslaw Grzegorzewski; Corrado Mencar

arXiv:2604.00763·stat.ME·May 6, 2026

TL;DR

This paper studies how read-to-gene alignment ambiguity in RNA-seq data yields granular (fuzzy-valued) counts, shows that reporting mechanisms exploiting graded membership generically produce non-ignorable, coarsening-not-at-random data, and proposes a hierarchical model to account for this.

Contribution

It introduces a hierarchical model for fuzzy RNA-seq counts that accounts for non-ignorable data coarsening due to alignment ambiguity.

Findings

01

When fuzzy reporting exploits graded membership, ignorability fails generically, giving a coarsening-not-at-random structure.

02

A hierarchical model provides a tractable instance of this non-ignorable fuzzy-reporting structure for RNA-seq counts.

03

An application to real RNA-seq data illustrates the model.

Abstract

RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, leading to a coarsening-not-at-random structure. A hierarchical model is then introduced as a tractable instance of this construction and illustrated using RNA-seq data.
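To make the notion of a granular count concrete, the following is a minimal sketch and not the paper's model. It assumes a triangular membership function over integer counts and a Poisson distribution for the latent count, and it weights the latent probabilities by membership degree (Zadeh's probability of a fuzzy event). The function names and the Poisson choice are illustrative assumptions. The flat interval probability is included for comparison, since that is the weighting a set-valued, coarsened-at-random treatment would typically use.

```python
import math


def triangular_membership(y, lower, mode, upper):
    """Membership degree of integer count y in a triangular fuzzy count.

    Hypothetical helper: the triangular shape is an assumption, not taken
    from the paper.
    """
    if y == mode:
        return 1.0
    if lower < y < mode:
        return (y - lower) / (mode - lower)
    if mode < y < upper:
        return (upper - y) / (upper - mode)
    return 0.0


def poisson_pmf(y, rate):
    """Poisson probability mass at y (assumed latent count model)."""
    return math.exp(-rate) * rate**y / math.factorial(y)


def fuzzy_event_probability(lower, mode, upper, rate):
    """Probability of the fuzzy count: sum over y of membership(y) * pmf(y)."""
    support = range(max(lower, 0), upper + 1)
    return sum(
        triangular_membership(y, lower, mode, upper) * poisson_pmf(y, rate)
        for y in support
    )


def interval_probability(lower, upper, rate):
    """Probability that the latent count lies in [lower, upper], flat weights."""
    return sum(poisson_pmf(y, rate) for y in range(max(lower, 0), upper + 1))


if __name__ == "__main__":
    # A read count reported as "about 6, somewhere between 2 and 10".
    graded = fuzzy_event_probability(2, 6, 10, rate=5.0)
    flat = interval_probability(2, 10, rate=5.0)
    print(f"graded weight: {graded:.4f}, flat interval weight: {flat:.4f}")
```

In this sketch a crisp count (lower, mode, and upper all equal) reduces to the ordinary Poisson mass, while a graded observation always receives a weight no larger than its flat interval counterpart.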
