Non-ignorable fuzziness in granular counts: the case of RNA-seq data
Antonio Calcagnì, Arianna Consiglio, Przemyslaw Grzegorzewski, Corrado Mencar

TL;DR
This paper investigates how read-to-gene alignment ambiguity in RNA-seq data affects count reporting. It shows that fuzzy reporting based on graded membership is generically non-ignorable, yielding a coarsening-not-at-random structure, and proposes a hierarchical model to account for it.
Contribution
It introduces a hierarchical model for fuzzy RNA-seq counts that accounts for non-ignorable data coarsening due to alignment ambiguity.
Findings
Fuzzy reporting mechanisms that exploit graded membership generically lead to non-ignorable data coarsening (coarsening not at random).
The hierarchical model is a tractable way to account for non-ignorable fuzziness in RNA-seq counts.
An application to RNA-seq data illustrates the model.
Abstract
RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, leading to a coarsening-not-at-random structure. A hierarchical model is then introduced as a tractable instance of this construction and illustrated using RNA-seq data.
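To make the notion of a granular count concrete, here is a minimal sketch. It assumes a triangular membership function over the integers and a Poisson distribution for the latent count; neither choice is taken from the paper, and all function names are hypothetical. It evaluates the probability of a fuzzy observation as the membership-weighted sum of the latent count probabilities (Zadeh's probability of a fuzzy event). It does not model the reporting mechanism, which is the part the paper argues cannot be ignored.

```python
import math


def triangular_membership(y, left, mode, right):
    """Membership degree of integer count y in a triangular fuzzy count.

    Illustrative assumption: support (left, right), full membership at mode.
    """
    if y == mode:
        return 1.0
    if left < y < mode:
        return (y - left) / (mode - left)
    if mode < y < right:
        return (right - y) / (right - mode)
    return 0.0


def poisson_pmf(y, rate):
    """Poisson probability of the latent count y (assumed latent model)."""
    return math.exp(-rate) * rate**y / math.factorial(y)


def fuzzy_event_probability(left, mode, right, rate):
    """Probability of the fuzzy observation: sum over y of mu(y) * p(y)."""
    return sum(
        triangular_membership(y, left, mode, right) * poisson_pmf(y, rate)
        for y in range(max(0, left), right + 1)
    )


# A read count reported as "about 10, possibly between 8 and 14".
print(fuzzy_event_probability(left=8, mode=10, right=14, rate=9.0))
```

When `left == mode == right` the fuzzy count collapses to a crisp count and the function returns the ordinary Poisson probability, so precise observations are a special case.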
