A flexible model-based framework for robust estimation of mutational signatures
Ragnhild Laursen, Lasse Maretty, Asger Hobolth

TL;DR
This paper introduces a flexible, model-based framework for estimating mutational signatures in cancer. It focuses on di-nucleotide interaction models, which fit mutational data better than mono-nucleotide models and are more stable and biologically plausible than general tri-nucleotide models.
Contribution
The paper presents a novel framework, based on the EM algorithm and log-linear quasi-Poisson regression, for identifying and estimating mutational signatures with di-nucleotide interactions.
Findings
Di-nucleotide signatures are statistically stable.
They fit mutational data better than mono-nucleotide models.
They are more stable and biologically plausible than tri-nucleotide models.
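To make the difference in model complexity concrete, the sketch below counts free parameters for three log-linear parametrizations of a signature over the 96 mutation types (6 substitution types with 4 possible flanking bases on each side). It is an illustration under stated assumptions, not the authors' code: the names `design_matrix` and `free_parameters` are hypothetical, and the "di" model here is taken to mean main effects plus left-flank:substitution and substitution:right-flank interactions, which is only one possible di-nucleotide parametrization.

```python
import itertools
import numpy as np

BASES = "ACGT"                                        # flanking bases
SUBS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]     # substitution types

def one_hot(index, size):
    """Dummy coding with the first level as reference (size - 1 columns)."""
    v = np.zeros(size - 1)
    if index > 0:
        v[index - 1] = 1.0
    return v

def design_matrix(model):
    """Design matrix (96 rows) for a log-linear signature model.

    model: "mono" (main effects only), "di" (assumed: adds left:sub and
    sub:right interactions) or "tri" (all interactions, i.e. saturated).
    """
    rows = []
    sizes = (len(BASES), len(SUBS), len(BASES))
    for l, m, r in itertools.product(*(range(s) for s in sizes)):
        L, M, R = one_hot(l, sizes[0]), one_hot(m, sizes[1]), one_hot(r, sizes[2])
        cols = [np.ones(1), L, M, R]
        if model in ("di", "tri"):
            cols += [np.outer(L, M).ravel(), np.outer(M, R).ravel()]
        if model == "tri":
            cols += [np.outer(L, R).ravel(),
                     np.einsum("i,j,k->ijk", L, M, R).ravel()]
        rows.append(np.concatenate(cols))
    return np.array(rows)

def free_parameters(model):
    """Rank of the design matrix minus one (probabilities sum to 1)."""
    return int(np.linalg.matrix_rank(design_matrix(model))) - 1

for name in ("mono", "di", "tri"):
    print(name, free_parameters(name))
# mono 11
# di 41
# tri 95
```

Under these assumptions the di-nucleotide model sits between the two extremes, which is consistent with the findings above: enough parameters to fit the data better than the mono-nucleotide model, but far fewer than the general tri-nucleotide model.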
Abstract
Somatic mutations in cancer can be viewed as a mixture distribution of several mutational signatures, which can be inferred using non-negative matrix factorization (NMF). Mutational signatures have previously been parametrized using either simple mono-nucleotide interaction models or general tri-nucleotide interaction models. We describe a flexible and novel framework for identifying biologically plausible parametrizations of mutational signatures, and in particular for estimating di-nucleotide interaction models. The estimation procedure is based on the expectation-maximization (EM) algorithm and regression in the log-linear quasi-Poisson model. We show that di-nucleotide interaction signatures are statistically stable and sufficiently complex to fit the mutational patterns. Di-nucleotide interaction signatures often strike the right balance between appropriately fitting the data and…
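As a rough illustration of the NMF step mentioned in the abstract, the sketch below factorizes a count matrix with the standard multiplicative updates for the generalized Kullback-Leibler divergence, which can be derived as an EM algorithm under a Poisson model. It is a minimal sketch, not the paper's procedure: the signatures here are unconstrained (comparable to the general tri-nucleotide model), there is no log-linear regression step, and the names `poisson_nmf` and `gkl`, the matrix orientation (patients by mutation types), and the simulated data are all assumptions made for this example.

```python
import numpy as np

def gkl(V, WH, eps=1e-12):
    """Generalized Kullback-Leibler divergence between counts V and fit WH."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))

def poisson_nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize V (patients x mutation types) as W @ H.

    W holds exposures (patients x signatures); H holds signatures
    (signatures x mutation types), rescaled so each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    history = []
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1) + eps)            # update exposures
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)   # update signatures
        history.append(gkl(V, W @ H))
    scale = H.sum(axis=1)
    H = H / scale[:, None]   # each signature becomes a probability vector
    W = W * scale            # exposures absorb the scale, so W @ H is unchanged
    return W, H, history

# Simulated example: 20 patients, 96 mutation types, 3 signatures.
rng = np.random.default_rng(1)
H_true = rng.dirichlet(np.ones(96), size=3)
W_true = rng.gamma(shape=2.0, scale=200.0, size=(20, 3))
V = rng.poisson(W_true @ H_true).astype(float)

W, H, history = poisson_nmf(V, rank=3)
print(history[0], history[-1])   # the divergence should shrink over iterations
```

In a parametrized version, the signature update would be replaced by fitting a constrained (for example log-linear) model to each signature, which is where a mono-, di- or tri-nucleotide structure would enter.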
Taxonomy
Topics: Cancer Genomics and Diagnostics · Gene expression and cancer classification · Bioinformatics and Genomic Networks
