# BIISQ: Bayesian nonparametric discovery of Isoforms and Individual Specific Quantification

**Authors:** Derek Aguiar, Li-Fang Cheng, Bianca Dumitrascu, Fantine Mordelet, Athma A Pai, Barbara E Engelhardt

arXiv: 1703.08260 · 2018-05-09

## TL;DR

BIISQ is a Bayesian nonparametric method that accurately discovers and quantifies isoforms from RNA-seq data without needing known references, improving detection especially for complex and lowly expressed isoforms.

## Contribution

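The paper introduces a Bayesian nonparametric model, with a stochastic variational inference procedure for efficient posterior inference, that discovers isoforms directly from RNA-seq data using an isoform catalog shared across samples. In simulations it reports higher precision and recall than state-of-the-art isoform reconstruction methods.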

## Key findings

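- Higher precision and recall than state-of-the-art isoform reconstruction methods on short-read RNA-seq simulations and on short reads simulated from PacBio long-read sequencing
- Largest gains for isoforms with many exons and for lowly expressed isoforms (over 500% more transcripts correctly inferred at low coverage in simulations)
- In the GEUVADIS RNA-seq data, identification of genetic variants that regulate transcript ratios, with enrichment in functional elements related to mRNA splicing regulation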

## Abstract

Most human protein-coding genes can be transcribed into multiple possible distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and exist in tissue- and sample-specific frequencies. Here, we develop BIISQ, a Bayesian nonparametric model to discover Isoforms and Individual Specific Quantification from RNA-seq data. BIISQ does not require known isoform reference sequences but instead estimates isoform composition directly with an isoform catalog shared across samples. We develop a stochastic variational inference approach for efficient and robust posterior inference and demonstrate superior precision and recall for short-read RNA-seq simulations and simulated short-read data from PacBio long-read sequencing when compared to state-of-the-art isoform reconstruction methods. BIISQ achieves the most significant gains for longer (in terms of exons) isoforms and isoforms that are lowly expressed (over 500% more transcripts correctly inferred at low coverage in simulations). Finally, we estimate isoforms in the GEUVADIS RNA-seq data, identify genetic variants that regulate transcript ratios, and demonstrate variant enrichment in functional elements related to mRNA splicing regulation.
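To make the precision and recall comparison in the abstract concrete, the sketch below scores a set of inferred isoforms against a simulated ground truth. It is a minimal illustration under stated assumptions, not the paper's evaluation code: each isoform is represented as a set of exon indices, and an inferred isoform counts as correct only if it matches a true isoform exactly. The function name `precision_recall`, the exact-match criterion, and the toy data are all hypothetical.

```python
from typing import Iterable, Tuple


def precision_recall(
    inferred: Iterable[Iterable[int]],
    truth: Iterable[Iterable[int]],
) -> Tuple[float, float]:
    """Score inferred isoforms against true isoforms.

    Assumption: an isoform is the set of exon indices it contains, and a
    match requires the exon sets to be identical.
    """
    inferred_set = {frozenset(iso) for iso in inferred}
    truth_set = {frozenset(iso) for iso in truth}

    # True positives: inferred isoforms that exactly equal a true isoform.
    true_positives = len(inferred_set & truth_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(inferred_set) if inferred_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


# Toy gene with four exons (hypothetical data, for illustration only).
truth = [(1, 2, 3), (1, 3), (1, 2, 4)]
inferred = [(1, 2, 3), (1, 3), (2, 4), (1, 4)]

p, r = precision_recall(inferred, truth)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four inferred isoforms are correct (precision) and two of the three true isoforms are recovered (recall). A looser matching rule, for example one that tolerates differences at the terminal exons, would change both numbers, so the criterion should always be stated alongside the scores.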

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.08260/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1703.08260/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/1703.08260/full.md

---
Source: https://tomesphere.com/paper/1703.08260