Safety and Completeness in Flow Decompositions for RNA Assembly
Shahbaz Khan, Milla Kortelainen, Manuel Cáceres, Lucia Williams, Alexandru I. Tomescu

TL;DR
This paper introduces a local characterization and practical algorithm for identifying safe paths in flow decompositions of DAGs, significantly improving accuracy and efficiency in RNA transcript analysis.
Contribution
It provides the first local characterization of safe paths in flow decompositions, enabling a practical algorithm that outperforms existing methods in accuracy and computational efficiency.
Findings
The algorithm reports approximately 50% more coverage than trivial safe algorithms.
It maintains perfect precision while significantly increasing coverage.
The algorithm is 3-5 times faster and uses less space than the greedy-width heuristic.
Abstract
Decomposing a network flow into weighted paths has numerous applications. Some applications require any decomposition that is optimal w.r.t. some property such as number of paths, robustness, or length. Many bioinformatic applications require a specific decomposition where the paths correspond to some underlying data that generated the flow. For real inputs, no optimization criterion is guaranteed to uniquely identify the correct decomposition. Therefore, we propose to report safe paths, i.e., subpaths of at least one path in every flow decomposition. Ma, Zheng, and Kingsford [WABI 2020] addressed the existence of multiple optimal solutions in a probabilistic framework, i.e., non-identifiability. Later [RECOMB 2021], they gave a quadratic-time algorithm based on a global criterion for solving a problem called AND-Quant, which generalizes the problem of reporting whether a given path is…
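To make the notions of flow decomposition and safe path concrete, here is a minimal sketch on a toy DAG. The graph, the vertex names, and the helper functions `induced_flow`, `is_subpath`, and `appears_in` are illustrative assumptions, not taken from the paper. The sketch shows that one flow can admit two different decompositions, and that a path missing from one of them cannot be safe. It does not implement the paper's algorithm, and checking two decompositions can only rule safety out, never confirm it.

```python
from collections import defaultdict

def induced_flow(decomposition):
    """Sum the path weights on every edge used by a list of (path, weight) pairs."""
    flow = defaultdict(int)
    for path, weight in decomposition:
        for u, v in zip(path, path[1:]):
            flow[(u, v)] += weight
    return dict(flow)

def is_subpath(sub, path):
    """True if `sub` occurs as a contiguous run of vertices in `path`."""
    k = len(sub)
    return any(tuple(path[i:i + k]) == tuple(sub)
               for i in range(len(path) - k + 1))

def appears_in(sub, decomposition):
    """True if `sub` is a subpath of at least one path of the decomposition."""
    return any(is_subpath(sub, path) for path, _ in decomposition)

# Hypothetical toy flow: two unit routes merge at m and split again.
flow = {
    ("s", "a"): 1, ("s", "b"): 1,
    ("a", "m"): 1, ("b", "m"): 1,
    ("m", "c"): 1, ("m", "d"): 1,
    ("c", "t"): 1, ("d", "t"): 1,
}

# Two different decompositions of the same flow into weighted s-t paths.
d1 = [(("s", "a", "m", "c", "t"), 1), (("s", "b", "m", "d", "t"), 1)]
d2 = [(("s", "a", "m", "d", "t"), 1), (("s", "b", "m", "c", "t"), 1)]

both_valid = induced_flow(d1) == flow and induced_flow(d2) == flow

# a-m-c is used by d1 but by no path of d2, so it is not safe.
amc_in_d1 = appears_in(("a", "m", "c"), d1)
amc_in_d2 = appears_in(("a", "m", "c"), d2)

# s-a-m appears in both; that is consistent with safety but does not prove it.
sam_in_both = appears_in(("s", "a", "m"), d1) and appears_in(("s", "a", "m"), d2)
```

In this toy flow both decompositions use two paths of weight 1, so a criterion such as the minimum number of paths cannot tell them apart. Reporting only subpaths shared by every decomposition avoids having to choose between them.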
