Deep Learning and Random Forest-Based Augmentation of sRNA Expression Profiles

Jelena Fiosina; Maksims Fiosins; Stefan Bonn

arXiv:1909.11943 · q-bio.GN · September 27, 2019

TL;DR

This paper explores deep learning and random forest methods to automatically augment small RNA-seq data annotations, significantly improving accuracy and handling unseen datasets better than traditional text mining approaches.

Contribution

The paper formulates annotation augmentation as a classification problem and demonstrates high accuracy in tissue and sex prediction using deep learning and random forest methods.

Findings

1. Deep learning (DL) achieves up to 98% accuracy in tissue annotation.
2. DL outperforms random forest (RF) in classification tasks.
3. The approach improves annotation quality for unseen datasets.

Abstract

The lack of well-structured annotations in a growing amount of RNA expression data complicates data interoperability and reusability. Commonly used text mining methods extract annotations from existing unstructured data descriptions and often provide inaccurate output that requires manual curation. Automatic data-based augmentation (generation of annotations on the basis of expression data) can considerably improve annotation quality but has not been well studied. We formulate automatic augmentation of small RNA-seq expression data as a classification problem and investigate deep learning (DL) and random forest (RF) approaches to solve it. We generate tissue and sex annotations from small RNA-seq expression data for tissues and cell lines of Homo sapiens. We validate our approach on 4243 annotated small RNA-seq samples from the Small RNA Expression Atlas (SEA) database. The…
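
The classification framing in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the expression profiles are synthetic, the effect size is invented, and the model is a bagged ensemble of one-split decision trees (stumps) with random feature subsets, a heavily simplified stand-in for a random forest that uses only the Python standard library. All function and variable names are hypothetical.

```python
import random
from collections import Counter

rng = random.Random(0)
N_FEATURES = 12     # hypothetical number of sRNA features per sample
N_INFORMATIVE = 6   # assumption: the first 6 features differ between labels


def make_samples(n):
    """Return synthetic expression profiles X and sex labels y."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice(["female", "male"])
        row = [rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)]
        if label == "male":
            for f in range(N_INFORMATIVE):
                row[f] += 3.0  # invented effect size, for illustration only
        X.append(row)
        y.append(label)
    return X, y


def fit_stump(X, y, features):
    """Find the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = Counter(lab for row, lab in zip(X, y) if row[f] <= t)
            right = Counter(lab for row, lab in zip(X, y) if row[f] > t)
            l_lab, l_cnt = left.most_common(1)[0]
            r_lab, r_cnt = right.most_common(1)[0]
            errors = len(y) - l_cnt - r_cnt
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature, threshold, left label, right label)


def fit_forest(X, y, n_trees, n_sub):
    """Train each stump on a bootstrap sample and a random feature subset."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        features = rng.sample(range(len(X[0])), n_sub)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


X_train, y_train = make_samples(150)
X_test, y_test = make_samples(100)   # held-out samples, never seen in training
forest = fit_forest(X_train, y_train, n_trees=31, n_sub=3)
predictions = [predict(forest, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the problem setup: each sample is a vector of expression values, the annotation (here sex) is the class label, and the predicted label for a held-out sample serves as the generated annotation. A real analysis would more plausibly use an established random forest or neural network implementation with real expression counts.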
