Machine Learning Algorithm for Noise Reduction and Disease-Causing Gene Feature Extraction in Gene Sequencing Data
Weichen Si, Yihao Ou, Zhen Tian

TL;DR
This paper introduces DeepSeqDenoise, a machine learning method combining CNN and RNN for noise reduction and gene feature extraction, achieving high accuracy in identifying disease-causing genes from sequencing data.
Contribution
The study presents a novel deep learning approach that enhances noise reduction and gene prediction accuracy in gene sequencing analysis.
Findings
Signal-to-noise ratio improved by 9.4 dB
94.3% accuracy in disease gene prediction
Identified 57 new candidate disease-causing genes
Abstract
In this study, we propose a machine learning-based method for noise reduction and disease-causing gene feature extraction in gene sequencing data. The DeepSeqDenoise algorithm combines a CNN and an RNN to remove sequencing noise effectively, improving the signal-to-noise ratio by 9.4 dB. We screened 17 key features through feature engineering and constructed an integrated learning model that predicts disease-causing genes with 94.3% accuracy. In a cardiovascular disease cohort validation, we identified 57 new candidate disease-causing genes, and in clinical applications we detected 3 variants that had previously been missed. The method significantly outperforms existing tools and provides strong support for accurate diagnosis of genetic diseases.
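To put the reported 9.4 dB figure in context, the sketch below shows the standard decibel definition of signal-to-noise ratio and how a dB gain translates into a noise-power reduction factor. It is only an illustration: the abstract does not say how the authors measured SNR, and the function names and toy signals here are hypothetical.

```python
import math

def snr_db(clean, observed):
    """SNR in dB, treating (observed - clean) as the noise.

    Assumes a known clean reference signal; this is the textbook
    definition, not necessarily the metric used in the paper.
    """
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((o - c) ** 2 for c, o in zip(clean, observed)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)

def db_to_power_ratio(db):
    """Convert a dB difference into a linear power ratio."""
    return 10 ** (db / 10)

# Toy example (made-up values): a denoiser shrinks each error from 0.1 to 0.01.
clean = [1.0, -1.0, 1.0, -1.0]
noisy = [1.1, -0.9, 1.1, -0.9]
denoised = [1.01, -0.99, 1.01, -0.99]

improvement = snr_db(clean, denoised) - snr_db(clean, noisy)
print(f"SNR improvement on toy data: {improvement:.1f} dB")

# At fixed signal power, a 9.4 dB gain means the residual noise power
# is lower by a factor of 10 ** 0.94.
print(f"9.4 dB as a power ratio: {db_to_power_ratio(9.4):.2f}")
```

Under this definition, a 9.4 dB improvement means the residual noise power after denoising is roughly one ninth of what it was before, provided the signal power is unchanged.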
Taxonomy
Topics: Gene expression and cancer classification · Genetics, Bioinformatics, and Biomedical Research · Machine Learning in Bioinformatics
