ADRS-CNet: An adaptive dimensionality reduction selection and classification network for DNA storage clustering algorithms
Bowen Liu, Jiankun Li

TL;DR
This paper introduces ADRS-CNet, a neural network that adaptively selects the best dimensionality reduction technique for DNA sequence clustering, improving accuracy and mitigating high-dimensional data challenges.
Contribution
The paper presents a novel adaptive model that chooses optimal dimensionality reduction methods for DNA storage data, enhancing clustering performance over traditional fixed techniques.
Findings
Superior classification accuracy of the proposed model.
Significant improvement in clustering results.
Effective mitigation of the curse of dimensionality.
Abstract
DNA storage technology offers new possibilities for addressing massive data storage due to its high storage density, long-term preservation, low maintenance cost, and compact size. To improve the reliability of stored information, base errors and missing sequences must be dealt with. Currently, sequenced reads are clustered and compared to recover as much of the original sequence information as possible. Nonetheless, extracting DNA sequences of different lengths as features leads to the curse of dimensionality, which must be overcome. To address this, techniques such as PCA, UMAP, and t-SNE are commonly employed to project high-dimensional features into a low-dimensional space. Considering that these methods exhibit varying effectiveness in dimensionality reduction when dealing with different datasets, this paper proposes training a multilayer…
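The projection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: the k-mer count featurization, the helper names `kmer_counts` and `pca_project`, and the toy reads are all assumptions made for illustration. It shows how reads of different lengths can map to a fixed but wide feature vector (4**k columns) and how PCA, computed here with a plain SVD, projects those features to two dimensions.

```python
from itertools import product

import numpy as np


def kmer_counts(seq, k=3):
    """Return the count vector of all 4**k k-mers over the alphabet ACGT.

    Hypothetical featurization chosen for illustration only.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    vec = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing non-ACGT symbols
            vec[j] += 1
    return vec


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)  # center each feature column
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


# Toy reads of two different lengths (made-up data, not from the paper).
reads = [
    "ACGTACGTAC", "ACGTACGTTC", "ACGAACGTAC",
    "TTGGCCAATTGG", "TTGGCCAATTGC", "TTGGCAAATTGG",
]
X = np.array([kmer_counts(r, k=3) for r in reads])
Z = pca_project(X, n_components=2)
print(X.shape, Z.shape)  # (6, 64) (6, 2)
```

The feature width grows as 4**k, so even moderate k gives thousands of columns, which is one way the dimensionality problem described above can arise. UMAP or t-SNE could be substituted for `pca_project` using third-party libraries; which method works best may depend on the dataset, which is the selection problem the paper addresses.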
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Gene expression and cancer classification
Methods: Balanced Selection · Principal Components Analysis
