SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model
Yikang Zhang, Xiaomin Chu, Yelu Jiang, Hongjie Wu, Lijun Quan

TL;DR
SemanticCAP leverages a gene language model to incorporate contextual sequence information, significantly improving chromatin accessibility prediction accuracy over existing methods.
Contribution
SemanticCAP introduces a feature fusion approach that combines representations from a gene language model with a chromatin accessibility model, improving prediction performance.
Findings
Outperforms existing models on public benchmarks.
Effective integration of gene sequence context improves predictions.
Demonstrates the importance of contextual information in genomic modeling.
Abstract
A large number of inorganic and organic compounds are able to bind DNA and form complexes, among which drug-related molecules are important. Changes in chromatin accessibility not only directly affect drug-DNA interactions, but also promote or inhibit the expression of critical genes associated with drug resistance by affecting the DNA-binding capacity of transcription factors (TFs) and transcriptional regulators. However, biological experimental techniques for measuring chromatin accessibility are expensive and time-consuming. In recent years, several kinds of computational methods have been proposed to identify accessible regions of the genome. Existing computational models mostly ignore the contextual information of bases in gene sequences. To address these issues, we propose a new solution named SemanticCAP. It introduces a gene language model which models the context of gene sequences, thus being able to provide an effective…
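As a rough illustration of what "contextual information of bases" can mean in practice, the sketch below builds (context, target) pairs from a DNA string: each base is held out and paired with its flanking bases, which is the kind of input a sequence language model could learn from. This is not SemanticCAP's implementation; the function names, the `flank` parameter, and the held-out-centre setup are all assumptions made for illustration.

```python
from typing import List, Tuple

BASES = "ACGT"


def one_hot(base: str) -> List[int]:
    """One-hot encode a base; anything outside ACGT (e.g. N) maps to all zeros."""
    return [1 if base == b else 0 for b in BASES]


def context_pairs(seq: str, flank: int = 3) -> List[Tuple[str, str]]:
    """Return (context, target) pairs for every position with full flanks.

    The context is the `flank` bases to the left plus the `flank` bases to
    the right of a position; the target is the held-out centre base.
    Hypothetical helper, not taken from the paper.
    """
    seq = seq.upper()
    pairs = []
    for i in range(flank, len(seq) - flank):
        left = seq[i - flank:i]
        right = seq[i + 1:i + 1 + flank]
        pairs.append((left + right, seq[i]))
    return pairs


pairs = context_pairs("ACGTTGCA", flank=2)
encoded_first_context = [one_hot(b) for b in pairs[0][0]]
```

A model trained on such pairs sees each base only through its neighbours, so its learned representations carry context that a per-base encoding alone does not.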
Taxonomy
Topics: RNA and protein synthesis mechanisms · Genomics and Chromatin Dynamics · Machine Learning in Bioinformatics
