Predicting Distance matrix with large language models
Jiaxing Yang

TL;DR
This paper demonstrates that large pretrained RNA language models combined with transformers can accurately predict RNA base distance maps using only primary sequence data, aiding structural understanding.
Contribution
The study introduces a novel approach leveraging pretrained RNA language models and transformers to predict RNA distance maps solely from sequence data.
Findings
Accurate RNA base distance predictions achieved from sequence alone.
Large pretrained models effectively capture structural information.
Method improves RNA structural modeling accuracy.
Abstract
Structural prediction has long been considered critical in RNA research, especially since the success of AlphaFold2 in protein studies drew significant attention to the field. Recent advances in machine learning and data accumulation have effectively addressed many biological tasks, particularly in protein-related research, but RNA structure prediction remains a significant challenge due to data limitations. Obtaining RNA structural data is difficult because traditional methods such as nuclear magnetic resonance spectroscopy, X-ray crystallography, and electron microscopy are expensive and time-consuming. Although several RNA 3D structure prediction methods have been proposed, their accuracy is still limited. Predicting RNA structural information at another level, such as distance maps, therefore remains highly valuable. Distance maps provide a simplified representation of…
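To make the prediction target concrete, the following is a minimal sketch of what a base distance map is: a symmetric L x L matrix whose entry (i, j) is the distance between base i and base j. It is not the paper's code. The function name `distance_map`, the choice of one 3D point per base, and the toy coordinates are all assumptions made for illustration; the paper's model predicts such a matrix from sequence alone, whereas this sketch computes it from known coordinates, as one would when building training labels.

```python
import math


def distance_map(coords):
    """Return a symmetric L x L matrix of Euclidean distances.

    coords: list of (x, y, z) tuples, one representative point per base
    (which atom represents a base is an assumption left to the caller).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])  # Euclidean distance
            dmap[i][j] = d
            dmap[j][i] = d  # the map is symmetric by construction
    return dmap


# Hypothetical coordinates (in angstroms) for a 3-base toy example.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
toy_map = distance_map(toy_coords)
```

The diagonal is zero and the matrix is symmetric, so a model only needs to predict the upper triangle; the map is also invariant to rotation and translation of the structure, which is part of what makes it a simpler target than full 3D coordinates.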
Taxonomy
Topics: Text and Document Classification Technologies
Methods: Softmax · Attention Is All You Need
