Leveraging Transformer Models to Capture Multi-Scale Dynamics in Biomolecules by nano-GPT
Wenqi Zeng, Lu Zhang, Yuan Yao

TL;DR
This paper introduces nano-GPT, a transformer-based deep learning model that infers long-term biomolecular dynamics from short simulations. It targets a limitation of earlier language-model approaches such as LSTMs, which are constrained to low-dimensional reaction coordinates.
Contribution
nano-GPT is a novel transformer architecture tailored for modeling complex biomolecular dynamics, employing a two-pass training method to reduce errors and improve long-term predictions.
Findings
Successfully modeled long-timescale dynamics in multiple biomolecular systems
Outperformed earlier sequence models such as LSTMs in capturing complex transitions
Demonstrated ability to interpret biomolecular processes through attention mechanisms
Abstract
Long-term biomolecular dynamics are critical for understanding key evolutionary transformations in molecular systems. However, capturing these processes requires extended simulation timescales that often exceed the practical limits of conventional models. To address this, shorter simulations, initialized with diverse perturbations, are commonly used to sample phase space and explore a wide range of behaviors. Recent advances have leveraged language models to infer long-term behavior from short trajectories, but methods such as long short-term memory (LSTM) networks are constrained to low-dimensional reaction coordinates, limiting their applicability to complex systems. In this work, we present nano-GPT, a novel deep learning model inspired by the GPT architecture, specifically designed to capture long-term dynamics in molecular systems with fine-grained conformational states and complex…
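To make the "language model on short trajectories" idea concrete, here is a minimal sketch of how a set of short simulations could be turned into next-token training examples. It assumes each trajectory has already been discretized into integer conformational-state labels. The function name `make_next_token_pairs` and the parameter `context_len` are hypothetical and used only for illustration. This is not the authors' pipeline and says nothing about their two-pass training method.

```python
from typing import List, Tuple


def make_next_token_pairs(
    trajectories: List[List[int]], context_len: int
) -> List[Tuple[List[int], int]]:
    """Build (context, next_state) pairs from short state-label trajectories.

    Assumption: each trajectory is a list of integer state labels, one per
    saved simulation frame. Windows never cross trajectory boundaries,
    because independent short simulations are not contiguous in time.
    """
    pairs = []
    for traj in trajectories:
        # Slide a fixed-length window; the label right after it is the target.
        for t in range(context_len, len(traj)):
            pairs.append((traj[t - context_len:t], traj[t]))
    return pairs


# Two toy trajectories over three hypothetical states (0, 1, 2).
toy_trajectories = [[0, 0, 1, 1, 2], [2, 2, 1]]
toy_pairs = make_next_token_pairs(toy_trajectories, context_len=3)
for context, target in toy_pairs:
    print(context, "->", target)
```

The second toy trajectory is only as long as the context window, so it yields no pairs. In practice the context length has to be chosen with the typical length of the short simulations in mind.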
