Chaining thoughts and LLMs to learn DNA structural biophysics
Tyler D. Ross, Ashwin Gopinath

TL;DR
This paper demonstrates that fine-tuning a large language model like ChatGPT 3.5-turbo with chain-of-thought techniques and task chaining enhances its ability to understand and design DNA structures, advancing AI's role in biophysical research.
Contribution
It introduces a method to adapt a general-purpose LLM for DNA biophysics by fine-tuning with chain-of-thought and subtask chaining techniques.
Findings
Fine-tuned models show improved DNA analysis capabilities.
Chaining models enhances sequence design accuracy.
Chain-of-thought responses aid in biophysical reasoning.
Abstract
The future development of an AI scientist, a tool that is capable of integrating a variety of experimental data and generating testable hypotheses, holds immense potential. So far, bespoke machine learning models have been created to specialize in singular scientific tasks, but otherwise lack the flexibility of a general purpose model. Here, we show that a general purpose large language model, chatGPT 3.5-turbo, can be fine-tuned to learn the structural biophysics of DNA. We find that both fine-tuning models to return chain-of-thought responses and chaining together models fine-tuned for subtasks have an enhanced ability to analyze and design DNA sequences and their structures.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRNA and protein synthesis mechanisms · Genetics, Bioinformatics, and Biomedical Research
