Chaining thoughts and LLMs to learn DNA structural biophysics

Tyler D. Ross; Ashwin Gopinath

arXiv:2403.01332·q-bio.QM·March 5, 2024·1 cites

Chaining thoughts and LLMs to learn DNA structural biophysics

Tyler D. Ross, Ashwin Gopinath

PDF

Open Access 1 Repo

TL;DR

This paper demonstrates that fine-tuning a large language model like ChatGPT 3.5-turbo with chain-of-thought techniques and task chaining enhances its ability to understand and design DNA structures, advancing AI's role in biophysical research.

Contribution

It introduces a method to adapt a general-purpose LLM for DNA biophysics by fine-tuning with chain-of-thought and subtask chaining techniques.

Findings

01

Fine-tuned models show improved DNA analysis capabilities.

02

Chaining models enhances sequence design accuracy.

03

Chain-of-thought responses aid in biophysical reasoning.

Abstract

The future development of an AI scientist, a tool that is capable of integrating a variety of experimental data and generating testable hypotheses, holds immense potential. So far, bespoke machine learning models have been created to specialize in singular scientific tasks, but otherwise lack the flexibility of a general purpose model. Here, we show that a general purpose large language model, chatGPT 3.5-turbo, can be fine-tuned to learn the structural biophysics of DNA. We find that both fine-tuning models to return chain-of-thought responses and chaining together models fine-tuned for subtasks have an enhanced ability to analyze and design DNA sequences and their structures.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

tdross/dna-llm
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsRNA and protein synthesis mechanisms · Genetics, Bioinformatics, and Biomedical Research