# ConoGPT: Fine-Tuning a Protein Language Model by Incorporating Disulfide Bond Information for Conotoxin Sequence Generation

**Authors:** Guohui Zhao, Cheng Ge, Wenzheng Han, Rilei Yu, Hao Liu

PMC · DOI: 10.3390/toxins17020093 · Toxins · 2025-02-17

## TL;DR

ConoGPT fine-tunes the ProtGPT2 protein language model with disulfide bond information to generate conotoxin sequences, with the aim of supporting the design of these peptides for drug development.

## Contribution

The paper integrates disulfide bond information into a fine-tuned protein language model (ProtGPT2) to improve conotoxin sequence generation.

## Key findings

- ConoGPT-generated sequences show high consistency with authentic conotoxins in physicochemical properties.
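- Compared with models that lack disulfide bond information, ConoGPT performs better at generating sequences with ordered structures.
- In molecular docking, the majority of the filtered sequences show significant binding affinities to nicotinic acetylcholine receptor (nAChR) targets.
- Molecular dynamics simulations of selected sequences confirm their dynamic stability in complex with their respective targets.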

## Abstract

Conotoxins are a class of peptide toxins secreted by marine mollusks of the Conus genus, characterized by their unique mechanism of action and significant biological activity, making them highly valuable for drug development. However, traditional methods of acquiring conotoxins, such as in vivo extraction or chemical synthesis, face challenges of high costs, long cycles, and limited exploration of sequence diversity. To address these issues, we propose the ConoGPT model, a conotoxin sequence generation model that fine-tunes the ProtGPT2 model by incorporating disulfide bond information. Experimental results demonstrate that sequences generated by ConoGPT exhibit high consistency with authentic conotoxins in physicochemical properties and show considerable potential for generating novel conotoxins. Furthermore, compared to models without disulfide bond information, ConoGPT outperforms in terms of generating sequences with ordered structures. The majority of the filtered sequences were shown to possess significant binding affinities to nicotinic acetylcholine receptor (nAChR) targets based on molecular docking. Molecular dynamics simulations of the selected sequences further confirmed the dynamic stability of the generated sequences in complex with their respective targets. This study not only provides a new technological approach for conotoxin design but also offers a novel strategy for generating functional peptides.
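The abstract does not say how disulfide bond information is encoded for fine-tuning, so the following is only a minimal sketch of one possible approach, not the authors' method. It assumes a hypothetical prefix tag of the form `<ss:a-b,c-d>` listing the bonded cysteine positions (1-based residue indices) ahead of the sequence. The function names, the tag format, and the 12-residue example sequence are all illustrative assumptions.

```python
from typing import List, Tuple


def cysteine_positions(seq: str) -> List[int]:
    """Return the 1-based positions of cysteine (C) residues in a sequence."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "C"]


def annotate_with_disulfides(seq: str, pairs: List[Tuple[int, int]]) -> str:
    """Prefix a sequence with a hypothetical disulfide-connectivity tag.

    `pairs` holds 1-based residue positions of bonded cysteines.
    The "<ss:...>" tag format is an assumption made for illustration only.
    """
    cys = set(cysteine_positions(seq))
    used = set()
    for a, b in pairs:
        # Both ends of a bond must be distinct cysteine residues.
        if a == b or a not in cys or b not in cys:
            raise ValueError(f"pair ({a}, {b}) does not link two cysteines")
        # A cysteine can take part in at most one disulfide bond.
        if a in used or b in used:
            raise ValueError(f"cysteine reused in pair ({a}, {b})")
        used.update((a, b))
    # Normalize so the same connectivity always yields the same tag.
    normalized = sorted((min(a, b), max(a, b)) for a, b in pairs)
    tag = ",".join(f"{a}-{b}" for a, b in normalized)
    return f"<ss:{tag}>{seq}"


# Illustrative 12-residue peptide with four cysteines and two bonds.
example = annotate_with_disulfides("GCCSDPRCAWRC", [(3, 12), (2, 8)])
print(example)
```

Lines annotated this way could form a plain-text corpus for causal language model fine-tuning. Whether a tag like this is treated as ordinary text or as special tokens would depend on the tokenizer setup, which the summary does not describe.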

## Linked entities

- **Species:** Conus (taxon 6490)

## Full-text entities

- **Genes:** CHRNA4 (cholinergic receptor nicotinic alpha 4 subunit) [NCBI Gene 1137] {aka BFNC, EBN, EBN1, NACHR, NACHRA4, NACRA4}
- **Chemicals:** Disulfide (MESH:D004220)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11860916/full.md

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11860916/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC11860916/full.md

---
Source: https://tomesphere.com/paper/PMC11860916