# Enhancing privacy in biosecurity with watermarked protein design

**Authors:** Yanshuo Chen, Zhengmian Hu, Yihan Wu, Ruibo Chen, Yongrui Jin, Marcus Zhan, Chengjin Xie, Wei Chen, Heng Huang

PMC · DOI: 10.1093/bioinformatics/btaf141 · Bioinformatics · 2025-05-02

## TL;DR

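This paper introduces a framework for adding watermarks to protein sequences designed by autoregressive deep-learning models, so that designs remain traceable for biosecurity while verification happens locally and the sequences stay private.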

## Contribution

A general framework for watermarking protein sequences designed by autoregressive deep-learning models, providing traceability while preserving the privacy of the designed sequences.

## Key findings

- The proposed watermarking framework achieves robust traceability while maintaining sequence privacy.
- The method significantly improves watermark detection efficiency compared to existing techniques.
- The framework allows researchers to embed their identity into protein sequences for intellectual property claims.

## Abstract

Biosecurity concerns have grown as deep-learning-based protein design has rapidly become more capable in recent years. Current regulatory procedures for DNA synthesis focus on biosecurity but ignore data privacy.

We propose a general framework for adding watermarks to protein sequences designed by various autoregressive deep-learning models. Like current regulatory procedures, watermarks ensure robust traceability for biosecurity, but they also keep designed sequences private because verification is performed locally. Benchmarked against other watermarking techniques, our method substantially increases watermark detection efficiency, making it more practical in real-world scenarios. Moreover, it gives researchers a convenient way to claim their intellectual property, since the designer’s information can be embedded into the sequence with our framework.
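To make the idea of a sampling-time watermark with local verification concrete, here is a minimal sketch in Python. It is not the paper's algorithm or the ProteinWatermark API: it swaps in a generic keyed "exponential race" (Gumbel-max style) sampler, and `toy_model`, `keyed_uniform`, `generate`, `detect`, and the keys `b"lab-A"` / `b"lab-B"` are all hypothetical names chosen for illustration. The toy model is a fixed distribution standing in for a real autoregressive protein model.

```python
import hashlib
import hmac
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_model(prefix):
    """Hypothetical stand-in for an autoregressive protein model.

    Returns a next-residue distribution; here it ignores the prefix.
    """
    weights = {aa: 1.0 for aa in AMINO_ACIDS}
    for aa in "AGL":  # assumption: mildly favor a few residues
        weights[aa] = 2.0
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


def keyed_uniform(secret, position, prev, aa):
    """Pseudo-random number in (0, 1) derived from the secret key."""
    msg = f"{position}:{prev}:{aa}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    bits = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (bits + 0.5) / 2**52


def generate(secret, length):
    """Sample a sequence whose random choices are driven by the key."""
    seq = []
    for pos in range(length):
        prev = seq[-1] if seq else "^"
        probs = toy_model(seq)
        # Exponential race: argmax of log(r)/p selects residue aa with
        # probability p, so the model's distribution is preserved when
        # averaged over keys.
        best = max(
            AMINO_ACIDS,
            key=lambda aa: math.log(keyed_uniform(secret, pos, prev, aa)) / probs[aa],
        )
        seq.append(best)
    return "".join(seq)


def detect(secret, seq):
    """Z-score computed from the key and the sequence only (no model)."""
    score = 0.0
    for pos, aa in enumerate(seq):
        prev = seq[pos - 1] if pos else "^"
        r = keyed_uniform(secret, pos, prev, aa)
        score += -math.log(1.0 - r)  # Exp(1) distributed without a watermark
    n = len(seq)
    return (score - n) / math.sqrt(n)


if __name__ == "__main__":
    designed = generate(b"lab-A", 100)
    print(designed)
    print("z with designer key:", detect(b"lab-A", designed))
    print("z with another key: ", detect(b"lab-B", designed))
```

In this sketch the detector needs only the secret key and the sequence, which is one way verification could run locally without sharing the design with a third party. Deriving the key from a designer identifier is likewise an assumption about how ownership information might be tied to a sequence. Because the seed includes the residue position, this toy version would not survive insertions or deletions.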

The implementation of the protein watermark framework is freely available to noncommercial users at https://github.com/poseidonchan/ProteinWatermark.

## Full-text entities

- **Genes:** DBI (diazepam binding inhibitor, acyl-CoA binding protein) [NCBI Gene 1622] {aka ACBD1, ACBP, CCK-RP, EP}
- **Chemicals:** glycine (MESH:D005998), amino acid (MESH:D000596), alanine (MESH:D000409)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12279293/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12279293/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/PMC12279293/full.md

---
Source: https://tomesphere.com/paper/PMC12279293