Diffusion Decoding for Peptide De Novo Sequencing

Chi-en Amy Tai; Alexander Wong

arXiv:2507.10955·cs.LG·July 16, 2025

TL;DR

This paper explores diffusion decoders for peptide de novo sequencing and reports that they can significantly improve amino acid recall over traditional autoregressive models, although not every technique tested (for example, knapsack beam search) improved performance.

Contribution

The paper adapts diffusion decoders to peptide sequencing and shows their potential to improve sensitivity and accuracy over existing autoregressive methods.

Findings

1. Diffusion decoders can enhance amino acid recall in peptide sequencing.

2. Knapsack beam search did not improve performance metrics.

3. The best diffusion decoder with DINOISER loss significantly outperformed the baseline.
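
To make the amino acid recall metric in the findings concrete, here is a minimal sketch. It assumes a simplified position-wise exact match between predicted and reference residues. Evaluations in this literature commonly match residues by mass within a tolerance instead, so the function names and matching rule below are illustrative assumptions, not the paper's evaluation code.

```python
def count_matches(predicted: str, reference: str) -> int:
    """Count positions where predicted and reference residues agree.

    Assumption: simplified position-wise exact match (no mass tolerance).
    """
    return sum(p == r for p, r in zip(predicted, reference))


def amino_acid_recall(predicted: str, reference: str) -> float:
    """Matched residues divided by the number of reference residues."""
    if not reference:
        raise ValueError("reference peptide must be non-empty")
    return count_matches(predicted, reference) / len(reference)


def amino_acid_precision(predicted: str, reference: str) -> float:
    """Matched residues divided by the number of predicted residues."""
    if not predicted:
        return 0.0
    return count_matches(predicted, reference) / len(predicted)
```

For example, `amino_acid_recall("PEPSIDE", "PEPTIDE")` counts six matching positions out of seven reference residues. A truncated prediction such as `"PEP"` keeps full precision but loses recall, which is why the two metrics are usually reported together.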

Abstract

Peptide de novo sequencing is a method used to reconstruct amino acid sequences from tandem mass spectrometry data without relying on existing protein sequence databases. Traditional deep learning approaches, such as Casanovo, mainly utilize autoregressive decoders and predict amino acids sequentially. Consequently, they encounter cascading errors and fail to leverage high-confidence regions effectively. To address these issues, this paper investigates diffusion decoders adapted for the discrete data domain. These decoders provide a different approach, allowing sequence generation to start from any peptide segment, thereby enhancing prediction accuracy. We experiment with three different diffusion decoder designs, knapsack beam search, and various loss functions. We find that knapsack beam search did not improve performance metrics and simply replacing the transformer decoder with a…
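
The idea that generation can begin at any high-confidence position, rather than strictly left to right, can be illustrated with a toy sketch. This is a mask-predict-style simplification, not the paper's diffusion decoder: `iterative_unmask`, `toy_predict`, and the confidence values are all hypothetical stand-ins for a learned denoising model.

```python
from typing import Callable, List, Tuple

MASK = "?"
Proposal = Tuple[str, float]  # (residue, confidence)


def iterative_unmask(
    length: int,
    predict: Callable[[List[str]], List[Proposal]],
    per_step: int = 2,
) -> Tuple[str, List[int]]:
    """Fill a fully masked sequence, committing the most confident positions first."""
    if per_step < 1:
        raise ValueError("per_step must be at least 1")
    seq = [MASK] * length
    order: List[int] = []  # positions in the order they were committed
    while MASK in seq:
        proposals = predict(seq)
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        # Commit the still-masked positions with the highest confidence.
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:per_step]:
            seq[i] = proposals[i][0]
            order.append(i)
    return "".join(seq), order


def toy_predict(seq: List[str]) -> List[Proposal]:
    """Hypothetical model: fixed proposals that ignore the current sequence."""
    return [("P", 0.4), ("E", 0.9), ("P", 0.8), ("T", 0.3),
            ("I", 0.7), ("D", 0.95), ("E", 0.5)]


peptide, order = iterative_unmask(7, toy_predict)
print(peptide, order)  # PEPTIDE [5, 1, 2, 4, 6, 0, 3]
```

The commit order starts in the middle of the peptide, where the toy confidences are highest, and the low-confidence positions are filled last. An autoregressive decoder would instead have to commit position 0 first, whatever its confidence.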

Taxonomy

Topics: Chemical Synthesis and Analysis

Methods: Diffusion