Protein FID: Improved Evaluation of Protein Structure Generative Models
Felix Faltings, Hannes Stark, Tommi Jaakkola, Regina Barzilay

TL;DR
This paper introduces Protein FID, a new metric for evaluating protein structure generative models based on distributional similarity in a meaningful latent space, addressing limitations of existing metrics.
Contribution
The paper proposes Protein FID, a Fréchet distance computed in a semantically meaningful latent space, to supplement existing evaluations with a measure of how well generated protein structures match the distribution of the training data.
Findings
Protein FID correlates with optimal transport distances.
It recovers FoldSeek clusters and the CATH hierarchy.
Current protein structure generative models fall short of modeling the distribution of PDB proteins when evaluated with Protein FID.
Abstract
Protein structure generative models have seen a recent surge of interest, but meaningfully evaluating them computationally is an active area of research. While current metrics have driven useful progress, they do not capture how well models sample the design space represented by the training data. We argue for a protein Fréchet Inception Distance (FID) metric to supplement current evaluations with a measure of distributional similarity in a semantically meaningful latent space. Our FID behaves desirably under protein structure perturbations and correctly recapitulates similarities between protein samples: it correlates with optimal transport distances and recovers FoldSeek clusters and the CATH hierarchy. Evaluating current protein structure generative models with FID shows that they fall short of modeling the distribution of PDB proteins.
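For readers unfamiliar with FID, the following is a minimal sketch of the standard Fréchet distance between two sets of embeddings, each summarized by a Gaussian fit: ||mu_a - mu_b||^2 + Tr(C_a) + Tr(C_b) - 2 Tr((C_a C_b)^(1/2)). It is not the authors' implementation. The function name `frechet_distance` and the random arrays standing in for protein embeddings are assumptions made for illustration; this summary does not specify the encoder that produces the latent space.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussian fits of two embedding sets.

    feats_a, feats_b: arrays of shape (n_samples, dim). In a protein FID
    setting these would be latent embeddings of reference and generated
    structures; here they are hypothetical placeholders.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr((C_a C_b)^(1/2)) equals the sum of square roots of the eigenvalues
    # of C_a C_b, which are real and non-negative for PSD covariances.
    # Numerical noise can add tiny imaginary or negative parts, so clip them.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


# Illustrative usage with synthetic embeddings (not real protein data).
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 4))
shifted = reference + 2.0  # same covariance, mean moved by 2 in each dimension

print(frechet_distance(reference, reference))
print(frechet_distance(reference, shifted))
```

In this toy case the two sets share a covariance, so the distance reduces to the squared mean shift. With real data, both the mean and covariance terms contribute, and the estimate depends on having enough samples relative to the embedding dimension.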
