seqme: a Python library for evaluating biological sequence design
Rasmus Møller-Larsen, Adam Izdebski, Jan Olszewski, Pankhil Gawade, Michal Kmicikiewicz, Wojciech Zarzecki, Ewa Szczurek

TL;DR
seqme is an open-source Python library that provides a comprehensive set of metrics for evaluating the performance of biological sequence design methods across various sequence types, enhancing reproducibility and analysis.
Contribution
It introduces a modular, extendable library with model-agnostic metrics for assessing biological sequence design, filling a gap in available software tools.
Findings
Supports a wide range of biological sequences: small molecules, DNA, ncRNA, mRNA, peptides, and proteins.
Includes multiple embedding and property models for evaluation.
Provides diagnostics and visualization tools for analysis.
Abstract
Recent advances in computational methods for designing biological sequences have sparked the development of metrics to evaluate these methods' performance in terms of the fidelity of the designed sequences to a target distribution and their attainment of desired properties. However, a single software library implementing these metrics was lacking. In this work, we introduce seqme, a modular and highly extendable open-source Python library containing model-agnostic metrics for evaluating computational methods for biological sequence design. seqme considers three groups of metrics: sequence-based, embedding-based, and property-based, and is applicable to a wide range of biological sequences: small molecules, DNA, ncRNA, mRNA, peptides and proteins. The library offers a number of embedding and property models for biological sequences, as well as diagnostics and visualization functions to…
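To make the metric groups named in the abstract more concrete, here is a minimal sketch in plain Python of two sequence-based metrics (uniqueness and novelty) and one property-based metric (mean GC content for DNA). These particular metrics and all function names (`uniqueness`, `novelty`, `mean_gc_content`) are illustrative assumptions; they are not taken from seqme's API, and the abstract does not state which metrics the library implements.

```python
from collections.abc import Iterable


def uniqueness(designed: Iterable[str]) -> float:
    """Sequence-based: fraction of designed sequences that are distinct."""
    seqs = list(designed)
    if not seqs:
        raise ValueError("need at least one designed sequence")
    return len(set(seqs)) / len(seqs)


def novelty(designed: Iterable[str], reference: Iterable[str]) -> float:
    """Sequence-based: fraction of designed sequences absent from the reference set."""
    seqs = list(designed)
    if not seqs:
        raise ValueError("need at least one designed sequence")
    ref = set(reference)
    return sum(s not in ref for s in seqs) / len(seqs)


def mean_gc_content(designed: Iterable[str]) -> float:
    """Property-based: average fraction of G/C bases per non-empty DNA sequence."""
    seqs = [s.upper() for s in designed if s]
    if not seqs:
        raise ValueError("need at least one non-empty sequence")
    return sum((s.count("G") + s.count("C")) / len(s) for s in seqs) / len(seqs)


# Hypothetical toy data, for illustration only.
designed = ["ACGT", "ACGT", "GGCC", "ATAT"]
reference = ["ACGT", "TTTT"]

print("uniqueness:", uniqueness(designed))
print("novelty:", novelty(designed, reference))
print("mean GC content:", mean_gc_content(designed))
```

Embedding-based metrics are omitted here because they need an embedding model to map sequences to vectors before two sets of sequences can be compared.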
Taxonomy
Topics: RNA and protein synthesis mechanisms · Protein Structure and Dynamics · Machine Learning in Bioinformatics
