Probabilistic annotation of protein sequences based on functional classifications
Emmanuel D. Levy (LMB), Christos A. Ouzounis, Walter R. Gilks, Benjamin Audit (Phys-ENS)

TL;DR
This paper introduces a probabilistic method for protein function annotation that directly maps sequences to functional classes using Bayesian approaches, improving accuracy over traditional similarity-based methods.
Contribution
It presents a novel inverse approach to protein annotation, utilizing Bayesian models and correspondence indicators for more accurate functional classification.
Findings
Outperforms simple BLAST match transfer in accuracy
Provides direct measures of annotation error rates
Validated with enzyme databases and specific case analyses
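To make the idea of a direct, probabilistic mapping from a sequence to a functional class concrete, here is a minimal sketch of Bayes' rule applied to the class labels of a query's similar sequences. It is not the paper's model: the function names, the `p_agree` parameter, the independence assumption between hits, and the class labels in the usage example are all hypothetical and chosen only for illustration. It shows how a posterior over classes also yields an estimated error rate for the chosen annotation.

```python
from collections import Counter


def class_posterior(hit_labels, class_priors, p_agree=0.8):
    """Posterior over functional classes given the labels of similar sequences.

    Toy assumption (not from the paper): each hit independently carries the
    query's true class with probability p_agree, and otherwise carries one of
    the remaining classes uniformly at random. Requires at least two classes.
    """
    classes = list(class_priors)
    p_other = (1.0 - p_agree) / (len(classes) - 1)
    counts = Counter(hit_labels)

    unnormalised = {}
    for c in classes:
        likelihood = 1.0
        for label, n in counts.items():
            likelihood *= (p_agree if label == c else p_other) ** n
        unnormalised[c] = class_priors[c] * likelihood

    total = sum(unnormalised.values())
    return {c: v / total for c, v in unnormalised.items()}


def annotate(hit_labels, class_priors, p_agree=0.8):
    """Return the most probable class and its estimated error probability."""
    posterior = class_posterior(hit_labels, class_priors, p_agree)
    best = max(posterior, key=posterior.get)
    return best, 1.0 - posterior[best]


# Hypothetical example: three classes with equal priors, three labelled hits.
priors = {"kinase": 1 / 3, "protease": 1 / 3, "lyase": 1 / 3}
hits = ["kinase", "kinase", "protease"]
best_class, error_estimate = annotate(hits, priors)
print(best_class, round(error_estimate, 3))
```

In contrast to transferring the label of the single best match, the posterior weighs all labelled hits together, and `1 - posterior[best]` gives a direct (model-dependent) estimate of how often such an annotation would be wrong.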
Abstract
BACKGROUND: One of the most evident achievements of bioinformatics is the development of methods that transfer biological knowledge from characterised proteins to uncharacterised sequences. This mode of protein function assignment is mostly based on the detection of sequence similarity and the premise that functional properties are conserved during evolution. Most automatic approaches developed to date rely on the identification of clusters of homologous proteins and the mapping of new proteins onto these clusters, which are expected to share functional characteristics. RESULTS: Here, we invert the logic of this process, by considering the mapping of sequences directly to a functional classification instead of mapping functions to a sequence clustering. In this mode, the starting point is a database of proteins labelled according to a functional classification scheme, and the…
