Probabilistic annotation of protein sequences based on functional classifications

Emmanuel D. Levy (LMB); Christos A. Ouzounis; Walter R. Gilks; Benjamin Audit (Phys-ENS)

arXiv:0709.4425·q-bio.QM·September 28, 2007·BMC Bioinform.

TL;DR

This paper introduces a probabilistic method for protein function annotation that directly maps sequences to functional classes using Bayesian approaches, improving accuracy over traditional similarity-based methods.

Contribution

It presents a novel inverse approach to protein annotation, utilizing Bayesian models and correspondence indicators for more accurate functional classification.

Findings

01 Outperforms simple BLAST match transfer in accuracy

02 Provides direct measures of annotation error rates

03 Validated with enzyme databases and specific case analyses

Abstract

BACKGROUND: One of the most evident achievements of bioinformatics is the development of methods that transfer biological knowledge from characterised proteins to uncharacterised sequences. This mode of protein function assignment is mostly based on the detection of sequence similarity and the premise that functional properties are conserved during evolution. Most automatic approaches developed to date rely on the identification of clusters of homologous proteins and the mapping of new proteins onto these clusters, which are expected to share functional characteristics.

RESULTS: Here, we invert the logic of this process by mapping sequences directly to a functional classification instead of mapping functions to a sequence clustering. In this mode, the starting point is a database of proteins labelled according to a functional classification scheme, and the…
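To make the idea of mapping a sequence directly to a functional class more concrete, the following is a minimal sketch of Bayes' rule applied to a labelled database. It is not the authors' model: the function name `class_posterior`, the use of the best hit's class label as the only evidence, the Laplace pseudocount `alpha`, and the toy class labels are all assumptions made for illustration. The quantity `1 - max(posterior)` is one simple way to read off an estimated annotation error rate.

```python
from collections import Counter


def class_posterior(pairs, observed, classes, alpha=1.0):
    """Posterior P(true class | observed best-hit class) via Bayes' rule.

    pairs    -- list of (true_class, best_hit_class) tuples taken from a
                labelled database (hypothetical training data)
    observed -- class label of the query's best hit
    classes  -- all functional classes under consideration
    alpha    -- Laplace pseudocount (an assumption, not from the paper)
    """
    prior_counts = Counter(true for true, _ in pairs)
    joint_counts = Counter(pairs)
    n, k = len(pairs), len(classes)

    unnormalised = {}
    for c in classes:
        # Smoothed prior P(c) and likelihood P(observed | c).
        prior = (prior_counts[c] + alpha) / (n + alpha * k)
        likelihood = (joint_counts[(c, observed)] + alpha) / (
            prior_counts[c] + alpha * k
        )
        unnormalised[c] = prior * likelihood

    total = sum(unnormalised.values())
    return {c: v / total for c, v in unnormalised.items()}


if __name__ == "__main__":
    # Toy labelled data: class names "A" and "B" are placeholders.
    toy_pairs = (
        [("A", "A")] * 8 + [("A", "B")] * 2
        + [("B", "B")] * 9 + [("B", "A")] * 1
    )
    post = class_posterior(toy_pairs, observed="A", classes=["A", "B"])
    best = max(post, key=post.get)
    print(best, post[best], "estimated error:", 1.0 - post[best])
```

In contrast to simply transferring the best hit's label, the posterior keeps the uncertainty explicit, so a confidence threshold can be applied before an annotation is accepted.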
