# Sequence-based prediction of function site and protein-ligand interaction by a functionally annotated domain profile database

**Authors:** Dengming Ming, Min Han, Xiongbo An

arXiv: 1701.08086 · 2017-01-30

## TL;DR

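This paper introduces a sequence-based method that uses fiDPD, a functionally annotated domain profile database, to predict protein functional sites (PFSs) and protein-ligand interactions (PLIs). On 13 CASP10/11 target proteins it reaches an MCC of 0.66 for PFS prediction and 80% recall for PLI prediction, which makes it a useful complement to existing tools for protein function annotation.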

## Contribution

The study presents a sequence-based approach to PLI and PFS prediction built on fiDPD, a function- and interaction-annotated domain profile database derived from SCOP with a hidden Markov model program. Its results also indicate that PLIs are conserved during protein evolution.

## Key findings

- Matthews correlation coefficient (MCC) of 0.66 for PFS prediction on 13 CASP10/11 target proteins
- 80% recall for PLI prediction on the same targets
- PLIs are conserved during protein evolution
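
The two metrics above are standard residue-level classification scores. As a minimal sketch of how they are computed, the Python snippet below derives both from a set of true and predicted site positions. The function names and the example residue numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def confusion_counts(true_sites, predicted_sites, sequence_length):
    """Count TP, FP, FN, TN over all residues of one protein."""
    true_sites = set(true_sites)
    predicted_sites = set(predicted_sites)
    tp = len(true_sites & predicted_sites)
    fp = len(predicted_sites - true_sites)
    fn = len(true_sites - predicted_sites)
    tn = sequence_length - tp - fp - fn
    return tp, fp, fn, tn


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def recall(tp, fn):
    """Fraction of true sites that were predicted; 0.0 if there are none."""
    return 0.0 if tp + fn == 0 else tp / (tp + fn)


# Hypothetical 100-residue protein (made-up positions, not paper data).
true_sites = {10, 11, 12, 45, 46}
predicted_sites = {10, 11, 12, 45, 60}
tp, fp, fn, tn = confusion_counts(true_sites, predicted_sites, 100)
example_mcc = mcc(tp, fp, fn, tn)
example_recall = recall(tp, fn)
```

In this toy case four of the five true sites are recovered, so recall is 4/5. MCC additionally penalizes the one false positive and accounts for the large number of true negatives, which is why it is often preferred when functional residues are a small minority of the sequence.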

## Abstract

Identifying protein functional sites (PFSs) and protein-ligand interactions (PLIs) is critically important for understanding protein function and the biochemical reactions involved. As large numbers of uncharacterized proteins accumulate rapidly in this post-genome era, predicting PFSs and PLIs at the residue level has become an urgent task. Many knowledge-based methods are now well developed for predicting PFSs; however, accurate methods for PLI prediction are still lacking. In this study, we present a new method for predicting PLIs and PFSs from the sequence of the query protein. The method hinges on a function- and interaction-annotated protein domain profile database, called fiDPD, which was built from the Structural Classification of Proteins (SCOP) database using a hidden Markov model program. The method was applied to 13 target proteins from the recent Critical Assessment of Structure Prediction experiments (CASP10/11). Our calculations gave a Matthews correlation coefficient (MCC) of 0.66 for the prediction of PFSs and an 80% recall for the prediction of PLIs. Our results indicate that PLIs are conserved during protein evolution and can be reliably predicted from fiDPD. fiDPD can be used as a complement to existing bioinformatics tools for protein function annotation.
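
The abstract does not spell out how annotations stored in a domain profile reach the query sequence. One plausible reading, offered here only as an assumption, is that annotated profile columns are mapped onto query residues through the profile-to-query alignment. The sketch below illustrates that idea with a hypothetical `transfer_annotations` helper and a made-up gapped alignment. In practice the alignment would come from an HMM search program, and the paper's actual procedure may differ.

```python
def transfer_annotations(profile_row, query_row, annotated_columns, query_start=1):
    """Map annotated profile columns to query residue numbers.

    profile_row, query_row: equal-length gapped alignment strings ('-' = gap).
    annotated_columns: 1-based profile columns carrying a site annotation.
    query_start: residue number of the first aligned query residue.
    (All names here are hypothetical, chosen for illustration only.)
    """
    if len(profile_row) != len(query_row):
        raise ValueError("alignment rows must have equal length")

    predicted = []
    profile_col = 0              # 1-based index of the current profile column
    query_pos = query_start - 1  # residue number of the current query residue
    for p, q in zip(profile_row, query_row):
        if p != "-":
            profile_col += 1
        if q != "-":
            query_pos += 1
        # Transfer only where an annotated column is aligned to a real residue.
        if p != "-" and q != "-" and profile_col in annotated_columns:
            predicted.append(query_pos)
    return predicted


# Made-up alignment: profile column 2 falls in a query gap, so it is skipped.
predicted_sites = transfer_annotations(
    profile_row="ACD-EFG",
    query_row="A-DKEYG",
    annotated_columns={1, 2, 4},
    query_start=10,
)
```

Here profile columns 1 and 4 align to query residues 10 and 13, while annotated column 2 has no aligned query residue and contributes no prediction. This mirrors the general intuition behind the paper's conservation claim: if a site is conserved within a domain family, its position in the family profile carries over to new members.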

---
Source: https://tomesphere.com/paper/1701.08086