Statistical inference for template-based protein structure prediction
Jian Peng

TL;DR
This thesis introduces statistical machine learning methods, including a regression-tree-based CRF model and a probabilistic consistency approach, to improve the accuracy of template-based protein structure prediction, especially for proteins with only distantly related templates and limited evolutionary information.
Contribution
It develops statistical models for protein-template alignment that improve accuracy in low-homology cases, and it integrates multiple templates to improve structure prediction.
Findings
Improved alignment accuracy for distantly related proteins.
Effective use of nonlinear scoring functions and structural features.
Enhanced structure prediction through multi-template alignment.
Abstract
Protein structure prediction is one of the most important problems in computational biology. The most successful computational approach, also called template-based modeling, identifies templates with solved crystal structures for the query proteins and constructs three-dimensional models based on sequence/structure alignments. Although substantial effort has been made to improve protein sequence alignment, the accuracy of alignments between distantly related proteins is still unsatisfactory. In this thesis, I will introduce a number of statistical machine learning methods to build accurate alignments between a protein sequence and its template structures, especially for proteins having only distantly related templates. For a protein with only one good template, we develop a regression-tree-based Conditional Random Fields (CRF) model for pairwise protein sequence/structure alignment. By…
Taxonomy
Topics: Protein Structure and Dynamics · Machine Learning in Bioinformatics · Enzyme Structure and Function
