Mixture models for spherical data with applications to protein bioinformatics

Kanti V. Mardia; Stuart Barber; Philippa M. Burdett; John T. Kent; Thomas Hamelryck

arXiv:2104.13140·math.ST·April 28, 2021

Open Access

TL;DR

This paper fits finite mixture models with Kent-distributed components to spherical data, using the exact maximum likelihood estimator for each component, and applies them to hydrogen bond geometries in proteins as a step toward understanding how proteins fold.

Contribution

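It replaces the approximate single-component estimator used in earlier work with the exact maximum likelihood estimator for each Kent component inside the EM algorithm, and applies the resulting mixture model to protein hydrogen bond data to study secondary structure interactions.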

Findings

1. Kent mixture models effectively capture hydrogen bond geometry
2. Distinct bond patterns are associated with secondary structures
3. Enhanced understanding of protein folding mechanisms

Abstract

Finite mixture models are fitted to spherical data. Kent distributions are used for the components of the mixture because they allow considerable flexibility. Previous work on such mixtures has used an approximate maximum likelihood estimator for the parameters of a single component. However, the approximation causes problems when using the EM algorithm to estimate the parameters in a mixture model. Hence the exact maximum likelihood estimator is used here for the individual components. This paper is motivated by a challenging prize problem in structural bioinformatics of how proteins fold. It is known that hydrogen bonds play a key role in the folding of a protein. We explore this hydrogen bond geometry using a data set describing bonds between two amino acids in proteins. An appropriate coordinate system to represent the hydrogen bond geometry is proposed, with each bond represented…
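To make the abstract's ingredients concrete, here is a minimal sketch of a Kent (FB5) log-density on the unit sphere and the E-step of an EM iteration for a mixture of such components. It assumes the standard parameterisation with orthonormal axes γ1 (mean direction), γ2 and γ3 (major and minor axes), concentration κ and ovalness β, and evaluates the normalising constant by its series in modified Bessel functions rather than by the large-κ approximation. All function and argument names (`bessel_i`, `kent_norm_const`, `kent_log_density`, `responsibilities`, `g1`, `g2`, `g3`) are illustrative and not taken from the paper, and the M-step with the exact maximum likelihood estimator is not shown.

```python
import math

def bessel_i(nu, x, terms=200):
    """Modified Bessel function I_nu(x) for x > 0, via its power series."""
    total = 0.0
    for m in range(terms):
        # Each term is (x/2)^(2m+nu) / (m! * Gamma(m+nu+1)), built in log space.
        log_term = ((2 * m + nu) * math.log(x / 2.0)
                    - math.lgamma(m + 1) - math.lgamma(m + nu + 1))
        total += math.exp(log_term)
    return total

def kent_norm_const(kappa, beta, terms=60):
    """Series for c(kappa, beta), the integral of the unnormalised Kent density."""
    total = 0.0
    for j in range(terms):
        coef = math.exp(math.lgamma(j + 0.5) - math.lgamma(j + 1))
        total += (coef * beta ** (2 * j)
                  * (kappa / 2.0) ** (-2 * j - 0.5)
                  * bessel_i(2 * j + 0.5, kappa))
    return 2.0 * math.pi * total

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def kent_log_density(x, g1, g2, g3, kappa, beta):
    """Log-density at unit vector x for axes g1, g2, g3 (assumed orthonormal)."""
    quad = dot(g2, x) ** 2 - dot(g3, x) ** 2
    return kappa * dot(g1, x) + beta * quad - math.log(kent_norm_const(kappa, beta))

def responsibilities(x, weights, components):
    """E-step: posterior probability that x came from each mixture component."""
    logs = [math.log(w) + kent_log_density(x, **c)
            for w, c in zip(weights, components)]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    unnorm = [math.exp(v - top) for v in logs]
    norm = sum(unnorm)
    return [u / norm for u in unnorm]

# Hypothetical two-component mixture: one mode at the north pole, one on the x-axis.
components = [
    dict(g1=(0, 0, 1), g2=(1, 0, 0), g3=(0, 1, 0), kappa=10.0, beta=2.0),
    dict(g1=(1, 0, 0), g2=(0, 1, 0), g3=(0, 0, 1), kappa=10.0, beta=2.0),
]
resp = responsibilities((0.0, 0.0, 1.0), [0.5, 0.5], components)
```

With these made-up parameters, a point at the north pole is assigned almost entirely to the first component. In a full EM fit, the responsibilities would weight each observation when the component parameters are re-estimated.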

Taxonomy

Topics: Bayesian Methods and Mixture Models