Develop machine learning based predictive models for engineering protein solubility

X. Han; X. Wang; K. Zhou

arXiv:1806.11369·q-bio.QM·July 23, 2018·1 cites

Open Access

TL;DR

This paper develops machine learning models to predict protein solubility as a continuous variable from amino acid sequences, aiding protein engineering and potentially serving as an indirect predictor of protein activity.

Contribution

It introduces an approach that predicts protein solubility as a continuous value rather than a binary label, and reports 76.28% prediction accuracy with a support vector machine (SVM).

Findings

01. Achieved 76.28% prediction accuracy using an SVM.

02. Predicting solubility as a continuous value supports protein engineering.

03. Because activity and solubility are correlated for some proteins, the models could serve as an indirect predictor of protein activity from sequence.

Abstract

Protein activity is a significant characteristic of recombinant proteins that can be used as biocatalysts. High protein activity reduces the cost of biocatalysts. A model that can predict protein activity from amino acid sequence is highly desired, as it aids experimental improvement of proteins. However, only limited data for protein activity are currently available, which prevents the development of such models. Since protein activity and solubility are correlated for some proteins, the publicly available solubility dataset may be adopted to develop models that can predict protein solubility from sequence. The models could serve as a tool to indirectly predict protein activity from sequence. In the literature, predicting protein solubility from sequence has been intensively explored, but the predicted solubility represented in binary values from all the developed models was not…
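The abstract describes models that take an amino acid sequence as input. As a minimal sketch of how a sequence might be turned into a fixed-length numeric vector that a regressor such as an SVM could consume, the snippet below computes amino acid composition (the fraction of each of the 20 standard residues). The choice of feature, the function name `amino_acid_composition`, and the example sequence are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every
# sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration; non-standard characters
    (for example 'X' or '*') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Made-up example sequence, not from the paper's dataset.
features = amino_acid_composition("MKTAYIAKQR")
```

Each sequence yields a 20-element vector whose entries sum to 1, so proteins of different lengths become comparable inputs. A continuous-valued model would then be fit on such vectors against measured solubility values.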

Taxonomy

Topics: Protein purification and stability · Protein Structure and Dynamics · Microbial Metabolic Engineering and Bioproduction