Balancing Predictive Relevance of Ligand Biochemical Activities

Marek Pecha

arXiv:2104.02307·cs.LG·April 7, 2021

TL;DR

This paper introduces a method to balance the predictive relevance of ligand biochemical activity models using Support Vector Machines and Platt's scaling, addressing issues of dataset imbalance and outliers.

Contribution

It proposes Platt's scaling as a calibration step for balancing the predictive relevance of SVM models in ligand activity prediction, which it presents as a novel approach in this context.

Findings

01 Effective balancing of model relevance achieved

02 Demonstrated on datasets from ExCAPE database

03 Focus on reducing uncertainty with deterministic solvers

Abstract

In this paper, we present a technique for balancing the predictive relevance of models used in supervised modelling of ligand biochemical activities against biological targets. We train uncalibrated models using a conventional supervised machine learning technique, namely Support Vector Machines (SVMs). Unfortunately, SVMs have a serious drawback: they are sensitive to imbalanced datasets, outliers, and high multicollinearity among training samples, which can cause a model to favour one group over another. Thus, additional calibration may be required to balance the predictive relevance of the models. As a technique for this balancing, we propose Platt's scaling. The results are demonstrated on single-target models trained on datasets exported from the ExCAPE database. Unlike traditionally used machine learning techniques, we focus on decreasing uncertainty by employing deterministic solvers.
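To make the calibration step concrete, here is a minimal sketch of Platt's scaling in plain Python. It fits the sigmoid P(y = 1 | f) = 1 / (1 + exp(A f + B)) to uncalibrated decision scores f by minimising the negative log-likelihood with Platt's smoothed targets. The scores and labels are made-up illustrative values, not data from the ExCAPE database, and the names `platt_probability` and `fit_platt` are hypothetical. The Newton iteration with backtracking is one deterministic way to solve the fit; it is an assumption here and not necessarily the solver used in the paper.

```python
import math

def platt_probability(score, a, b):
    """P(y = 1 | score) = 1 / (1 + exp(a * score + b)), computed stably."""
    z = a * score + b
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def _nll(scores, targets, a, b):
    """Negative log-likelihood: sum of log(1 + exp(z)) - (1 - t) * z."""
    total = 0.0
    for f, t in zip(scores, targets):
        z = a * f + b
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - (1.0 - t) * z
    return total

def fit_platt(scores, labels, max_iter=100, tol=1e-10):
    """Fit (A, B) by Newton's method with backtracking (a deterministic solve)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Platt's smoothed targets reduce overfitting on small or imbalanced sets.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]
    a, b = 0.0, math.log((n_neg + 1.0) / (n_pos + 1.0))
    for _ in range(max_iter):
        g_a = g_b = 0.0
        h_aa = h_bb = 1e-12  # tiny ridge keeps the Hessian invertible
        h_ab = 0.0
        for f, t in zip(scores, targets):
            p = platt_probability(f, a, b)
            d = t - p          # derivative of the loss with respect to z
            w = p * (1.0 - p)  # second derivative with respect to z
            g_a += d * f
            g_b += d
            h_aa += w * f * f
            h_ab += w * f
            h_bb += w
        if abs(g_a) < tol and abs(g_b) < tol:
            break
        det = h_aa * h_bb - h_ab * h_ab
        da = -(h_bb * g_a - h_ab * g_b) / det
        db = -(h_aa * g_b - h_ab * g_a) / det
        # Backtracking: halve the Newton step until the loss decreases.
        current, step, improved = _nll(scores, targets, a, b), 1.0, False
        while step > 1e-10:
            if _nll(scores, targets, a + step * da, b + step * db) < current:
                a, b, improved = a + step * da, b + step * db, True
                break
            step *= 0.5
        if not improved:
            break
    return a, b

# Illustrative, imbalanced toy data: SVM-style decision scores and 0/1 labels.
scores = [-2.1, -1.7, -1.2, -0.9, -0.4, 0.2, 0.6, 1.1, 1.9]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 1]
a, b = fit_platt(scores, labels)
calibrated = [platt_probability(f, a, b) for f in scores]
```

In practice the sigmoid would be fitted on held-out decision values, not on the scores the SVM was trained on, so that the calibration does not inherit the bias of the training fit.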

Taxonomy

Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science