Balancing Predictive Relevance of Ligand Biochemical Activities
Marek Pecha

TL;DR
This paper introduces a method to balance the predictive relevance of ligand biochemical activity models using Support Vector Machines and Platt's scaling, addressing issues of dataset imbalance and outliers.
Contribution
It proposes Platt's scaling as a calibration step for balancing the predictive relevance of SVM models in ligand activity prediction, a novel approach in this context.
Findings
Effective balancing of model relevance achieved
Demonstrated on single-target models trained on datasets from the ExCAPE database
Focus on reducing uncertainty with deterministic solvers
Abstract
In this paper, we present a technique for balancing the predictive relevance of models used in supervised modelling of ligand biochemical activities against biological targets. We train uncalibrated models using a conventional supervised machine learning technique, namely Support Vector Machines (SVMs). Unfortunately, SVMs have a serious drawback: they are sensitive to imbalanced datasets, outliers, and high multicollinearity among training samples, which can cause a model to favour one group over another. Thus, additional calibration may be required to balance the predictive relevance of the models. As a technique for this balancing, we propose Platt's scaling. The results are demonstrated on single-target models trained on datasets exported from the ExCAPE database. Unlike traditionally used machine learning techniques, we focus on decreasing uncertainty by employing deterministic solvers.
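To make the calibration step concrete, the following is a minimal sketch of Platt's scaling in plain Python. It is not the author's implementation: the function names (`fit_platt`, `platt_probability`), the toy scores, and the choice of a damped Newton iteration are all assumptions made for illustration. It fits the two parameters A and B of the sigmoid P(y = 1 | f) = 1 / (1 + exp(A f + B)) to uncalibrated decision scores f, using Platt's smoothed targets, which depend on the class counts and so soften the effect of class imbalance. The fit uses no random numbers, so repeated runs give identical parameters.

```python
import math


def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def platt_probability(score, a, b):
    """Calibrated P(y = 1 | score) = 1 / (1 + exp(a * score + b))."""
    return math.exp(-log1pexp(a * score + b))


def fit_platt(scores, labels, max_iter=100, tol=1e-10):
    """Fit (a, b) by damped Newton steps on the cross-entropy loss.

    Hypothetical helper: `scores` are uncalibrated decision values
    (e.g. signed distances from an SVM hyperplane), `labels` are 0/1.
    Assumes the scores are not all identical.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # Platt's smoothed targets instead of hard 0/1 labels.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y == 1 else t_neg for y in labels]

    def loss(a, b):
        # Cross-entropy between smoothed targets and sigmoid outputs.
        return sum(log1pexp(a * f + b) - (1.0 - t) * (a * f + b)
                   for f, t in zip(scores, targets))

    a, b = 0.0, math.log((n_neg + 1.0) / (n_pos + 1.0))
    current = loss(a, b)
    for _ in range(max_iter):
        g_a = g_b = h_ab = 0.0
        h_aa = h_bb = 1e-12  # tiny ridge keeps the Hessian invertible
        for f, t in zip(scores, targets):
            p = platt_probability(f, a, b)
            d, w = t - p, p * (1.0 - p)
            g_a += d * f
            g_b += d
            h_aa += w * f * f
            h_ab += w * f
            h_bb += w
        if abs(g_a) < tol and abs(g_b) < tol:
            break
        # Newton direction: solve the 2x2 system H * step = -g.
        det = h_aa * h_bb - h_ab * h_ab
        step_a = -(h_bb * g_a - h_ab * g_b) / det
        step_b = -(h_aa * g_b - h_ab * g_a) / det
        # Backtracking: halve the step until the loss decreases.
        scale = 1.0
        while scale > 1e-10:
            trial = loss(a + scale * step_a, b + scale * step_b)
            if trial < current:
                a, b, current = a + scale * step_a, b + scale * step_b, trial
                break
            scale *= 0.5
        else:
            break  # no improving step found; stop iterating
    return a, b


# Toy, imbalanced example (made-up scores, 3 actives vs 7 inactives).
scores = [-2.0, -1.5, -1.2, -1.0, -0.8, -0.5, -0.3, 0.1, 0.4, 1.5]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
a, b = fit_platt(scores, labels)
for f in (-1.0, 0.0, 1.0):
    print(f, round(platt_probability(f, a, b), 3))
```

In practice the scores would come from a trained SVM's decision function evaluated on held-out data, so that the calibration is not fitted on the same samples used to train the classifier.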
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science
