Don't guess what's true: choose what's optimal. A probability transducer for machine-learning classifiers
K. Dyrland, A. S. Lundervold, P.G.L. Porta Mana

TL;DR
This paper introduces a probability transducer that converts classifier outputs into reliable probabilities, enabling optimal decision-making in critical applications like medicine and drug discovery, especially with imbalanced datasets.
Contribution
The paper proposes a computationally cheap method for deriving class probabilities conditional on a trained classifier's output rather than on the features, which makes decision-theoretic optimization practical in complex, real-world problems.
Findings
The transducer improves decision accuracy in drug discovery tasks.
The method achieves near-optimal results across various utility settings.
It provides uncertainty quantification and supports biased dataset scenarios.
Abstract
In fields such as medicine and drug discovery, the ultimate goal of a classification is not to guess a class, but to choose the optimal course of action among a set of possible ones, usually not in one-one correspondence with the set of classes. This decision-theoretic problem requires sensible probabilities for the classes. Probabilities conditional on the features are computationally almost impossible to find in many important cases. The main idea of the present work is to calculate probabilities conditional not on the features, but on the trained classifier's output. This calculation is cheap, needs to be made only once, and provides an output-to-probability "transducer" that can be applied to all future outputs of the classifier. In conjunction with problem-dependent utilities, the probabilities of the transducer allow us to find the optimal choice among the classes or among a set…
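The two steps described in the abstract, mapping a classifier output to class probabilities and then picking the action with the highest expected utility, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the probabilities conditional on the output are estimated, so the sketch swaps in simple histogram binning with additive smoothing on a calibration set. The function names (fit_transducer, transduce, optimal_decisions), the scalar score in [0, 1], and the utility values are all hypothetical.

```python
import numpy as np

def fit_transducer(outputs, labels, n_classes, n_bins=10, alpha=1.0):
    """Estimate P(class | output bin) once, from a calibration set.

    Assumption: `outputs` are scalar classifier scores in [0, 1].
    Histogram binning with additive smoothing (`alpha`) is a stand-in
    estimator chosen for brevity, not the method used in the paper.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Use interior edges only, so scores map to bin indices 0..n_bins-1.
    bins = np.digitize(outputs, edges[1:-1])
    counts = np.full((n_bins, n_classes), alpha, dtype=float)
    np.add.at(counts, (bins, labels), 1.0)  # accumulate class counts per bin
    probs = counts / counts.sum(axis=1, keepdims=True)
    return edges, probs

def transduce(edges, probs, outputs):
    """Map new classifier outputs to class-probability vectors."""
    return probs[np.digitize(outputs, edges[1:-1])]

def optimal_decisions(class_probs, utilities):
    """Pick the decision with maximal expected utility.

    utilities[d, c] is the utility of decision d when the true class is c;
    the number of decisions need not equal the number of classes.
    """
    expected = class_probs @ utilities.T  # shape: (n_samples, n_decisions)
    return expected.argmax(axis=1)

# Tiny calibration set: two low scores (both class 0), three high scores.
cal_outputs = np.array([0.05, 0.05, 0.95, 0.95, 0.95])
cal_labels = np.array([0, 0, 1, 1, 0])
edges, probs = fit_transducer(cal_outputs, cal_labels, n_classes=2, n_bins=2)

# Three possible decisions for two classes (hypothetical utilities):
# act as if class 0, act as if class 1, or take a cautious third action.
utilities = np.array([[1.0, 0.0],
                      [0.0, 1.0],
                      [0.7, 0.7]])
new_probs = transduce(edges, probs, np.array([0.1, 0.9]))
decisions = optimal_decisions(new_probs, utilities)
```

The utility matrix has three rows and two columns to reflect the abstract's point that the set of actions is usually not in one-one correspondence with the set of classes. With these illustrative numbers, the high score carries only a 0.6 probability of class 1, so the cautious third action has the highest expected utility for it, even though class 1 is the most probable class.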
Taxonomy
Topics: Computational Drug Discovery Methods · Statistical and Computational Modeling
