Comparative Evaluation of Applicability Domain Definition Methods for Regression Models
Shakir Khurshid, Bharath Kumar Loganathan, Matthieu Duvinage

TL;DR
This paper evaluates various methods for defining the applicability domain of regression models, introduces a novel Bayesian neural network approach, and benchmarks their performance across multiple datasets to improve prediction reliability.
Contribution
It presents a comprehensive benchmark of eight applicability domain methods and proposes a new Bayesian neural network-based approach for more accurate applicability domain detection.
Findings
The Bayesian neural network approach outperformed existing methods in accuracy.
Benchmark results show variability in effectiveness across different datasets.
The proposed method enhances the reliability of model predictions.
Abstract
The applicability domain is the range of data over which a predictive model's predictions are expected to be reliable and accurate; using a model outside its applicability domain can lead to incorrect results. The ability to define the regions of data space where a predictive model can be safely used is a necessary condition for safer and more reliable predictions. However, defining the applicability domain of a model is a challenging problem, as there is no clear and universal definition or metric for it. This work aims to make the applicability domain more quantifiable and pragmatic. Eight applicability domain detection techniques were applied to seven regression models, trained on five different datasets, and their performance was benchmarked using a validation framework. We also propose a novel approach based on…
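To make the idea of an applicability domain concrete, here is a minimal sketch of one generic, distance-based way to flag whether a new sample lies inside the region covered by the training data. It is only an illustration under stated assumptions: it is not claimed to be one of the eight techniques benchmarked in the paper, nor the proposed Bayesian neural network approach. The function names, the choice of k = 5 neighbours, and the 95th-percentile cutoff are all hypothetical.

```python
import numpy as np


def knn_mean_distance(X_ref, X_query, k, exclude_self=False):
    """Mean Euclidean distance from each query point to its k nearest reference points."""
    # Pairwise distances, shape (n_query, n_ref).
    d = np.linalg.norm(X_query[:, None, :] - X_ref[None, :, :], axis=2)
    d.sort(axis=1)
    # When scoring the training set against itself, skip the zero self-distance.
    start = 1 if exclude_self else 0
    return d[:, start:start + k].mean(axis=1)


def fit_domain_threshold(X_train, k=5, percentile=95.0):
    """Hypothetical cutoff: a percentile of the training points' own kNN scores."""
    train_scores = knn_mean_distance(X_train, X_train, k, exclude_self=True)
    return float(np.percentile(train_scores, percentile))


def in_domain(X_train, X_new, threshold, k=5):
    """True where a new point is at least as close to the training data as the cutoff."""
    return knn_mean_distance(X_train, X_new, k) <= threshold


# Synthetic illustration only; these are not data from the paper.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
threshold = fit_domain_threshold(X_train)

# One point at the centre of the training cloud, one far away from it.
X_new = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
flags = in_domain(X_train, X_new, threshold)
print(flags)
```

In this sketch, predictions for samples flagged as out of domain would be treated with caution rather than trusted. The cutoff trades coverage against reliability: a lower percentile rejects more samples but keeps the retained ones closer to the training data.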
Taxonomy
Topics: Machine Learning and Data Classification
