Learning optimal Bayesian prior probabilities from data
Ozan Kaan Kayaalp

TL;DR
This paper introduces a machine learning approach to learn optimal Bayesian prior probabilities from data, outperforming traditional uniform priors in text classification tasks.
Contribution
It proposes a novel method for learning priors from data by maximizing a target function, demonstrated through Wikipedia article categorization experiments.
Findings
Study models outperformed baseline models with statistical significance.
Performance improved by up to 443%, and by 193% on average.
Learned priors significantly improved classification accuracy.
Abstract
Noninformative uniform priors are staples of Bayesian inference, especially in Bayesian machine learning. This study challenges the assumption that they are optimal and that their use in Bayesian inference yields optimal outcomes. Instead of using arbitrary noninformative uniform priors, we propose a machine learning based alternative: learning optimal priors from data by maximizing a target function of interest. Applying naïve Bayes text classification methodology and a search algorithm developed for this study, our system learned priors from data using the positive predictive value metric as the target function. The task was to find Wikipedia articles that had not (but should have) been categorized under certain Wikipedia categories. We conducted five sets of experiments using separate Wikipedia categories. While the baseline models used the popular Bayes-Laplace priors, the study…
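The idea of choosing a prior by maximizing a target function can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the prior is modeled as a single symmetric pseudo-count `alpha` on word probabilities (with `alpha = 1` corresponding to the Bayes-Laplace baseline), the search is a plain grid search standing in for the study's own search algorithm, and all function names (`train_counts`, `predict`, `ppv`, `learn_alpha`) are hypothetical.

```python
import math
from collections import Counter


def train_counts(docs, labels):
    """Count word occurrences per class (True = in category, False = not)."""
    counts = {True: Counter(), False: Counter()}
    n_docs = Counter(labels)
    for words, label in zip(docs, labels):
        counts[label].update(words)
    return counts, n_docs


def predict(words, counts, n_docs, vocab, alpha):
    """Naive Bayes decision; alpha is an assumed symmetric pseudo-count (> 0)."""
    total_docs = sum(n_docs.values())
    scores = {}
    for c in (True, False):
        total_words = sum(counts[c].values())
        # Class prior estimated from training frequencies (both classes must occur).
        score = math.log(n_docs[c] / total_docs)
        for w in words:
            if w in vocab:  # words unseen in training are ignored
                score += math.log(
                    (counts[c][w] + alpha) / (total_words + alpha * len(vocab))
                )
        scores[c] = score
    return scores[True] > scores[False]


def ppv(predictions, labels):
    """Positive predictive value: TP / (TP + FP), or 0.0 if nothing is predicted positive."""
    tp = sum(p and y for p, y in zip(predictions, labels))
    fp = sum(p and not y for p, y in zip(predictions, labels))
    return tp / (tp + fp) if tp + fp else 0.0


def learn_alpha(train_docs, train_labels, val_docs, val_labels, grid):
    """Grid search (a stand-in, not the paper's algorithm) for the alpha maximizing PPV."""
    counts, n_docs = train_counts(train_docs, train_labels)
    vocab = set(counts[True]) | set(counts[False])
    best_alpha, best_ppv = None, -1.0
    for alpha in grid:
        preds = [predict(d, counts, n_docs, vocab, alpha) for d in val_docs]
        score = ppv(preds, val_labels)
        if score > best_ppv:
            best_alpha, best_ppv = alpha, score
    return best_alpha, best_ppv
```

Calling `learn_alpha` with tokenized training and validation documents and a grid that includes `1.0` returns a pseudo-count whose validation PPV is at least that of the Bayes-Laplace setting, since the baseline is itself one of the candidates. Whether such a gain carries over to unseen data would need a separate held-out test set.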
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Machine Learning and Algorithms · Machine Learning and Data Classification
