Active Learning with Statistical Models
D. A. Cohn, Z. Ghahramani, M. I. Jordan

TL;DR
This paper reviews statistically optimal data selection for feedforward neural networks and extends it to mixtures of Gaussians and locally weighted regression, showing empirically that it sharply reduces the number of training examples needed for good performance.
Contribution
It extends optimal data selection techniques to mixtures of Gaussians and locally weighted regression. For these two architectures the techniques are both efficient and accurate, whereas the neural network techniques are computationally expensive and approximate.
Findings
Optimal data selection sharply reduces the number of training examples needed for good performance.
For mixtures of Gaussians and locally weighted regression, the data selection techniques are both efficient and accurate.
For feedforward neural networks, the techniques are computationally expensive and approximate.
Abstract
For many types of machine learning algorithms, one can compute the statistically 'optimal' way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.
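To make the idea of choosing training data by an optimality criterion concrete, here is a minimal sketch of variance-minimizing query selection. It is not one of the learners discussed in the abstract: as a deliberately simple stand-in it assumes a straight-line least-squares fit with known, constant noise variance, for which the predictive variance has a closed form that does not depend on the yet-unseen label. The function names, the candidate list, and the reference grid are all hypothetical choices made for illustration.

```python
def avg_pred_variance(xs, refs, noise_var=1.0):
    """Average predictive variance of a least-squares line y = a + b*x,
    taken over the reference inputs `refs`, given training inputs `xs`.

    Uses Var[y_hat(r)] = noise_var * phi(r)^T (Phi^T Phi)^{-1} phi(r)
    with phi(x) = [1, x]. Assumes known, constant noise variance.
    """
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx  # determinant of [[n, sx], [sx, sxx]]
    if det <= 1e-12:
        raise ValueError("need at least two distinct training inputs")
    total = 0.0
    for r in refs:
        # phi^T A^{-1} phi, written out for the 2x2 case
        total += (sxx - 2.0 * sx * r + n * r * r) / det
    return noise_var * total / len(refs)


def select_query(xs, candidates, refs, noise_var=1.0):
    """Return the candidate input whose addition to `xs` gives the
    lowest average predictive variance over `refs`."""
    return min(
        candidates,
        key=lambda c: avg_pred_variance(xs + [c], refs, noise_var),
    )


if __name__ == "__main__":
    train_x = [0.0, 0.1, 0.2]              # existing inputs, clustered near 0
    candidates = [0.1, 0.5, 1.0]           # hypothetical query locations
    refs = [i / 10 for i in range(11)]     # reference grid on [0, 1]
    print(select_query(train_x, candidates, refs))
```

In this setup the existing inputs sit near 0, so the candidate at the far end of the interval reduces the average variance the most and is the one selected. For learners without a closed-form variance, the same loop would need an approximation of the variance term, which is where the cost and accuracy differences mentioned in the abstract arise.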
Taxonomy
Topics: Machine Learning and Algorithms · Neural Networks and Applications · Fault Detection and Control Systems
