Reinforced Data Sampling for Model Diversification
Hoang D. Nguyen, Xuan-Son Vu, Quoc-Tuan Truong, Duc-Trong Le

TL;DR
This paper introduces Reinforced Data Sampling (RDS), a novel method that uses reinforcement learning to select data subsets promoting model diversity, thereby improving performance in machine learning tasks and competitions.
Contribution
The work proposes a reinforcement learning-based data sampling approach that optimizes model diversification and demonstrates its effectiveness across multiple datasets.
Findings
RDS outperforms traditional sampling methods in experiments.
RDS enhances model diversity and learning potential.
The method is applicable to classification and regression tasks.
Abstract
With the rising number of machine learning competitions, the world has witnessed an exciting race for the best algorithms. However, the involved data selection process may fundamentally suffer from evidence ambiguity and concept drift issues, thereby possibly leading to deleterious effects on the performance of various models. This paper proposes a new Reinforced Data Sampling (RDS) method to learn how to sample data adequately on the search for useful models and insights. We formulate the optimisation problem of model diversification in data sampling to maximise learning potentials and optimum allocation by injecting model diversity. This work advocates the employment of diverse base learners as value functions such as neural networks, decision trees, or logistic regressions to reinforce the selection process of data subsets with multi-modal belief. We introduce…
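The selection loop the abstract describes can be illustrated with a small sketch. This is not the authors' RDS implementation: it assumes a per-sample Bernoulli sampling policy updated with a plain REINFORCE rule, and it uses two deliberately different toy base learners (a least-squares line and a 1-nearest-neighbour regressor) whose averaged held-out error serves as the reward. All names here (`fit_linear`, `fit_1nn`, `split_reward`, `learn_sampling_policy`) and the hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares line with an intercept; returns a predict function."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w

def fit_1nn(X, y):
    """1-nearest-neighbour regressor on a single feature."""
    def predict(Z):
        dist = np.abs(Z[:, 0][:, None] - X[:, 0][None, :])
        return y[dist.argmin(axis=1)]
    return predict

# Assumption: two heterogeneous learners stand in for the "diverse base learners".
BASE_LEARNERS = [fit_linear, fit_1nn]

def split_reward(X, y, mask):
    """Negative mean held-out MSE, averaged over all base learners."""
    train, test = mask == 1, mask == 0
    if train.sum() < 2 or test.sum() < 1:
        return -10.0  # arbitrary penalty for a degenerate split
    errors = []
    for fit in BASE_LEARNERS:
        predict = fit(X[train], y[train])
        errors.append(np.mean((predict(X[test]) - y[test]) ** 2))
    return -float(np.mean(errors))

def learn_sampling_policy(X, y, steps=200, lr=0.5, seed=0):
    """REINFORCE over per-sample logits; returns (train probabilities, rewards)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(len(X))  # one logit per sample, so p starts at 0.5
    baseline = None
    history = []
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-theta))
        mask = (rng.random(len(X)) < p).astype(int)  # 1 = train, 0 = held out
        r = split_reward(X, y, mask)
        if baseline is None:
            baseline = r
        # Gradient of log Bernoulli(mask; sigmoid(theta)) w.r.t. theta is (mask - p).
        theta += lr * (r - baseline) * (mask - p)
        baseline = 0.9 * baseline + 0.1 * r  # moving-average baseline
        history.append(r)
    return 1.0 / (1.0 + np.exp(-theta)), history

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.uniform(-1, 1, size=(40, 1))
    y = 2 * X[:, 0] + 0.1 * rng.normal(size=40)
    probs, rewards = learn_sampling_policy(X, y)
    print(probs.round(2))
```

Averaging the reward over learners with different inductive biases is one simple way to keep a split from favouring a single model family; the paper's own formulation of model diversification may differ from this stand-in.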
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Machine Learning and Algorithms
