Reinforced Data Sampling for Model Diversification

Hoang D. Nguyen; Xuan-Son Vu; Quoc-Tuan Truong; Duc-Trong Le

arXiv:2006.07100 · cs.LG · June 15, 2020 · 5 cites

TL;DR

This paper introduces Reinforced Data Sampling (RDS), a novel method that uses reinforcement learning to select data subsets promoting model diversity, thereby improving performance in machine learning tasks and competitions.

Contribution

The work proposes a reinforcement learning-based data sampling approach that optimizes model diversification and demonstrates its effectiveness across multiple datasets.

Findings

1. RDS outperforms traditional sampling methods in experiments.
2. RDS enhances model diversity and learning potential.
3. The method is applicable to classification and regression tasks.

Abstract

With the rising number of machine learning competitions, the world has witnessed an exciting race for the best algorithms. However, the involved data selection process may fundamentally suffer from evidence ambiguity and concept drift issues, thereby possibly leading to deleterious effects on the performance of various models. This paper proposes a new Reinforced Data Sampling (RDS) method to learn how to sample data adequately on the search for useful models and insights. We formulate the optimisation problem of model diversification $\delta\text{-div}$ in data sampling to maximise learning potentials and optimum allocation by injecting model diversity. This work advocates the employment of diverse base learners as value functions such as neural networks, decision trees, or logistic regressions to reinforce the selection process of data subsets with multi-modal belief. We introduce…
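The general idea in the abstract, a sampling policy that is reinforced by feedback from several different base learners, can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not define the $\delta$-div objective or the update rule, so everything below is an assumption made for illustration, including the per-sample Bernoulli policy, the REINFORCE-style update, the reward (mean held-out accuracy across learners), and the hypothetical helper names `centroid_learner`, `stump_learner`, and `reinforce_sampler`.

```python
import math
import random

def centroid_learner(train):
    """Nearest-centroid classifier on 1-D inputs; returns a predict function."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs) if xs else 0.0
    return lambda x: min((0, 1), key=lambda c: abs(x - means[c]))

def stump_learner(train):
    """Decision stump: pick the threshold with the best training accuracy."""
    best_t, best_acc = 0.0, -1.0
    for t, _ in train:
        acc = sum((x > t) == bool(y) for x, y in train) / len(train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda x: int(x > best_t)

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

def reinforce_sampler(data, learners, episodes=200, lr=0.5, seed=0):
    """Learn per-sample inclusion probabilities (assumed policy form)."""
    rng = random.Random(seed)
    logits = [0.0] * len(data)   # one policy parameter per sample
    baseline = 0.0               # moving average of rewards
    for _ in range(episodes):
        probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        actions = [1 if rng.random() < p else 0 for p in probs]
        train = [d for d, a in zip(data, actions) if a == 1]
        test = [d for d, a in zip(data, actions) if a == 0]
        if not train or not test:
            continue  # skip degenerate splits
        # Assumed reward: mean held-out accuracy over the diverse learners.
        reward = sum(accuracy(fit(train), test) for fit in learners) / len(learners)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # REINFORCE gradient of the Bernoulli log-likelihood is (action - prob).
        logits = [z + lr * advantage * (a - p)
                  for z, a, p in zip(logits, actions, probs)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits], baseline

# Toy two-class data: 1-D Gaussians centred at 0 and 2.
data_rng = random.Random(1)
data = [(data_rng.gauss(0.0, 1.0), 0) for _ in range(20)] + \
       [(data_rng.gauss(2.0, 1.0), 1) for _ in range(20)]
probs, avg_reward = reinforce_sampler(data, [centroid_learner, stump_learner])
```

Here `probs[i]` is the learned probability of placing sample `i` in the training subset, and `avg_reward` is the running reward baseline. Swapping in other base learners only requires a function that takes a training list and returns a predictor.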

Code & Models

Repositories

probeu/RDS (PyTorch, official)

Taxonomy

Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Machine Learning and Algorithms