Multi-Armed Bandit Approach for Optimizing Training on Synthetic Data

Abdulrahman Kerim; Leandro Soriano Marcolino; Erickson R. Nascimento; Richard Jiang

arXiv:2412.05466·cs.LG·December 10, 2024

TL;DR

This paper introduces a UCB-based training procedure, paired with a dynamic usability metric, to make better use of synthetic data in supervised learning. The authors report classification accuracy gains of up to 10%.

Contribution

The paper presents a novel UCB-based training procedure and a dynamic usability metric for synthetic data, aimed at improving both model training and the data generation process.

Findings

01. Up to 10% improvement in classification accuracy.

02. Effective ranking of synthetic images based on usability.

03. Enhanced synthetic data generation with an attribute-aware bandit pipeline.

Abstract

Supervised machine learning methods require large-scale training datasets to perform well in practice. Synthetic data has progressed considerably in recent years and is increasingly used as a complement to real data. However, there remains a pressing need to assess the usability of synthetically generated data. To this end, we propose a novel UCB-based training procedure combined with a dynamic usability metric. Our proposed metric integrates low-level and high-level information from synthetic images and their corresponding real and synthetic datasets, surpassing existing traditional metrics. Utilizing a UCB-based dynamic approach ensures continual enhancement of model learning. Unlike other approaches, our method effectively adapts to changes in the machine learning model's state and considers the evolving utility of training samples during the training process. We show that our metric is an…
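
For readers unfamiliar with UCB (upper confidence bound) selection, the following is a minimal sketch of the textbook UCB1 rule, not the authors' procedure. It assumes, purely for illustration, that each arm is a group of synthetic images and that the reward is a simulated 0/1 usability signal. The names `UCB1`, `simulate`, and `true_usability` are hypothetical, and the fixed reward probabilities ignore the changing model state that the abstract says the paper's dynamic metric accounts for.

```python
import math
import random


class UCB1:
    """Textbook UCB1 over a fixed set of arms (hypothetical groups of synthetic images)."""

    def __init__(self, n_arms, c=2.0):
        self.counts = [0] * n_arms   # number of times each arm was selected
        self.means = [0.0] * n_arms  # running mean reward per arm
        self.c = c                   # exploration constant (assumed value)

    def select(self):
        # Play every arm once before the confidence bound is defined.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        total = sum(self.counts)
        # Score = mean reward + exploration bonus that shrinks as an arm is sampled more.
        scores = [
            m + math.sqrt(self.c * math.log(total) / n)
            for m, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean for the chosen arm.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def simulate(true_usability, rounds=2000, seed=0):
    """Run UCB1 against Bernoulli rewards with fixed, made-up success probabilities."""
    rng = random.Random(seed)
    bandit = UCB1(len(true_usability))
    for _ in range(rounds):
        arm = bandit.select()
        # Stand-in reward; a real pipeline would use a usability signal from training.
        reward = 1.0 if rng.random() < true_usability[arm] else 0.0
        bandit.update(arm, reward)
    return bandit


if __name__ == "__main__":
    result = simulate([0.1, 0.5, 0.9])
    print("selections per arm:", result.counts)
    print("estimated usability:", [round(m, 2) for m in result.means])
```

In this toy setting the selection counts concentrate on the arm with the highest simulated usability while the other arms are still sampled occasionally, which is the exploration and exploitation trade-off a UCB rule is meant to provide.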

Code & Models

Repositories

a-kerim/synthetic-data-usability-2024 (PyTorch, official)

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Games and Gamification · Advanced Bandit Algorithms Research

Methods: Diffusion