ImitAL: Learning Active Learning Strategies from Synthetic Data
Julius Gonsior, Maik Thiele, Wolfgang Lehner

TL;DR
ImitAL is a learned active learning query strategy, trained via imitation learning on purely synthetic data. It is reported to outperform 10 state-of-the-art strategies on 15 datasets from diverse domains while being more computationally efficient, especially on large datasets.
Contribution
The paper presents ImitAL, a new active learning query strategy that frames query selection as a learning-to-rank problem. The underlying ranking network is trained via imitation learning on synthetic data.
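The learning-to-rank framing can be illustrated with a small, hedged sketch. This summary does not describe ImitAL's input features, its expert, or its network, so everything below is an assumption made for illustration: the "expert" is a hidden linear preference over two hypothetical candidate features, and the learned policy is a linear scorer fitted with a pairwise logistic loss (the loss family used by RankNet) rather than the paper's neural network.

```python
import math
import random

rng = random.Random(0)

# Toy stand-in for the expert (an assumption, not the paper's expert):
# a hidden linear preference over two hypothetical candidate features.
EXPERT_W = (2.0, 1.0)

def expert_score(f):
    return EXPERT_W[0] * f[0] + EXPERT_W[1] * f[1]

def make_pairs(n):
    """Sample candidate pairs and record which one the expert ranks higher."""
    pairs = []
    for _ in range(n):
        a = (rng.random(), rng.random())
        b = (rng.random(), rng.random())
        pairs.append((a, b, 1.0 if expert_score(a) > expert_score(b) else 0.0))
    return pairs

def score(w, f):
    """Learned linear score of one candidate's feature vector."""
    return w[0] * f[0] + w[1] * f[1]

def train_ranker(pairs, epochs=50, lr=0.5):
    """Imitate the expert's pairwise preferences with a logistic loss."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for a, b, y in pairs:
            d = (a[0] - b[0], a[1] - b[1])
            z = max(-30.0, min(30.0, score(w, d)))  # clamp for numeric safety
            p = 1.0 / (1.0 + math.exp(-z))          # P(a ranked above b)
            w[0] += lr * (y - p) * d[0]
            w[1] += lr * (y - p) * d[1]
    return w

w = train_ranker(make_pairs(200))

# Fraction of fresh pairs on which the learned ranker agrees with the expert.
test_pairs = make_pairs(200)
agreement = sum(
    (score(w, a) > score(w, b)) == (y == 1.0) for a, b, y in test_pairs
) / len(test_pairs)

# Query selection as ranking: sort unlabeled candidates by learned score
# and ask for the label of the top-ranked one.
candidates = [(rng.random(), rng.random()) for _ in range(10)]
ranking = sorted(range(len(candidates)),
                 key=lambda i: score(w, candidates[i]), reverse=True)
query_index = ranking[0]
```

Once trained, such a scorer needs only one forward pass per candidate at query time, which is one plausible reason a learned ranking strategy could be cheaper than strategies that recompute expensive statistics for every query.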
Findings
Outperforms 10 state-of-the-art strategies on 15 datasets
More computationally efficient, especially on large datasets
Demonstrates broad domain applicability
Abstract
One of the biggest challenges that complicates applied supervised machine learning is the need for huge amounts of labeled data. Active Learning (AL) is a well-known standard method for efficiently obtaining labeled data by first labeling the samples that contain the most information based on a query strategy. Although many methods for query strategies have been proposed in the past, no clear superior method that works well in general for all domains has been found yet. Additionally, many strategies are computationally expensive, which further hinders the widespread use of AL for large-scale annotation projects. We, therefore, propose ImitAL, a novel query strategy, which encodes AL as a learning-to-rank problem. For training the underlying neural network we chose Imitation Learning. The required demonstrative expert experience for training is generated from purely synthetic data. To…
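For readers unfamiliar with the setting, the pool-based active learning loop that a query strategy plugs into can be sketched as follows. This is a minimal illustration, not ImitAL: the query strategy is plain margin-based uncertainty sampling, the classifier is a nearest-centroid model, and the two-cluster dataset, the seed indices, and all function names are assumptions chosen to keep the example self-contained.

```python
import math
import random

def centroids(points, labels):
    """Per-class mean of the currently labeled points."""
    sums = {}
    for (x, y), c in zip(points, labels):
        sx, sy, n = sums.get(c, (0.0, 0.0, 0))
        sums[c] = (sx + x, sy + y, n + 1)
    return {c: (sx / n, sy / n) for c, (sx, sy, n) in sums.items()}

def predict(cents, p):
    """Nearest-centroid prediction for point p."""
    return min(cents, key=lambda c: math.dist(cents[c], p))

def margin(cents, p):
    """Gap between the two nearest centroid distances; small means uncertain."""
    d = sorted(math.dist(c, p) for c in cents.values())
    return d[1] - d[0]

def active_learning(pool, oracle, seed_idx, budget):
    """Repeatedly query the label of the most uncertain unlabeled sample."""
    labeled = {i: oracle(i) for i in seed_idx}
    for _ in range(budget):
        cents = centroids([pool[i] for i in labeled], list(labeled.values()))
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        query = min(unlabeled, key=lambda i: margin(cents, pool[i]))
        labeled[query] = oracle(query)  # ask the annotator for one label
    return labeled

# Synthetic two-class pool (hypothetical data for illustration only).
rng = random.Random(0)
pool = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)] + \
       [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(20)]
truth = [0] * 20 + [1] * 20
seed_idx = [0, 20]  # one seed label per class so the margin is defined

labeled = active_learning(pool, lambda i: truth[i], seed_idx, budget=5)
cents = centroids([pool[i] for i in labeled], list(labeled.values()))
accuracy = sum(predict(cents, p) == t for p, t in zip(pool, truth)) / len(pool)
```

The `min(unlabeled, key=...)` line is the query strategy. A learned strategy in the spirit of the abstract would replace the hand-written `margin` heuristic with a trained ranking model while leaving the rest of the loop unchanged.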
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Algorithms and Data Compression
