Querying Easily Flip-flopped Samples for Deep Active Learning
Seong Jin Cho, Gwangsu Kim, Junghyun Lee, Jinwoo Shin, and Chang D. Yoo

TL;DR
This paper introduces the least disagree metric (LDM), a computationally efficient uncertainty measure for deep active learning, demonstrating state-of-the-art results across various datasets and architectures.
Contribution
The paper proposes the LDM as a novel, asymptotically consistent uncertainty measure for deep active learning, with an efficient implementation method.
Findings
LDM-based active learning achieves state-of-the-art performance.
The estimator for LDM is computationally efficient and easy to implement.
Experimental results validate the effectiveness of LDM across datasets and models.
Abstract
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data. One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is. The sample's distance to the decision boundary is a natural measure of predictive uncertainty, but it is often intractable to compute, especially for complex decision boundaries formed in multiclass classification tasks. To address this issue, this paper proposes the least disagree metric (LDM), defined as the smallest probability of disagreement of the predicted label, and an estimator for LDM proven to be asymptotically consistent under mild assumptions. The estimator is computationally efficient and can be easily implemented for deep learning models using parameter…
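To make the definition above concrete, here is a minimal Monte Carlo sketch of an LDM-style estimate. It is an illustration under stated assumptions, not the authors' implementation: the hypothesis class is a toy binary linear classifier, candidate hypotheses are drawn by adding Gaussian noise to the weights (the abstract's own description of the estimator is cut off, so this sampling scheme is assumed), and the disagreement probability is approximated by the fraction of a finite pool on which two hypotheses differ. The names `predict`, `estimate_ldm`, `sigmas`, and `n_samples` are hypothetical.

```python
import random


def predict(w, x):
    """Toy binary linear classifier: label 1 if w.x > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0


def estimate_ldm(w, x, pool, sigmas=(0.01, 0.05, 0.1, 0.5, 1.0),
                 n_samples=200, seed=0):
    """Monte Carlo sketch of the LDM of sample x under the hypothesis w.

    Assumption: candidate hypotheses are Gaussian perturbations of w.
    Among the candidates that flip the predicted label of x, return the
    smallest empirical disagreement with w over the pool (1.0 if no
    candidate flips x).
    """
    rng = random.Random(seed)
    base_x = predict(w, x)
    base_pool = [predict(w, z) for z in pool]
    ldm = 1.0
    for sigma in sigmas:
        for _ in range(n_samples):
            # Draw a perturbed hypothesis around the current weights.
            w_h = [wi + rng.gauss(0.0, sigma) for wi in w]
            if predict(w_h, x) == base_x:
                continue  # this hypothesis does not disagree at x
            # Empirical probability that w_h and w disagree on the pool.
            disagreements = sum(
                predict(w_h, z) != b for z, b in zip(pool, base_pool)
            )
            ldm = min(ldm, disagreements / len(pool))
    return ldm


# Usage: a point close to the decision boundary x1 = 0 versus one far from it.
pool_rng = random.Random(1)
pool = [[pool_rng.uniform(-1, 1), pool_rng.uniform(-1, 1)] for _ in range(500)]
w = [1.0, 0.0]
print("near boundary:", estimate_ldm(w, [0.01, 1.0], pool))
print("far from boundary:", estimate_ldm(w, [1.0, 0.0], pool))
```

In this toy setting the point near the boundary can be flipped by a hypothesis that barely differs from `w`, so its estimate is small, whereas flipping the far point requires a hypothesis that disagrees with `w` on a large share of the pool. An uncertainty-based query strategy would prefer samples with the smaller value.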
Peer Reviews
Decision: ICLR 2024 poster
* At least in some parts of the paper, the flow is very good, and theorems and conclusions naturally lead to the next part.
* The evidence for diversity in LDM-based active learning, as discussed in Appendix E.3, is extraordinary. Through an intuitive first example and then a real-world example (MNIST), the motivation is very clear. If possible, this should definitely be in the main paper to give readers better intuitions.
* Ablation studies are presented whenever necessary to justify choices ma
* Minor typo: Page 1, last paragraph: flips-flopped --> flip-flopped
* Section 2.2 is rather rushed in its presentation, and details are not expanded on. For example, it is unclear why $\mathcal{H}$ needs to be a Polish space (for instance, why second countability is necessary). In Theorem 1, $f$ is undefined. In Assumption 3, it seems that the phrase "that is monotone decreasing in the first argument" refers to $\alpha$, but that is not fully clear.
* Similarly, some other details are pr
* The quality of the exposition is high in general. The concepts and ideas are presented by combining a specific and accurate definition with an intuition about their meaning. It is easy to follow the flow of the paper, and sections are well organized in a natural manner.
* The experimental evaluation is a comprehensive one, including a wide range of baselines, datasets and architectures. Results are analyzed in a rigorous way, including statistical tests and popular active learning metric
* I think there exists an important gap between the theoretical description of LDM (its definition in Section 2.1 and its estimator in Section 2.2) and how it is empirically evaluated (Section 2.3). In Section 2.3, the "motivation" paragraph includes several sentences to justify the procedure that the authors are going to follow to empirically evaluate LDM, but these sentences are somewhat "generic"/"loose", and there is no guarantee that the hypotheses of Section 2.2 are satisfied. Taking this
(1) The paper maintains a high-quality presentation. The measure, proof of asymptotic consistency, and algorithms are clearly presented.
(2) Extensive experiments on 3 OpenML datasets and 6 benchmark image datasets.
(1) Many baseline models are not considered in the paper's experiments, e.g., SAAL, Cluster-Margin, Similar, and [4].
(2) Computational cost analysis is lacking, including the cost of LDM estimation and of LDM-S. What is the relationship to batch size, ensemble size, M, etc.? The computational cost seems to be comparable to BADGE, which has quadratic complexity in the batch size.
(3) The authors provide no analysis of the relatedness of LDM to active learner performance in the paper's setting.
(4) Careful discu
Taxonomy
Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
Methods: Balanced Selection
