TL;DR
This paper adapts the Holistic Uncertainty Estimation (HolUE) method to open-set text classification, improving the rejection of unknown samples across authorship, intent, and topic datasets.
Contribution
It adapts the HolUE method to the text domain and proposes a new benchmark for open-set text classification, demonstrating significant performance gains.
Findings
HolUE achieves a 40-365% improvement in Prediction Rejection Ratio (PRR) over the baseline.
Extensive experiments on authorship, intent, and topic datasets validate the approach.
Code and protocols are publicly available at the provided GitHub URL.
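For readers unfamiliar with the Prediction Rejection Ratio, the sketch below shows one common rejection-curve formulation of the metric. It is an illustration only: the function names are hypothetical, and the paper's exact evaluation protocol may differ. The idea is to reject samples in order of decreasing uncertainty, track the error that remains, and compare the resulting area with that of random rejection and of an oracle that rejects the errors first.

```python
import numpy as np


def rejection_curve_area(errors, order):
    """Area under the curve of remaining error vs. fraction rejected.

    `order` lists sample indices from first rejected to last rejected.
    Assumption: a rejected sample no longer contributes any error.
    """
    errors = np.asarray(errors, dtype=float)
    n = len(errors)
    # remaining[k] = total error left after rejecting the first k samples
    rejected = np.concatenate(([0.0], np.cumsum(errors[order])))
    curve = (errors.sum() - rejected) / n
    step = 1.0 / n
    # Trapezoidal rule on a uniform grid of rejection fractions
    return float(np.sum((curve[1:] + curve[:-1]) / 2.0) * step)


def prediction_rejection_ratio(errors, uncertainty):
    """PRR = (A_random - A_uncertainty) / (A_random - A_oracle).

    1.0 means the uncertainty ranks errors perfectly, 0.0 means it is no
    better than random rejection, negative values mean it is worse.
    """
    errors = np.asarray(errors, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    # Random rejection: error falls linearly from the base rate to zero
    area_random = errors.mean() / 2.0
    area_unc = rejection_curve_area(
        errors, np.argsort(-uncertainty, kind="stable"))
    area_oracle = rejection_curve_area(
        errors, np.argsort(-errors, kind="stable"))
    denom = area_random - area_oracle
    if denom == 0.0:
        # Degenerate case: no errors at all, or every sample is an error
        return float("nan")
    return (area_random - area_unc) / denom


# Toy data: 1 marks a wrong prediction, 0 a correct one
errors = [1, 0, 1, 0]
good_scores = [0.9, 0.1, 0.8, 0.2]   # high uncertainty on the errors
bad_scores = [0.1, 0.9, 0.2, 0.8]    # high uncertainty on correct samples
print(prediction_rejection_ratio(errors, good_scores))
print(prediction_rejection_ratio(errors, bad_scores))
```

Under this formulation, a relative improvement such as the one reported above would correspond to the ratio of the method's PRR to the baseline's PRR on the same test protocol.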
Abstract
Accurate uncertainty estimation is essential for building robust and trustworthy recognition systems. In this paper, we consider the open-set text classification (OSTC) task and uncertainty estimation for it. In OSTC, a text sample should either be classified as one of the existing classes or rejected as unknown. To account for the different uncertainty types encountered in OSTC, we adapt the Holistic Uncertainty Estimation (HolUE) method to the text domain. Our approach addresses two major causes of prediction errors in text recognition systems: text uncertainty, which stems from ill-formulated queries, and gallery uncertainty, which is related to the ambiguity of the data distribution. By capturing these sources, it becomes possible to predict when the system will make a recognition error. We propose a new OSTC benchmark and conduct extensive experiments on a wide range of data, utilizing the…
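To make the classify-or-reject setup concrete, here is a minimal sketch of a plain similarity-threshold baseline for open-set classification. It is not the HolUE method: it assumes each known class is represented by a single prototype embedding in the gallery, and the names `open_set_predict`, `UNKNOWN`, and `threshold` are hypothetical choices made for illustration.

```python
import numpy as np

UNKNOWN = -1  # hypothetical label used for rejected samples


def open_set_predict(embeddings, prototypes, threshold):
    """Assign each query to its nearest class prototype or reject it.

    embeddings: (n_queries, dim) query text embeddings
    prototypes: (n_classes, dim) one gallery embedding per known class
    threshold:  minimum cosine similarity required to accept a prediction
    """
    embeddings = np.asarray(embeddings, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    # L2-normalise so the dot product equals cosine similarity
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = e @ p.T
    best_class = sims.argmax(axis=1)
    best_sim = sims.max(axis=1)
    # Reject queries that are not close enough to any known class
    labels = np.where(best_sim >= threshold, best_class, UNKNOWN)
    # Simple uncertainty score: low similarity means high uncertainty
    uncertainty = 1.0 - best_sim
    return labels, uncertainty


# Toy gallery with two known classes and three queries
prototypes = [[1.0, 0.0], [0.0, 1.0]]
queries = [[2.0, 0.1], [0.1, 3.0], [1.0, 1.0]]
labels, uncertainty = open_set_predict(queries, prototypes, threshold=0.9)
print(labels)
print(uncertainty)
```

In this toy example the third query lies equally far from both prototypes, so its best similarity falls below the threshold and it is rejected as unknown. A score like `uncertainty` here reflects only distance to the gallery; the abstract's point is that query quality and gallery ambiguity are distinct sources that a single score of this kind does not separate.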
