Investigating Selective Prediction Approaches Across Several Tasks in IID, OOD, and Adversarial Settings

Neeraj Varshney, Swaroop Mishra, and Chitta Baral

arXiv:2203.00211·cs.CL·March 2, 2022

Open Access

TL;DR

This study systematically evaluates selective prediction methods on 17 NLP datasets in IID, OOD, and adversarial settings. It finds that none of them consistently and considerably outperforms the simple MaxProb baseline in all three settings, and that results on one task do not carry over to others, which makes cross-task evaluation necessary.

Contribution

It provides a large-scale, comprehensive comparison of selective prediction approaches across multiple NLP tasks and settings, emphasizing the importance of cross-task evaluation for reliable assessment.

Findings

01. Most approaches do not consistently outperform MaxProb.

02. Performance varies significantly across tasks and settings.

03. Evaluation across multiple tasks is essential for reliable assessment.

Abstract

In order to equip NLP systems with selective prediction capability, several task-specific approaches have been proposed. However, which approaches work best across tasks or even if they consistently outperform the simplest baseline 'MaxProb' remains to be explored. To this end, we systematically study 'selective prediction' in a large-scale setup of 17 datasets across several NLP tasks. Through comprehensive experiments under in-domain (IID), out-of-domain (OOD), and adversarial (ADV) settings, we show that despite leveraging additional resources (held-out data/computation), none of the existing approaches consistently and considerably outperforms MaxProb in all three settings. Furthermore, their performance does not translate well across tasks. For instance, Monte-Carlo Dropout outperforms all other approaches on Duplicate Detection datasets but does not fare well on NLI datasets,…
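
To make the baseline concrete, the following is a minimal sketch of MaxProb-style selective prediction: the model's highest softmax probability is used as its confidence, and the system abstains when that confidence falls below a threshold. The function names, the threshold values, and the toy logits are illustrative assumptions and are not taken from the paper; coverage (fraction of inputs answered) and risk (error rate on answered inputs) are the usual quantities reported for selective prediction.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def maxprob_predict(logits, threshold):
    """Return (label, confidence); label is None when the model abstains.

    Confidence is the maximum softmax probability (the MaxProb idea).
    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    probs = softmax(logits)
    confidence = max(probs)
    label = probs.index(confidence)
    if confidence < threshold:
        return None, confidence
    return label, confidence


def coverage_and_risk(batch_logits, gold_labels, threshold):
    """Coverage = answered / total; risk = errors / answered."""
    answered = 0
    errors = 0
    for logits, gold in zip(batch_logits, gold_labels):
        label, _ = maxprob_predict(logits, threshold)
        if label is None:
            continue  # abstained: counts against coverage, not risk
        answered += 1
        errors += int(label != gold)
    coverage = answered / len(gold_labels) if gold_labels else 0.0
    risk = errors / answered if answered else 0.0
    return coverage, risk


if __name__ == "__main__":
    # Toy logits for a 3-class task (made-up numbers, for illustration only).
    toy_logits = [[4.0, 0.0, 0.0], [0.2, 0.1, 0.0], [0.0, 3.0, 0.0]]
    toy_gold = [0, 1, 2]
    for t in (0.0, 0.5, 0.95):
        print(t, coverage_and_risk(toy_logits, toy_gold, t))
```

Raising the threshold trades coverage for lower risk; sweeping it over a range traces out a risk-coverage curve, which is one common way to compare a confidence estimator such as Monte-Carlo Dropout against MaxProb.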

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications

Methods: Dropout