A Survey on Out-of-Distribution Evaluation of Neural NLP Models

Xinzhe Li; Ming Liu; Shang Gao; Wray Buntine

arXiv:2306.15261·cs.CL·October 15, 2024

Open Access

TL;DR

This survey reviews out-of-distribution (OOD) evaluation of neural NLP models. It compares three research lines (adversarial robustness, domain generalization, and dataset biases) and highlights their differences, evaluation methods, challenges, and future opportunities.

Contribution

The survey provides a unified comparison and summary of the three key research lines in OOD evaluation of neural NLP models, an integrated treatment that the existing literature lacked.

Findings

01 Unified framework for OOD evaluation in NLP

02 Comparison of data-generating processes and protocols

03 Identification of challenges and future directions

Abstract

Adversarial robustness, domain generalization and dataset biases are three active lines of research contributing to out-of-distribution (OOD) evaluation on neural NLP models. However, a comprehensive, integrated discussion of the three research lines is still lacking in the literature. In this survey, we 1) compare the three lines of research under a unifying definition; 2) summarize the data-generating processes and evaluation protocols for each line of research; and 3) emphasize the challenges and opportunities for future work.

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Explainable Artificial Intelligence (XAI)