Out-of-Distribution Generalization in Text Classification: Past, Present, and Future

Linyi Yang; Yaoxiao Song; Xuan Ren; Chenyang Lyu; Yidong Wang; Lingqiao Liu; Jindong Wang; Jennifer Foster; Yue Zhang

arXiv:2305.14104·cs.CL·May 24, 2023·1 cites

Open Access

TL;DR

This paper provides a comprehensive review of out-of-distribution generalization challenges, methods, and evaluations in NLP text classification, highlighting gaps and future directions to improve model robustness.

Contribution

It offers the first extensive survey on OOD generalization in NLP text classification, summarizing recent progress and identifying key challenges and future research avenues.

Findings

01

Highlights the importance of robustness to OOD data in NLP

02

Summarizes recent methods and evaluations for OOD generalization

03

Identifies gaps and proposes future research directions

Abstract

Machine learning (ML) systems in natural language processing (NLP) face significant challenges in generalizing to out-of-distribution (OOD) data, where the test distribution differs from the training data distribution. This poses important questions about the robustness of NLP models and their high accuracy, which may be artificially inflated due to their underlying sensitivity to systematic biases. Despite these challenges, there is a lack of comprehensive surveys on the generalization challenge from an OOD perspective in text classification. Therefore, this paper aims to fill this gap by presenting the first comprehensive review of recent progress, methods, and evaluations on this topic. We further discuss the challenges involved and potential future research directions. By providing quick access to existing work, we hope this survey will encourage future research in this area.

Taxonomy

Topics: Topic Modeling · Text and Document Classification Technologies · Machine Learning and Data Classification

Methods: Test