Reliability Testing for Natural Language Processing Systems

Samson Tan; Shafiq Joty; Kathy Baxter; Araz Taeihagh; Gregory A. Bennett; Min-Yen Kan

arXiv:2105.02590·cs.LG·June 2, 2021

TL;DR

This paper argues for reliability testing of NLP systems and proposes a framework that reframes adversarial attacks as tests of fairness and robustness, with the aim of improving accountability and supporting industry standards.

Contribution

The paper introduces a framework for developing reliability tests for NLP systems. It reframes adversarial attacks as a testing tool and emphasizes interdisciplinary collaboration in designing the tests.

Findings

01. Reliability testing can identify fairness and robustness issues.

02. Adversarial attacks can be reframed for reliability assessment.

03. Interdisciplinary collaboration enhances testing effectiveness.

Abstract

Questions of fairness, robustness, and transparency are paramount to address before deploying NLP systems. Central to these concerns is the question of reliability: Can NLP systems reliably treat different demographics fairly and function correctly in diverse and noisy environments? To address this, we argue for the need for reliability testing and contextualize it among existing work on improving accountability. We show how adversarial attacks can be reframed for this goal, via a framework for developing reliability tests. We argue that reliability testing -- with an emphasis on interdisciplinary collaboration -- will enable rigorous and targeted testing, and aid in the enactment and enforcement of industry standards.
