Adversarial Evaluation for Models of Natural Language
Noah A. Smith

TL;DR
This paper introduces an adversarial evaluation framework for NLP models. It aims to define the roles in an evaluation explicitly, encourage earlier error analysis, and improve the assessment of NLP models in general and unsupervised models in particular.
Contribution
It proposes a novel abstract framework for adversarial evaluation in NLP, explicitly defining roles and encouraging comprehensive error analysis to better understand model performance.
Findings
The framework can be instantiated to simulate some familiar intrinsic and extrinsic evaluations
It encourages earlier consideration of error analysis
It makes the adversarial roles in an evaluation explicit and more clearly defined
Abstract
We now have a rich and growing set of modeling tools and algorithms for inducing linguistic structure from text that is less than fully annotated. In this paper, we discuss some of the weaknesses of our current methodology. We present a new abstract framework for evaluating natural language processing (NLP) models in general and unsupervised NLP models in particular. The central idea is to make explicit certain adversarial roles among researchers, so that the different roles in an evaluation are more clearly defined and performers of all roles are offered ways to make measurable contributions to the larger goal. Adopting this approach may help to characterize model successes and failures by encouraging earlier consideration of error analysis. The framework can be instantiated in a variety of ways, simulating some familiar intrinsic and extrinsic evaluations as well as some new…
