Exact Paired-Permutation Testing for Structured Test Statistics

Ran Zmigrod; Tim Vieira; Ryan Cotterell

arXiv:2205.01416 · cs.CL · May 5, 2022

TL;DR

This paper introduces an efficient exact algorithm for the paired-permutation test in NLP, so that significance testing of performance differences between two systems can be both exact and faster than the usual Monte Carlo approximation.

Contribution

The paper presents a novel exact algorithm for paired-permutation testing of structured test statistics, significantly improving speed over Monte Carlo approximations.

Findings

1. The exact algorithm is 10x faster than the Monte Carlo approximation with 20,000 samples on a common dataset.

2. The algorithm runs in O(GN (log GN)(log N)) time, where N is the dataset size and G is the range of the test statistic.

3. It provides exact significance testing for performance differences between NLP systems.
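
For reference, the Monte Carlo approximation that the exact algorithm is compared against can be sketched roughly as follows. This is a minimal illustration, not code from the paper or its repository: it assumes an additively decomposable statistic (the difference of summed per-instance scores, such as correct/incorrect indicators), and the function name, parameters, and seed handling are hypothetical.

```python
import random


def mc_paired_permutation_test(scores_a, scores_b, num_samples=20000, seed=0):
    """Approximate the two-sided paired-permutation p-value by sampling.

    Assumption: the test statistic is |sum(scores_a) - sum(scores_b)|, so
    swapping the two systems' outputs on instance i flips the sign of the
    per-instance difference d_i = a_i - b_i.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))

    hits = 0
    for _ in range(num_samples):
        # Under the null hypothesis, each pair is swapped with probability 1/2.
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total) >= observed:
            hits += 1
    # Fraction of sampled swaps at least as extreme as the observed statistic.
    return hits / num_samples
```

The estimate carries sampling error that shrinks only as `num_samples` grows, which is the cost an exact algorithm avoids.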

Abstract

Significance testing -- especially the paired-permutation test -- has played a vital role in developing NLP systems to provide confidence that the difference in performance between two systems (i.e., the test statistic) is not due to luck. However, practitioners rely on Monte Carlo approximation to perform this test due to a lack of a suitable exact algorithm. In this paper, we provide an efficient exact algorithm for the paired-permutation test for a family of structured test statistics. Our algorithm runs in $O(GN (\log GN)(\log N))$ time where $N$ is the dataset size and $G$ is the range of the test statistic. We found that our exact algorithm was $10\times$ faster than the Monte Carlo approximation with $20000$ samples on a common dataset.
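
To make the idea of an exact test concrete, here is a small sketch for the simplest case only: an additive, integer-valued statistic such as the difference in the number of correct predictions. It is a plain dynamic program over the distribution of the swapped sum, assumed here for illustration; it is not the paper's algorithm, which targets a broader family of structured test statistics and achieves the stated running time. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def exact_paired_permutation_test(scores_a, scores_b):
    """Exact two-sided paired-permutation p-value for an additive statistic.

    Assumption: scores are integers and the statistic is
    |sum(scores_a) - sum(scores_b)|. Swapping instance i flips the sign of
    d_i = a_i - b_i, so we count how many of the 2^N sign assignments give
    a sum at least as extreme as the observed one.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))

    # dist[t] = number of sign assignments over the instances seen so far
    # whose signed differences sum to t.
    dist = Counter({0: 1})
    for d in diffs:
        nxt = Counter()
        for total, count in dist.items():
            nxt[total + d] += count  # keep the pair as is
            nxt[total - d] += count  # swap the pair
        dist = nxt

    extreme = sum(c for t, c in dist.items() if abs(t) >= observed)
    return Fraction(extreme, 2 ** len(diffs))
```

For example, if system A is correct on four instances where system B is wrong, only the two assignments with all signs equal reach the observed difference, so the call `exact_paired_permutation_test([1, 1, 1, 1], [0, 0, 0, 0])` returns `Fraction(1, 8)`. The number of distinct sums tracked per step is bounded by the range of the statistic, which is why the range $G$ appears alongside $N$ in running-time bounds for this kind of test.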

Code & Models

Repositories

rycolab/paired-perm-test (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms