Exact Paired-Permutation Testing for Structured Test Statistics
Ran Zmigrod, Tim Vieira, Ryan Cotterell

TL;DR
This paper introduces an efficient exact algorithm for paired-permutation tests in NLP, enabling faster and more accurate significance testing of system performance differences compared to traditional Monte Carlo methods.
Contribution
The paper presents a novel exact algorithm for paired-permutation testing of structured test statistics, significantly improving speed over Monte Carlo approximations.
Findings
The exact algorithm is 10x faster than the Monte Carlo approximation with 20,000 samples on a common dataset
Algorithm runs in O(GN (log GN)(log N)) time, scalable to large datasets
Provides reliable significance testing for NLP system performance differences
Abstract
Significance testing -- especially the paired-permutation test -- has played a vital role in developing NLP systems to provide confidence that the difference in performance between two systems (i.e., the test statistic) is not due to luck. However, practitioners rely on Monte Carlo approximation to perform this test due to a lack of a suitable exact algorithm. In this paper, we provide an efficient exact algorithm for the paired-permutation test for a family of structured test statistics. Our algorithm runs in O(GN (log GN)(log N)) time, where N is the dataset size and G is the range of the test statistic. We found that our exact algorithm was 10x faster than the Monte Carlo approximation with 20,000 samples on a common dataset.
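To make the contrast between the exact test and the Monte Carlo approximation concrete, here is a minimal Python sketch. It assumes the simplest additive statistic, the absolute difference in total integer scores between two systems, and computes the exact p-value with a plain dynamic program over all swap patterns. This is an illustration only, not the paper's algorithm: it does not attempt to reach the O(GN (log GN)(log N)) bound, and the function names and example scores are hypothetical.

```python
import random
from collections import defaultdict
from fractions import Fraction


def exact_paired_permutation_pvalue(a, b):
    """Exact two-sided p-value for the statistic |sum(a) - sum(b)|.

    Assumes integer per-instance scores. Under the null hypothesis each
    pair (a[i], b[i]) is swapped independently with probability 1/2,
    which flips the sign of the difference a[i] - b[i].
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    # counts[s] = number of swap patterns whose signed total equals s
    counts = {0: 1}
    for d in diffs:
        nxt = defaultdict(int)
        for s, c in counts.items():
            nxt[s + d] += c  # pair kept as is
            nxt[s - d] += c  # pair swapped
        counts = nxt
    extreme = sum(c for s, c in counts.items() if abs(s) >= observed)
    return Fraction(extreme, 2 ** len(diffs))


def monte_carlo_pvalue(a, b, num_samples=20000, seed=0):
    """Monte Carlo estimate of the same p-value from random swap patterns."""
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(num_samples):
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        hits += abs(total) >= observed
    return hits / num_samples


# Hypothetical per-instance scores for two systems on three test instances.
scores_a = [3, 2, 5]
scores_b = [1, 1, 1]
print(exact_paired_permutation_pvalue(scores_a, scores_b))
print(monte_carlo_pvalue(scores_a, scores_b))
```

The dynamic program keeps one entry per reachable signed total, so its table is bounded by the range of the statistic rather than by the 2^N swap patterns. The exact result is a rational number with no sampling noise, while the Monte Carlo estimate varies with the seed and the number of samples.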
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Machine Learning and Algorithms
