Better Aggregation in Test-Time Augmentation

Divya Shanmugam; Davis Blalock; Guha Balakrishnan; John Guttag

arXiv:2011.11156·cs.CV·October 13, 2021

TL;DR

This paper analyzes the limitations of simple averaging in test-time augmentation for image classification and introduces a learning-based aggregation method that improves accuracy across various models and datasets.

Contribution

The paper provides experimental analyses of when simple averaging is suboptimal and proposes a learning-based aggregation method for test-time augmentation.

Findings

01 Learning-based aggregation outperforms simple averaging.

02 Test-time augmentation can sometimes reduce overall accuracy.

03 The method is effective across multiple models and datasets.
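
One way to make the second finding concrete is to count how many predictions test-time augmentation flips in each direction. The sketch below is illustrative only: the function name `count_flips`, the labels "corrections" and "corruptions", and the toy data are assumptions chosen for this example, not results or code from the paper.

```python
def count_flips(labels, base_preds, tta_preds):
    """Count predictions that test-time augmentation (TTA) flips.

    corrections: wrong without TTA, right with TTA.
    corruptions: right without TTA, wrong with TTA.
    """
    corrections = 0
    corruptions = 0
    for y, base, tta in zip(labels, base_preds, tta_preds):
        if base != y and tta == y:
            corrections += 1
        elif base == y and tta != y:
            corruptions += 1
    return corrections, corruptions


def accuracy(labels, preds):
    """Fraction of predictions that match the labels."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)


# Made-up toy data: six inputs, three classes.
labels     = [0, 1, 1, 0, 2, 2]
base_preds = [0, 0, 1, 1, 2, 0]  # predictions on the original inputs
tta_preds  = [1, 1, 1, 0, 2, 2]  # predictions after aggregating augmentations

corrections, corruptions = count_flips(labels, base_preds, tta_preds)
base_acc = accuracy(labels, base_preds)
tta_acc = accuracy(labels, tta_preds)
```

In this toy data accuracy rises with TTA, yet one prediction that was originally correct becomes incorrect, which a single accuracy number would hide.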

Abstract

Test-time augmentation -- the aggregation of predictions across transformed versions of a test input -- is a common practice in image classification. Traditionally, predictions are combined using a simple average. In this paper, we present 1) experimental analyses that shed light on cases in which the simple average is suboptimal and 2) a method to address these shortcomings. A key finding is that even when test-time augmentation produces a net improvement in accuracy, it can change many correct predictions into incorrect predictions. We delve into when and why test-time augmentation changes a prediction from being correct to incorrect and vice versa. Building on these insights, we present a learning-based method for aggregating test-time augmentations. Experiments across a diverse set of models, datasets, and augmentations show that our method delivers consistent improvements over…
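
The difference between a simple average and a weighted aggregation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' formulation: it assumes one scalar weight per augmentation, and the weights and probability vectors below are hand-set placeholders. In a learning-based scheme, such weights would be fit on held-out data instead of chosen by hand.

```python
def average_aggregate(probs):
    """Simple average of per-augmentation class-probability vectors."""
    n_aug = len(probs)
    n_cls = len(probs[0])
    return [sum(p[c] for p in probs) / n_aug for c in range(n_cls)]


def weighted_aggregate(probs, weights):
    """Weighted sum with one weight per augmentation (assumed form)."""
    n_cls = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs)) for c in range(n_cls)]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up predictions for one test input over two classes.
probs = [
    [0.6, 0.4],  # original image
    [0.3, 0.7],  # hypothetical augmentation A
    [0.2, 0.8],  # hypothetical augmentation B
]

# Placeholder weights that trust the original image most.
weights = [0.8, 0.1, 0.1]

avg_pred = argmax(average_aggregate(probs))
weighted_pred = argmax(weighted_aggregate(probs, weights))
```

Here the simple average predicts class 1 while the weighted aggregation predicts class 0, showing how down-weighting unreliable augmentations can change the final prediction.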
