Learning to Resize Images for Computer Vision Tasks

Hossein Talebi; Peyman Milanfar

arXiv:2103.09950 · cs.CV · August 19, 2021

TL;DR

This paper introduces CNN-based image resizers that are trained jointly with vision models. Replacing traditional resizing methods with these learned resizers improves classification accuracy on ImageNet.

Contribution

The paper proposes a novel CNN-based learned image resizer that enhances task performance over traditional resizers, and demonstrates its effectiveness across multiple vision tasks and models.

Findings

01. Learned resizers outperform traditional bilinear/bicubic resizers in classification accuracy.

02. Joint training of resizer and vision model leads to consistent performance improvements.

03. Learned resizers are adaptable to different models and tasks, including fine-tuning for other vision applications.

Abstract

For all the ways convolutional neural nets have revolutionized computer vision in recent years, one important aspect has received surprisingly little attention: the effect of image size on the accuracy of tasks being trained for. Typically, to be efficient, the input images are resized to a relatively small spatial resolution (e.g. 224x224), and both training and inference are carried out at this resolution. The actual mechanism for this re-scaling has been an afterthought: Namely, off-the-shelf image resizers such as bilinear and bicubic are commonly used in most machine learning software frameworks. But do these resizers limit the on task performance of the trained networks? The answer is yes. Indeed, we show that the typical linear resizer can be replaced with learned resizers that can substantially improve performance. Importantly, while the classical resizers typically result in…
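To make the idea of replacing a fixed resizer with a learned one concrete, here is a minimal pure-Python sketch. It is not the authors' architecture: it assumes a residual design in which a standard bilinear resize is corrected by a single learnable 3x3 filter, a hypothetical stand-in for a CNN. The function names `bilinear_resize`, `conv3x3`, and `learned_resize` are illustrative, and joint training with the downstream model is not shown.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2D grayscale image (list of rows) with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map the output pixel center to input coordinates (half-pixel convention),
        # clamped to the valid range.
        y = min(max((i + 0.5) * in_h / out_h - 0.5, 0.0), in_h - 1.0)
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) * in_w / out_w - 0.5, 0.0), in_w - 1.0)
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def conv3x3(img, kernel):
    """Apply a 3x3 filter with zero padding; output has the same size as img."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += img[ii][jj] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def learned_resize(img, out_h, out_w, kernel):
    """Bilinear baseline plus a residual from a learnable filter (assumed design)."""
    base = bilinear_resize(img, out_h, out_w)
    residual = conv3x3(base, kernel)
    return [[b + r for b, r in zip(b_row, r_row)]
            for b_row, r_row in zip(base, residual)]


# Usage: with an all-zero kernel the learned resizer equals the bilinear baseline.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
zero_kernel = [[0.0] * 3 for _ in range(3)]
small = learned_resize(image, 2, 2, zero_kernel)
```

In this sketch the kernel weights are the only learnable parameters. In a jointly trained setup they would be updated by the same task loss as the vision model, so the resizer is optimized for downstream accuracy rather than for visual quality.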
