Albumentations: fast and flexible image augmentations

Alexander Buslaev; Alex Parinov; Eugene Khvedchenya; Vladimir I. Iglovikov; Alexandr A. Kalinin

arXiv:1809.06839·cs.CV·February 27, 2020

TL;DR

Albumentations is a fast, flexible, and easy-to-use image augmentation library that offers a wide range of transformations, improving training efficiency and performance in computer vision tasks.

Contribution

The paper introduces Albumentations, a new image augmentation library that is faster and more versatile than existing tools, with easy integration and extensive transformation options.

Findings

01. Albumentations outperforms existing augmentation tools in speed.

02. It provides a wide variety of image transformations.

03. The library is easy to integrate into deep learning workflows.

Abstract

Data augmentation is a commonly used technique for increasing both the size and the diversity of labeled training sets by leveraging input transformations that preserve output labels. In the computer vision domain, image augmentations have become a common implicit regularization technique to combat overfitting in deep convolutional neural networks and are ubiquitously used to improve performance. While most deep learning frameworks implement basic image transformations, the list is typically limited to some variations and combinations of flipping, rotating, scaling, and cropping. Moreover, image processing speed varies across existing tools for image augmentation. We present Albumentations, a fast and flexible library for image augmentations with a wide variety of image transform operations available, which is also an easy-to-use wrapper around other augmentation libraries. We provide examples of…
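
To make the idea of label-preserving transformations concrete, here is a minimal plain-Python sketch of the basic operations the abstract lists (flipping and cropping), chained so that each one fires with its own probability. It is not the Albumentations API: the helpers `hflip`, `vflip`, `center_crop`, and the `Compose` class below are hypothetical names written only for this illustration, and the image is a small nested list rather than a real array.

```python
import random


def hflip(image):
    """Mirror every row left-to-right (horizontal flip)."""
    return [row[::-1] for row in image]


def vflip(image):
    """Reverse the row order (vertical flip)."""
    return image[::-1]


def center_crop(image, height, width):
    """Cut a height x width window from the middle of the image."""
    top = (len(image) - height) // 2
    left = (len(image[0]) - width) // 2
    return [row[left:left + width] for row in image[top:top + height]]


class Compose:
    """Hypothetical helper: apply (transform, probability) pairs in order."""

    def __init__(self, steps, seed=None):
        self.steps = steps
        self.rng = random.Random(seed)  # private RNG so runs can be repeated

    def __call__(self, image):
        for transform, p in self.steps:
            # random() lies in [0, 1), so p=1.0 always fires and p=0.0 never does
            if self.rng.random() < p:
                image = transform(image)
        return image


# A toy 4x4 single-channel "image"
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

pipeline = Compose(
    [
        (hflip, 0.5),
        (vflip, 0.5),
        (lambda img: center_crop(img, 2, 2), 1.0),
    ],
    seed=0,
)
augmented = pipeline(image)
```

Each call to `pipeline` may return a different variant of the same image, while a class label attached to that image stays valid. That is the sense in which these transformations increase the diversity of a labeled training set without new annotation.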
