MGAug: Multimodal Geometric Augmentation in Latent Spaces of Image Deformations
Tonmoy Hossain, Miaomiao Zhang

TL;DR
MGAug introduces a novel multimodal latent space approach for geometric image augmentation, improving data diversity and model performance in classification and segmentation tasks.
Contribution
This paper presents MGAug, a method that generates diverse geometric augmentations by sampling from a multimodal latent space of image deformations, learned with a variational autoencoder (VAE) under a Gaussian mixture prior.
Findings
Outperforms state-of-the-art augmentation methods
Improves accuracy in 2D classification tasks
Enhances segmentation results on 3D MRI data
Abstract
Geometric transformations have been widely used to augment the size of training images. Existing methods often assume a unimodal distribution of the underlying transformations between images, which limits their power when data with multimodal distributions occur. In this paper, we propose a novel model, Multimodal Geometric Augmentation (MGAug), that for the first time generates augmenting transformations in a multimodal latent space of geometric deformations. To achieve this, we first develop a deep network that embeds the learning of latent geometric spaces of diffeomorphic transformations (a.k.a. diffeomorphisms) in a variational autoencoder (VAE). A mixture of multivariate Gaussians is formulated in the tangent space of diffeomorphisms and serves as a prior to approximate the hidden distribution of image transformations. We then augment the original training dataset by deforming…
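The sampling idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation, and everything named in it is an assumption made for illustration: the helper names, the mixture weights, means, and standard deviations, and the fixed sinusoidal basis that stands in for a trained VAE decoder. The latent code is drawn from a diagonal Gaussian mixture and mapped to a velocity field (an element of the tangent space). The velocity field is treated as stationary and integrated with scaling and squaring, a common substitute chosen here for brevity, and the resulting deformation warps the image.

```python
import numpy as np

def sample_gmm_latent(rng, weights, means, stds):
    """Draw one latent code from a diagonal Gaussian mixture (hypothetical prior)."""
    k = rng.choice(len(weights), p=weights)
    return means[k] + stds[k] * rng.standard_normal(means.shape[1]), k

def make_grid(h, w):
    """Identity coordinate grid of shape (2, h, w), ordered (row, column)."""
    return np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij")).astype(float)

def bilinear_sample(field, coords):
    """Sample a 2D array at float coordinates (2, h, w), clamping at the border."""
    h, w = field.shape
    y = np.clip(coords[0], 0, h - 1)
    x = np.clip(coords[1], 0, w - 1)
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * field[y0, x0] + wx * field[y0, x1]
    bottom = (1 - wx) * field[y1, x0] + wx * field[y1, x1]
    return (1 - wy) * top + wy * bottom

def toy_decoder_basis(latent_dim, h, w):
    """Smooth sinusoidal basis; a stand-in for a trained decoder, not a real one."""
    yy, xx = np.meshgrid(np.linspace(0, np.pi, h), np.linspace(0, np.pi, w), indexing="ij")
    basis = np.zeros((latent_dim, 2, h, w))
    for i in range(latent_dim):
        freq = i // 2 + 1
        basis[i, i % 2] = np.sin(freq * yy) * np.sin(freq * xx)
    return basis

def integrate_velocity(v, steps=6):
    """Scaling and squaring for a stationary velocity field v of shape (2, h, w).

    Returns a displacement u so that the deformation is phi(x) = x + u(x).
    """
    grid = make_grid(*v.shape[1:])
    u = v / (2 ** steps)
    for _ in range(steps):
        # Compose the deformation with itself: u(x) <- u(x) + u(x + u(x)).
        u = u + np.stack([bilinear_sample(u[c], grid + u) for c in range(2)])
    return u

def warp(image, u):
    """Deform an image by resampling it at x + u(x)."""
    return bilinear_sample(image, make_grid(*image.shape) + u)

rng = np.random.default_rng(0)
h = w = 32
latent_dim = 4
weights = np.array([0.5, 0.5])                      # assumed two-mode prior
means = np.array([[2.0, 0.0, 0.0, 0.0], [0.0, -2.0, 0.0, 0.0]])
stds = np.full((2, latent_dim), 0.3)
basis = toy_decoder_basis(latent_dim, h, w)

image = np.zeros((h, w))
image[10:22, 10:22] = 1.0                           # simple square as a test image

augmented = []
for _ in range(4):
    z, k = sample_gmm_latent(rng, weights, means, stds)
    v = np.tensordot(z, basis, axes=1)              # velocity field, shape (2, h, w)
    augmented.append(warp(image, integrate_velocity(v)))
```

Each entry of `augmented` is a deformed copy of `image`. Because the two assumed mixture components push the latent code in different directions, the samples fall into visibly different deformation modes, which is the behavior a unimodal prior cannot reproduce.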
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · AI in cancer detection · Medical Image Segmentation Techniques
