EMIXER: End-to-end Multimodal X-ray Generation via Self-supervision

Siddharth Biswal; Peiye Zhuang; Ayis Pyrros; Nasir Siddiqui; Sanmi Koyejo; Jimeng Sun

arXiv:2007.05597·eess.IV·January 19, 2021·1 cites

Open Access

TL;DR

EMIXER is an end-to-end multimodal generative model that synthesizes X-ray images and reports conditioned on diagnosis labels, leveraging self-supervision to improve clinical data augmentation and machine learning tasks.

Contribution

This paper introduces EMIXER, a novel multimodal generative adversarial network that jointly synthesizes X-ray images and reports with self-supervision, enhancing clinical data augmentation.

Findings

01 Synthetic data improves COVID-19 X-ray classification by 5.94%

02 Generated images and reports are validated by radiologists

03 Data augmentation boosts report generation and classification performance

Abstract

Deep generative models have enabled the automated synthesis of high-quality data for diverse applications. However, the most effective generative models are specialized to data from a single domain (e.g., images or text). Real-world applications such as healthcare require multi-modal data from multiple domains (e.g., both images and corresponding text), which are difficult to acquire due to limited availability and privacy concerns and are much harder to synthesize. To tackle this joint synthesis challenge, we propose an End-to-end MultImodal X-ray genERative model (EMIXER) for jointly synthesizing X-ray images and corresponding free-text reports, all conditional on diagnosis labels. EMIXER is a conditional generative adversarial model that works by 1) generating an image based on a label, 2) encoding the image to a hidden embedding, 3) producing the corresponding text via a hierarchical decoder…
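The data flow of the three steps listed in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the tiny vocabulary, and the arithmetic standing in for neural networks are hypothetical, and adversarial training is omitted entirely. It only shows how a label could pass through a generator, an encoder, and a two-level (sentence, then word) decoder.

```python
from __future__ import annotations

import random

# Hypothetical toy vocabulary; a real model would learn its own.
VOCAB = ["lungs", "clear", "opacity", "heart", "normal", "effusion", "no", "acute"]


def generate_image(label: int, rng: random.Random, size: int = 4) -> list[list[float]]:
    """Step 1 stand-in: label-conditioned generator (noise shifted by the label)."""
    return [[rng.random() + label for _ in range(size)] for _ in range(size)]


def encode_image(image: list[list[float]]) -> list[float]:
    """Step 2 stand-in: encoder that reduces the image to one mean value per row."""
    return [sum(row) / len(row) for row in image]


def decode_report(embedding: list[float], n_sentences: int = 2, n_words: int = 3) -> list[str]:
    """Step 3 stand-in: hierarchical decoder.

    The sentence level derives one topic value per sentence from the embedding;
    the word level expands each topic value into a fixed number of words.
    """
    sentences = []
    for s in range(n_sentences):
        topic = sum(embedding) * (s + 1)  # sentence-level state
        words = [VOCAB[int(topic * (w + 1) * 10) % len(VOCAB)] for w in range(n_words)]
        sentences.append(" ".join(words))
    return sentences


def synthesize(label: int, seed: int = 0):
    """Chain the three steps: label -> image -> embedding -> report."""
    rng = random.Random(seed)
    image = generate_image(label, rng)
    embedding = encode_image(image)
    report = decode_report(embedding)
    return image, embedding, report


image, embedding, report = synthesize(label=1)
```

Because the report is decoded from the embedding of the generated image, rather than from the label directly, the text in this sketch depends on the image content, which mirrors the ordering of the steps in the abstract.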

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Multimodal Machine Learning Applications