Training on Synthetic Data Beats Real Data in Multimodal Relation Extraction
Zilin Du, Haoxin Li, Xu Guo, Boyang Li

TL;DR
This paper introduces MI2RAGE, a novel method for training multimodal relation extraction models using only unimodal data and synthetic generation, achieving superior performance over models trained on real data.
Contribution
The paper proposes MI2RAGE, a new approach that enhances synthetic data diversity and label retention for effective multimodal relation extraction training.
Findings
Synthetic data training improves F1 by over 24% with text and 26% with images.
Model trained on synthetic images surpasses state-of-the-art real-data models by 3.76% F1.
Chained Cross-modal Generation boosts data diversity in synthetic datasets.
Abstract
The task of multimodal relation extraction has attracted significant research attention, but progress is constrained by the scarcity of available training data. One natural thought is to extend existing datasets with cross-modal generative models. In this paper, we consider a novel problem setting, where only unimodal data, either text or image, are available during training. We aim to train a multimodal classifier from synthetic data that performs well on real multimodal test data. However, training with synthetic data suffers from two obstacles: lack of data diversity and label information loss. To alleviate these issues, we propose Mutual Information-aware Multimodal Iterated Relational dAta GEneration (MI2RAGE), which applies Chained Cross-modal Generation (CCG) to promote diversity in the generated data and exploits a teacher network to select valuable training samples with high…
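The two ideas named in the abstract, chaining cross-modal generators and filtering samples with a teacher, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the string-based stand-ins for the generators, and the scoring rule are all hypothetical placeholders, and the actual selection criterion is not given in the text above.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (text, image); the image is a plain string in this toy


def chained_generation(
    seed_text: str,
    text_to_image: Callable[[str], str],
    image_to_text: Callable[[str], str],
    rounds: int,
) -> List[Pair]:
    """Alternate text -> image -> text, collecting one (text, image) pair per round."""
    pairs: List[Pair] = []
    text = seed_text
    for _ in range(rounds):
        image = text_to_image(text)
        pairs.append((text, image))
        text = image_to_text(image)  # the new caption seeds the next round
    return pairs


def select_top_k(samples: List[Pair], score: Callable[[Pair], float], k: int) -> List[Pair]:
    """Keep the k samples with the highest score."""
    return sorted(samples, key=score, reverse=True)[:k]


# Hypothetical stand-ins for real generative models (assumptions, not from the paper).
PREFIX = "<image of: "


def toy_text_to_image(text: str) -> str:
    return f"{PREFIX}{text}>"


def toy_image_to_text(image: str) -> str:
    return image[len(PREFIX):-1] + " (recaptioned)"


def toy_teacher_score(pair: Pair) -> float:
    # Placeholder score only: prefers shorter text. A real teacher network
    # would score how well a sample retains its relation label.
    return -float(len(pair[0]))


pairs = chained_generation("Alice works at Acme", toy_text_to_image, toy_image_to_text, rounds=3)
selected = select_top_k(pairs, toy_teacher_score, k=2)
```

Each round reuses the previous round's output as input, which is where the added variety in the synthetic pairs comes from; the filter then discards the samples the scorer ranks lowest.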
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Topic Modeling
