CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning
Yuxin Peng, Jinwei Qi, Yuxin Yuan

TL;DR
This paper introduces CM-GANs, a novel cross-modal GAN framework that models joint distributions of heterogeneous data like images and text to learn discriminative common representations, improving cross-modal retrieval.
Contribution
The paper proposes the first GAN-based approach for cross-modal common representation learning, combining generative and discriminative models with autoencoders and adversarial mechanisms.
Findings
Outperforms 10 methods on 3 datasets in cross-modal retrieval.
Effectively models joint distribution of different modalities.
Enhances discriminative power of common representations.
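As a rough illustration of how a learned common representation supports cross-modal retrieval: once images and texts are mapped into the same vector space, a query from one modality can rank candidates from the other by similarity. The sketch below is not the paper's implementation. The vectors are made-up numbers, the function names are hypothetical, and cosine similarity is an assumed choice of similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Hypothetical common representations (illustrative values only,
# not outputs of any trained model).
image_query = [0.9, 0.1, 0.0]
text_candidates = [
    [0.0, 1.0, 0.0],
    [1.0, 0.2, 0.0],
    [0.5, 0.5, 0.5],
]

# Image-to-text retrieval: rank the text vectors against the image vector.
ranking = retrieve(image_query, text_candidates)
print(ranking)
```

Text-to-image retrieval works the same way with the roles swapped: a text vector is the query and the image vectors are the candidates.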
Abstract
It is known that the inconsistent distribution and representation of different modalities, such as image and text, cause the heterogeneity gap, which makes it challenging to correlate such heterogeneous data. Generative adversarial networks (GANs) have shown a strong ability to model data distributions and learn discriminative representations, but existing GAN-based works mainly focus on the generative problem of producing new data. We have a different goal: to correlate heterogeneous data by using the power of GANs to model the cross-modal joint distribution. Thus, we propose Cross-modal GANs (CM-GANs) to learn discriminative common representations for bridging the heterogeneity gap. The main contributions are: (1) A cross-modal GANs architecture is proposed to model the joint distribution over data of different modalities. The inter-modality and intra-modality correlation can be explored simultaneously in…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
