AnANet: Modeling Association and Alignment for Cross-modal Correlation Classification
Nan Xu, Junyan Wang, Yuan Tian, Ruike Zhang, and Wenji Mao

TL;DR
This paper introduces AnANet, a novel model that captures both implicit associations and explicit alignments to classify cross-modal image-text correlations more comprehensively.
Contribution
It redefines cross-modal correlation classification based on implicit association and explicit alignment, and proposes AnANet to model these aspects effectively.
Findings
AnANet outperforms existing models on a new image-text correlation dataset.
The model effectively captures global discrepancy and local relevance.
Experimental results validate the proposed classification system.
Abstract
The explosive growth of multimodal data creates great demand for cross-modal applications, many of which rest on the strict prior assumption that the modalities are related. Researchers have therefore studied how to define categories of cross-modal correlation and have constructed various classification systems and predictive models. However, those systems focus on fine-grained types of relevant cross-modal correlation and ignore much implicitly relevant data, which is often assigned to irrelevant types. Worse, none of the previous predictive models reflects, at the modeling stage, the essence of cross-modal correlation as their definitions describe it. In this paper, we present a comprehensive analysis of image-text correlation and define a new classification system based on implicit association and explicit alignment. To predict the type of image-text correlation, we propose the Association and Alignment…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
