Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search

Yuanmin Tang; Jing Yu; Keke Gai; Yujing Wang; Yue Hu; Gang Xiong; Qi Wu

arXiv:2309.16141·cs.CV·September 29, 2023·2 cites

Open Access · 1 Repo

TL;DR

This paper introduces an explicit alignment network for cross-modal sponsored search that improves query-ads matching by aligning visual and textual details without requiring extensive labeled data, outperforming existing models.

Contribution

The work proposes a simple, effective alignment network for fine-grained image-text mapping in cross-modal search, enhancing performance with less training data.

Findings

01 Outperforms the state of the art by 2.57% on a commercial dataset

02 Effective for cross-modal retrieval on the MSCOCO dataset

03 Requires only half the training data for comparable performance

Abstract

Cross-Modal sponsored search displays multi-modal advertisements (ads) when consumers look for desired products by natural language queries in search engines. Since multi-modal ads bring complementary details for query-ads matching, the ability to align ads-specific information in both images and texts is crucial for accurate and flexible sponsored search. Conventional research mainly studies from the view of modeling the implicit correlations between images and texts for query-ads matching, ignoring the alignment of detailed product information and resulting in suboptimal search performance. In this work, we propose a simple alignment network for explicitly mapping fine-grained visual parts in ads images to the corresponding text, which leverages the co-occurrence structure consistency between vision and language spaces without requiring expensive labeled training data. Moreover, we…
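
The abstract is truncated here and does not spell out how the alignment network works, so the following is only a toy illustration of the general idea of mapping image regions to text tokens in a shared embedding space. It is not the authors' method. The function names, the region and token labels, and the hand-made 3-dimensional vectors are all assumptions made for this sketch; nearest-neighbour cosine matching stands in for whatever learned mapping the paper actually uses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align_regions_to_tokens(region_vecs, token_vecs):
    """Map each image region to the text token whose embedding is most similar.

    Hypothetical helper: a nearest-neighbour stand-in for a learned alignment.
    """
    aligned = {}
    for region, rvec in region_vecs.items():
        aligned[region] = max(token_vecs, key=lambda tok: cosine(rvec, token_vecs[tok]))
    return aligned


# Hand-made toy embeddings (assumed, not taken from the paper or its datasets).
token_vecs = {
    "red": [1.0, 0.0, 0.0],
    "sneaker": [0.0, 1.0, 0.0],
    "leather": [0.0, 0.0, 1.0],
}
region_vecs = {
    "region_0": [0.9, 0.1, 0.0],  # mostly along the "red" axis
    "region_1": [0.1, 0.8, 0.2],  # mostly along the "sneaker" axis
}

alignment = align_regions_to_tokens(region_vecs, token_vecs)
print(alignment)
```

In this toy setup each region is assigned to the single closest token. A real system would learn the region and token embeddings rather than fix them by hand.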

Code & Models

Repositories

pter61/aligncmss
Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning

Methods: ALIGN