# Learning cross space mapping via DNN using large scale click-through logs

**Authors:** Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

arXiv: 2302.13275 · 2023-02-28

## TL;DR

This paper introduces a deep neural network model called cross space mapping (CSM) that maps images and queries into a common space for improved image-query similarity measurement, trained on large-scale click-through logs.

## Contribution

The paper proposes a unified DNN model for image-query similarity that jointly models images and queries in a shared space, trained on extensive click-through data.

## Key findings

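- On an image retrieval evaluation task with 1000 queries, the CSM model is reported to outperform the methods it is compared against.
- The network weights are learned from 23 million clicked image-query pairs spanning 1 million images and 11.7 million queries, a scale the authors rely on for good generalization.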
- Qualitative and quantitative evaluations confirm the effectiveness of the approach.

## Abstract

The gap between low-level visual signals and high-level semantics has been progressively bridged by the continuous development of deep neural networks (DNNs). With recent progress in DNNs, almost all image classification tasks have achieved new records of accuracy. To extend the ability of DNNs to image retrieval tasks, we propose a unified DNN model for image-query similarity calculation that simultaneously models image and query in one network. The unified DNN is named the cross space mapping (CSM) model and contains two parts: a convolutional part and a query-embedding part. The image and the query are mapped to a common vector space via these two parts respectively, and image-query similarity is naturally defined as the inner product of their mappings in that space. To ensure good generalization ability of the DNN, we learn its weights from a large number of click-through logs, which consist of 23 million clicked image-query pairs between 1 million images and 11.7 million queries. Both the qualitative and the quantitative results on an image retrieval evaluation task with 1000 queries demonstrate the superiority of the proposed method.
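The scoring scheme described in the abstract (two mappings into a common space, similarity as an inner product) can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the names `map_image`, `map_query`, `similarity`, and `rank_images`, the 4-dimensional space, the single linear map standing in for the convolutional part, and the sum of word vectors standing in for the query-embedding part. The summary does not specify the paper's actual layers, dimensions, or training loss, so none of these should be read as the authors' implementation.

```python
EMBED_DIM = 4  # hypothetical size of the common vector space


def map_image(features, weights):
    """Stand-in for the convolutional part (assumption): one linear map.

    `weights` has EMBED_DIM rows, each as long as `features`.
    """
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def map_query(words, word_vectors):
    """Stand-in for the query-embedding part (assumption): sum of word vectors.

    Words missing from `word_vectors` contribute a zero vector.
    """
    vec = [0.0] * EMBED_DIM
    for word in words:
        for i, value in enumerate(word_vectors.get(word, [0.0] * EMBED_DIM)):
            vec[i] += value
    return vec


def similarity(image_vec, query_vec):
    """Image-query similarity as the inner product in the common space."""
    return sum(a * b for a, b in zip(image_vec, query_vec))


def rank_images(query_vec, image_vecs):
    """Return image names ordered from most to least similar to the query."""
    scores = {name: similarity(vec, query_vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy, hand-picked vectors; real values would come from trained weights.
    word_vectors = {"red": [1, 0, 0, 0], "car": [0, 1, 0, 0]}
    query_vec = map_query(["red", "car"], word_vectors)
    image_vecs = {
        "img_a": map_image([1, 1], [[1, 0], [0, 1], [0, 0], [0, 0]]),
        "img_b": map_image([1, 1], [[0, 0], [0, 0], [1, 0], [0, 1]]),
    }
    print(rank_images(query_vec, image_vecs))
```

Because both mappings land in the same space, retrieval reduces to computing one inner product per candidate image and sorting, which is what `rank_images` does in this sketch.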

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.13275/full.md

## Figures

11 figures with captions in the complete paper: https://tomesphere.com/paper/2302.13275/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/2302.13275/full.md

---
Source: https://tomesphere.com/paper/2302.13275