Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data

Yi Yu; Suhua Tang; Kiyoharu Aizawa; Akiko Aizawa

arXiv:1805.02997 · cs.CV · May 9, 2018 · 5 cites

TL;DR

This paper introduces a novel deep learning model called Category-based Deep CCA for fine-grained venue discovery using multimodal data, enabling exact and group venue searches by correlating images and textual descriptions.

Contribution

The paper proposes a new deep CCA model that jointly optimizes cross-modal correlations for venue discovery and incorporates category-based grouping. It also introduces a new multimodal dataset.

Findings

01. Outperforms state-of-the-art methods in cross-modal retrieval tasks.

02. Effectively captures venue features from multimodal data.

03. Enables accurate venue search using user-generated photos.

Abstract

In this work, travel destinations and business locations are treated as venues. Discovering a venue from a photo is very important for context-aware applications. Unfortunately, few efforts have paid attention to complicated real images such as venue photos generated by users. Our goal is fine-grained venue discovery from heterogeneous social multimodal data. To this end, we propose a novel deep learning model, Category-based Deep Canonical Correlation Analysis (C-DCCA). Given a photo as input, this model performs (i) exact venue search (find the venue where the photo was taken), and (ii) group venue search (find relevant venues in the same category as the photo), using the cross-modal correlation between the input photo and the textual descriptions of venues. In this model, data in different modalities are projected into the same space via deep networks. Pairwise correlation (between different…
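The retrieval step described in the abstract can be illustrated with a minimal Python sketch. It assumes photos and venue descriptions have already been projected into a shared space by trained networks (not shown), and it uses cosine similarity as a stand-in for the learned cross-modal correlation. The function names, the toy embeddings, and the top-k treatment of group search are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_venues(photo_embedding, venue_embeddings):
    """Return venue ids sorted by similarity to the photo, best first."""
    scores = {
        venue_id: cosine(photo_embedding, text_embedding)
        for venue_id, text_embedding in venue_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def exact_venue_search(photo_embedding, venue_embeddings):
    """Exact search: the single best-matching venue."""
    return rank_venues(photo_embedding, venue_embeddings)[0]


def group_venue_search(photo_embedding, venue_embeddings, top_k=2):
    """Group search (assumed here to be top-k retrieval in the shared space)."""
    return rank_venues(photo_embedding, venue_embeddings)[:top_k]


# Hypothetical embeddings of venue text descriptions in the shared space.
venue_text = {
    "cafe_a": [0.9, 0.1, 0.0],
    "cafe_b": [0.8, 0.3, 0.1],
    "museum_c": [0.0, 0.2, 0.9],
}
# Hypothetical embedding of a user photo in the same space.
photo = [0.95, 0.05, 0.0]

best_venue = exact_venue_search(photo, venue_text)
related_venues = group_venue_search(photo, venue_text, top_k=2)
print(best_venue, related_venues)
```

In this toy setup the two cafe descriptions lie close to the photo, so they rank ahead of the museum. A category-aware training objective is what would encourage same-category venues to cluster this way in a real model.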

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques