Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification

Nakul Sharma; Abhirama S. Penamakuri; Anand Mishra

arXiv:2211.12926·cs.CV·November 24, 2022

TL;DR

This paper introduces a contrastive multi-view encoding framework for large-scale one-shot logo identification, leveraging textual and visual features, and presents a new dataset with 100K logos to advance research in open-set recognition.

Contribution

The paper proposes a novel multi-view textual-visual encoding method for one-shot logo recognition and introduces WiRLD, a dataset of 100K logos, to address the lack of a very-large-scale logo collection in the literature and to test methods at that scale.
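To make the one-shot, open-set setting concrete, here is a minimal sketch of identification by nearest-neighbour matching: each known logo contributes a single reference embedding, and a query is assigned to the most similar reference unless its best score falls below a threshold. This illustrates the task setup only and is not the authors' pipeline. The embeddings, the `identify` helper, the brand names, and the `threshold` value are all hypothetical.

```python
import math


def cosine(a, b):
    # Cosine similarity of two non-zero vectors (zero vectors are not handled).
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def identify(query, gallery, threshold=0.5):
    """Return (name, score) for the best-matching reference logo.

    `gallery` maps a logo name to its single reference embedding (one shot).
    If the best score is below `threshold`, the query is treated as an
    unknown logo and (None, score) is returned (open-set rejection).
    """
    best_name, best_score = None, -1.0
    for name, reference in gallery.items():
        score = cosine(query, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy embeddings, invented for illustration only.
gallery = {"brand_a": [1.0, 0.0, 0.2], "brand_b": [0.0, 1.0, 0.1]}
match, score = identify([0.9, 0.1, 0.25], gallery)
unknown, low_score = identify([-1.0, -1.0, 0.0], gallery)
```

Because new logos only need one reference embedding added to `gallery`, no retraining is required to extend the candidate set; the cost is that identification quality depends entirely on how well the encoder separates unseen logos.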

Findings

01

Achieves 91.3% AUC on QMUL-OpenLogo verification

02

Outperforms the state of the art by 9.1% and 2.6% on Toplogos-10 and FlickrLogos32, respectively

03

Remains more stable than the compared methods when the number of candidate logos reaches the 100K scale

Abstract

In this paper, we study the problem of identifying logos of business brands in natural scenes in an open-set one-shot setting. This problem setup is significantly more challenging than traditionally-studied 'closed-set' and 'large-scale training samples per category' logo recognition settings. We propose a novel multi-view textual-visual encoding framework that encodes text appearing in the logos as well as the graphical design of the logos to learn robust contrastive representations. These representations are jointly learned for multiple views of logos over a batch and thereby they generalize well to unseen logos. We evaluate our proposed framework for cropped logo verification, cropped logo identification, and end-to-end logo identification in natural scene tasks; and compare it against state-of-the-art methods. Further, the literature lacks a 'very-large-scale' collection of…
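The abstract describes contrastive representations learned jointly over multiple views of logos in a batch. A generic way to express that idea is an InfoNCE-style loss, sketched below for two views per logo. This is a standard formulation swapped in for illustration, not necessarily the loss used in the paper. The function names, the `temperature` value, and the toy embeddings are assumptions.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length; a zero vector is returned unchanged.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    view_a[i] and view_b[i] are embeddings of two views of the same logo
    (the positive pair); every other view_b[j] in the batch is a negative.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    losses = []
    for i, anchor in enumerate(a):
        logits = [dot(anchor, other) / temperature for other in b]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
        # Negative log-probability of picking the true positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two logos with two views each (values invented).
aligned_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_b = [[0.9, 0.1], [0.1, 0.9]]
shuffled_b = [aligned_b[1], aligned_b[0]]  # positives deliberately mismatched

loss_aligned = info_nce(aligned_a, aligned_b)
loss_shuffled = info_nce(aligned_a, shuffled_b)
```

The loss is small when each logo's two views are closer to each other than to other logos in the batch, and large when they are not, which is the property that encourages representations to transfer to unseen logos.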

Code & Models

Repositories

0xnakul/one-shot-logo_icvgip (PyTorch, official)
