Image-Text Pre-Training for Logo Recognition

Mark Hubenthal; Suren Kumar

arXiv:2309.10206·cs.CV·September 20, 2023

TL;DR

This paper introduces a novel image-text pre-training approach and an improved metric learning loss to enhance open-set logo recognition, significantly boosting performance across multiple datasets.

Contribution

It proposes using image-text paired pre-training and a new loss function, ProxyNCAHN++, to improve logo matching accuracy, especially for text-rich logos.

Findings

01

Pre-training on image-text data improves logo retrieval performance.

02

The method achieves state-of-the-art results on five public logo datasets.

03

The method yields significant gains in zero-shot recall@1 across these datasets.
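For readers unfamiliar with the metric, recall@1 in a retrieval setting is typically the fraction of query crops whose single nearest gallery embedding carries the correct class label. The following is a minimal sketch of that computation in plain Python, not the paper's evaluation code: the cosine-similarity choice, the function names (`cosine_similarity`, `match_logo`, `recall_at_1`), and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_logo(query, gallery_embeddings, gallery_labels):
    """Return the label of the gallery embedding most similar to the query."""
    best_index = max(
        range(len(gallery_embeddings)),
        key=lambda i: cosine_similarity(query, gallery_embeddings[i]),
    )
    return gallery_labels[best_index]


def recall_at_1(queries, query_labels, gallery_embeddings, gallery_labels):
    """Fraction of queries whose top-1 gallery match has the correct label."""
    hits = sum(
        match_logo(q, gallery_embeddings, gallery_labels) == label
        for q, label in zip(queries, query_labels)
    )
    return hits / len(queries)


# Hypothetical toy data: three gallery logos and three query crops.
gallery_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
gallery_labels = ["brand_a", "brand_b", "brand_c"]
queries = [[0.9, 0.1], [0.1, 0.8], [0.7, 0.6]]
query_labels = ["brand_a", "brand_b", "brand_b"]

score = recall_at_1(queries, query_labels, gallery_embeddings, gallery_labels)
```

In this toy setup the third query sits slightly closer to `brand_a` than to its true class, so it counts as a miss. In an open-set setting, new logo classes can be added by appending their embeddings to the gallery without retraining the embedder.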

Abstract

Open-set logo recognition is commonly solved by first detecting possible logo regions and then matching the detected parts against an ever-evolving dataset of cropped logo images. The matching model, a metric learning problem, is especially challenging for logo recognition due to the mixture of text and symbols in logos. We propose two novel contributions to improve the matching model's performance: (a) using image-text paired samples for pre-training, and (b) an improved metric learning loss function. A standard paradigm of fine-tuning ImageNet pre-trained models fails to discover the text sensitivity necessary to solve the matching problem effectively. This work demonstrates the importance of pre-training on image-text pairs, which significantly improves the performance of a visual embedder trained for the logo retrieval task, especially for more text-dominant classes. We construct a…
