Learning Regional Attention over Multi-resolution Deep Convolutional Features for Trademark Retrieval

Osman Tursun; Simon Denman; Sridha Sridharan; Clinton Fookes

arXiv:2104.07240 · cs.CV · August 31, 2021

TL;DR

This paper improves deep feature aggregation for trademark retrieval with three modifications to R-MAC: multi-resolution inputs, unsupervised soft-attention, and combined sum and max pooling. The modifications are evaluated on the million-scale METU dataset.

Contribution

The authors propose three simple modifications to R-MAC that address its weaknesses in trademark retrieval: sensitivity to background clutter, sensitivity to scale variance, and loss of spatial information.

Findings

01. All modifications improve retrieval performance.

02. The combined approach surpasses previous state-of-the-art results.

03. Enhancements are validated on the METU dataset.

Abstract

Large-scale trademark retrieval is an important content-based image retrieval task. A recent study shows that off-the-shelf deep features aggregated with Regional-Maximum Activation of Convolutions (R-MAC) achieve state-of-the-art results. However, R-MAC suffers in the presence of background clutter/trivial regions and scale variance, and discards important spatial information. We introduce three simple but effective modifications to R-MAC to overcome these drawbacks. First, we propose the use of both sum and max pooling to minimise the loss of spatial information. We also employ domain-specific unsupervised soft-attention to eliminate background clutter and unimportant regions. Finally, we add multi-resolution inputs to enhance the scale-invariance of R-MAC. We evaluate these three modifications on the million-scale METU dataset. Our results show that all modifications bring…
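The three modifications named in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the uniform region grid, the use of the mean attention value as a per-region weight, and the function names `regions`, `regional_descriptor`, and `multi_resolution_descriptor` are all assumptions made for illustration. The feature maps are assumed to come from some convolutional backbone applied to the same image at several input resolutions.

```python
import numpy as np


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    return v / (np.linalg.norm(v) + eps)


def regions(h, w, scales=(1, 2)):
    """Yield (top, left, height, width) windows on a uniform s x s grid.

    Assumption: a simplified stand-in for R-MAC's overlapping region sampling.
    """
    for s in scales:
        rh, rw = max(h // s, 1), max(w // s, 1)
        for i in range(s):
            for j in range(s):
                yield i * rh, j * rw, rh, rw


def regional_descriptor(fmap, attention=None, scales=(1, 2)):
    """Aggregate a (C, H, W) feature map into a 2C-dimensional descriptor.

    Each region contributes its max-pooled and sum-pooled vectors,
    concatenated and weighted by the region's mean attention value
    (hypothetical weighting; attention is an (H, W) map or None).
    """
    c, h, w = fmap.shape
    if attention is None:
        attention = np.ones((h, w))
    desc = np.zeros(2 * c)
    for top, left, rh, rw in regions(h, w, scales):
        patch = fmap[:, top:top + rh, left:left + rw]
        weight = attention[top:top + rh, left:left + rw].mean()
        max_vec = l2_normalize(patch.max(axis=(1, 2)))
        sum_vec = l2_normalize(patch.sum(axis=(1, 2)))
        desc += weight * np.concatenate([max_vec, sum_vec])
    return l2_normalize(desc)


def multi_resolution_descriptor(fmaps, attentions=None):
    """Sum the descriptors computed at each input resolution, then normalize."""
    if attentions is None:
        attentions = [None] * len(fmaps)
    total = sum(regional_descriptor(f, a) for f, a in zip(fmaps, attentions))
    return l2_normalize(total)
```

In this sketch, a region with low attention contributes little to the final descriptor, which is one plausible way to suppress background clutter. Keeping both pooled vectors is meant to retain information that max pooling alone would discard.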

Taxonomy

Methods: Max Pooling