Evidential Transformers for Improved Image Retrieval
Danilo Dordevic, Suryansh Kumar

TL;DR
This paper presents the Evidential Transformer, a probabilistic model that enhances image retrieval robustness and accuracy by integrating uncertainty estimation and global context, outperforming previous methods on standard datasets.
Contribution
Introduces the Evidential Transformer, which combines evidential (probabilistic) classification with the Global Context Vision Transformer (GC ViT) architecture for more accurate and reliable image retrieval.
Findings
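Achieved state-of-the-art retrieval results on the Stanford Online Products (SOP) and CUB-200-2011 datasets.
Demonstrated that evidential classification surpasses traditional multiclass classification training as a baseline for deep metric learning.
Established a new benchmark in content-based image retrieval (CBIR) in all test settings on both datasets.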
Abstract
We introduce the Evidential Transformer, an uncertainty-driven transformer model for improved and robust image retrieval. In this paper, we make several contributions to content-based image retrieval (CBIR). We incorporate probabilistic methods into image retrieval, achieving robust and reliable results, with evidential classification surpassing traditional training based on multiclass classification as a baseline for deep metric learning. Furthermore, we improve the state-of-the-art retrieval results on several datasets by leveraging the Global Context Vision Transformer (GC ViT) architecture. Our experimental results consistently demonstrate the reliability of our approach, setting a new benchmark in CBIR in all test settings on the Stanford Online Products (SOP) and CUB-200-2011 datasets.
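The abstract does not spell out how evidential classification is formulated, so the following is only a minimal sketch of one common formulation of evidential classification (Dirichlet-based, subjective-logic style), not the authors' implementation. The function name `dirichlet_from_logits` and the choice of softplus as the evidence function are assumptions made for illustration. The sketch shows how raw class scores can be turned into non-negative evidence, Dirichlet parameters, expected class probabilities, and a per-sample uncertainty value.

```python
import numpy as np


def dirichlet_from_logits(logits):
    """Map raw class scores to Dirichlet parameters, expected probabilities,
    and a scalar uncertainty per sample (hypothetical helper, for illustration).
    """
    logits = np.asarray(logits, dtype=float)
    # Softplus keeps evidence non-negative (assumed choice; ReLU or exp are also used).
    evidence = np.logaddexp(0.0, logits)
    # Dirichlet concentration parameters: one unit of prior mass per class.
    alpha = evidence + 1.0
    # Dirichlet strength: total evidence plus the number of classes.
    strength = alpha.sum(axis=-1, keepdims=True)
    # Expected class probabilities under the Dirichlet.
    probs = alpha / strength
    # Uncertainty mass: equals 1 with zero evidence and shrinks as evidence grows.
    num_classes = logits.shape[-1]
    uncertainty = num_classes / strength.squeeze(-1)
    return alpha, probs, uncertainty


if __name__ == "__main__":
    # First row: strong evidence for class 0. Second row: almost no evidence at all.
    scores = [[10.0, -10.0, -10.0], [-10.0, -10.0, -10.0]]
    alpha, probs, uncertainty = dirichlet_from_logits(scores)
    print(probs)
    print(uncertainty)
```

In this sketch the first sample receives a much lower uncertainty than the second, which is the kind of signal an uncertainty-driven retrieval model could use to flag unreliable predictions. A softmax classifier, by contrast, would report a uniform distribution for the second sample with no separate uncertainty value.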
Taxonomy
Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
Methods: Byte Pair Encoding · Absolute Position Encodings · Vision Transformer · Softmax · Label Smoothing · Linear Layer · Adam · Dropout · Layer Normalization · Dense Connections
