Revisiting Relevance Feedback for CLIP-based Interactive Image Retrieval
Ryoya Nara, Yu-Chieh Lin, Yuji Nozawa, Youyang Ng, Goh Itoh, Osamu Torii, Yusuke Matsui

TL;DR
This paper introduces an interactive image retrieval system that combines CLIP with relevance feedback. It adapts to each user's preferences without training new models and remains competitive with state-of-the-art metric learning methods.
Contribution
It revisits relevance feedback for CLIP-based retrieval, enabling preference adaptation without training, and demonstrates effectiveness across various user preference scenarios.
Findings
Accuracy competitive with state-of-the-art metric learning methods.
Effective adaptation to diverse user preferences.
Improved retrieval accuracy with relevance feedback.
Abstract
Many image retrieval studies use metric learning to train an image encoder. However, metric learning cannot handle differences in users' preferences, and requires data to train an image encoder. To overcome these limitations, we revisit relevance feedback, a classic technique for interactive retrieval systems, and propose an interactive CLIP-based image retrieval system with relevance feedback. Our retrieval system first executes the retrieval, collects each user's unique preferences through binary feedback, and returns images the user prefers. Even when users have various preferences, our retrieval system learns each user's preference through the feedback and adapts to the preference. Moreover, our retrieval system leverages CLIP's zero-shot transferability and achieves high accuracy without training. We empirically show that our retrieval system competes well with state-of-the-art…
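The retrieve, collect binary feedback, retrieve again loop described in the abstract can be illustrated with a small sketch. The abstract does not say how the system turns feedback into a refined ranking, so this example substitutes the classic Rocchio update, which is an assumption and not necessarily the authors' method. Random unit vectors stand in for CLIP image and query embeddings, and the names `retrieve`, `rocchio_update`, and `is_relevant`, as well as the weights `alpha`, `beta`, and `gamma`, are hypothetical.

```python
import numpy as np


def normalize(x):
    """L2-normalize vectors along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def retrieve(query, gallery, k):
    """Return indices of the k gallery vectors with the highest cosine similarity."""
    scores = gallery @ query  # both sides are unit-norm, so this is cosine similarity
    return np.argsort(-scores)[:k]


def rocchio_update(query, gallery, liked, disliked, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (assumed here, not taken from the paper):
    pull the query toward liked items and push it away from disliked ones."""
    new_query = alpha * query
    if len(liked) > 0:
        new_query = new_query + beta * gallery[liked].mean(axis=0)
    if len(disliked) > 0:
        new_query = new_query - gamma * gallery[disliked].mean(axis=0)
    return normalize(new_query)


# Toy data: random unit vectors stand in for CLIP embeddings (assumption).
rng = np.random.default_rng(0)
gallery = normalize(rng.normal(size=(200, 32)))
query = normalize(rng.normal(size=32))

# A hidden direction simulates one user's preference; a real system would ask the user.
preference = normalize(rng.normal(size=32))


def is_relevant(indices):
    """Simulated binary feedback: True where an item matches the hidden preference."""
    return gallery[indices] @ preference > 0


first = retrieve(query, gallery, k=10)            # initial retrieval
feedback = is_relevant(first)                     # binary feedback on the shown results
refined = rocchio_update(query, gallery, first[feedback], first[~feedback])
second = retrieve(refined, gallery, k=10)         # retrieval adapted to the feedback
```

No model is trained in this loop: only the query vector changes between rounds, which mirrors the training-free framing of the abstract. With real CLIP embeddings, `gallery` and `query` would come from the image and text encoders.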
Taxonomy
Topics: Image Retrieval and Classification Techniques
Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training
