Fuse and Attend: Generalized Embedding Learning for Art and Sketches
Ujjal Kr Dutta

TL;DR
This paper introduces a novel embedding learning method that employs gated fusion and attention to improve cross-domain object recognition, particularly in challenging domains like sketches, by leveraging contrastive learning to enhance robustness and discrimination.
Contribution
The paper proposes a new generalized embedding learning approach using gated fusion and attention, specifically designed to perform well across diverse visual domains including sketches and art.
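This summary does not give the paper's exact formulation, so the following is only a minimal sketch of what gated fusion generally means: a learned gate in (0, 1) decides, per dimension, how much of each of two feature vectors goes into the fused embedding. The function name `gated_fusion` and the parameters `gate_weights` and `gate_bias` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(a, b, gate_weights, gate_bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Generic illustration only, not the paper's architecture.
    gate_weights: one row per output dimension, each row of length 2 * d
                  (hypothetical learned parameters).
    gate_bias:    one bias per output dimension (hypothetical).
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    concat = list(a) + list(b)  # the gate sees both inputs
    fused = []
    for w_row, bias, a_i, b_i in zip(gate_weights, gate_bias, a, b):
        g = sigmoid(sum(w * x for w, x in zip(w_row, concat)) + bias)
        # Convex combination: g near 1 favors a, g near 0 favors b.
        fused.append(g * a_i + (1.0 - g) * b_i)
    return fused


a = [1.0, 2.0]
b = [3.0, 4.0]
zero_weights = [[0.0] * 4, [0.0] * 4]
# With zero weights and zero bias the gate is 0.5, so the result is the mean.
averaged = gated_fusion(a, b, zero_weights, [0.0, 0.0])
```

In a trained model the gate parameters would be learned jointly with the rest of the network; here they are fixed by hand only to show the mechanics.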
Findings
Effective cross-domain generalization demonstrated on the PACS dataset
Improved robustness of embeddings across multiple visual domains
Superior performance compared to existing methods in domain adaptation
Abstract
While deep Embedding Learning approaches have witnessed widespread success in multiple computer vision tasks, the state-of-the-art methods for representing natural images do not necessarily perform well on images from other domains, such as paintings, cartoons, and sketches. This is because of the huge shift in the distribution of data across these domains, as compared to natural images. Domains like sketches often contain sparse informative pixels. However, recognizing objects in such domains is crucial, given multiple relevant applications leveraging such data, for instance, sketch-to-image retrieval. Thus, achieving an Embedding Learning model that could perform well across multiple domains is not only challenging, but plays a pivotal role in computer vision. To this end, in this paper, we propose a novel Embedding Learning approach with the goal of generalizing across different…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
Methods: Contrastive Learning
