Sampled Image Tagging and Retrieval Methods on User Generated Content
Karl Ni, Kyle Zaragoza, Charles Foster, Carmen Carrano, Barry Chen, Yonas Tesfaye, Alex Gude

TL;DR
This paper introduces a deep learning approach for image tagging and retrieval using user-generated content, leveraging sampling and joint word embedding optimization to improve robustness and zero-shot performance on large-scale unstructured data.
Contribution
It presents a novel method that combines sampling and joint word embedding training on UGC data, enhancing robustness and zero-shot capabilities in image tagging and retrieval.
Findings
Comparable to state-of-the-art in conventional tagging
Improves zero-shot image retrieval performance
Handles unstructured, multilingual, and noisy tags effectively
Abstract
Traditional image tagging and retrieval algorithms have limited value as a result of being trained with heavily curated datasets. These limitations are most evident when arbitrary search words are used that do not intersect with training set labels. Weak labels from user generated content (UGC) found in the wild (e.g., Google Photos, FlickR, etc.) have an almost unlimited number of unique words in the metadata tags. Prior work on word embeddings successfully leveraged unstructured text with large vocabularies, and our proposed method seeks to apply similar cost functions to open source imagery. Specifically, we train a deep learning image tagging and retrieval system on large scale, user generated content (UGC) using sampling methods and joint optimization of word embeddings. By using the Yahoo! FlickR Creative Commons (YFCC100M) dataset, such an approach builds robustness to common…
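The abstract says the method applies cost functions similar to those used for word embeddings, trained with sampling, but the excerpt here does not state the exact objective. As a rough illustration only, the sketch below assumes a word2vec-style negative-sampling loss between one image feature vector and a table of tag embeddings. The function names, the 3/4-power sampling distribution, and the number of negatives are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def negative_sampling_loss(image_vec, tag_embeddings, positive_ids,
                           tag_counts, num_negatives=5, rng=None):
    """Hypothetical negative-sampling loss for one image.

    image_vec:      (d,) image feature, e.g. the output of a CNN (assumed).
    tag_embeddings: (V, d) one embedding per tag in the vocabulary.
    positive_ids:   indices of the tags users attached to this image.
    tag_counts:     (V,) tag frequencies used to draw negative tags.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: negatives are drawn from the unigram distribution raised
    # to the 3/4 power, as in word2vec. Tags with zero count are never drawn.
    probs = np.asarray(tag_counts, dtype=float) ** 0.75
    probs /= probs.sum()

    loss = 0.0
    for pos in positive_ids:
        # Note: a sampled negative may coincide with a true tag; this
        # sketch does not filter such collisions.
        neg_ids = rng.choice(len(probs), size=num_negatives, p=probs)
        pos_score = tag_embeddings[pos] @ image_vec
        neg_scores = tag_embeddings[neg_ids] @ image_vec
        # Pull the true tag toward the image, push sampled tags away.
        loss -= log_sigmoid(pos_score) + log_sigmoid(-neg_scores).sum()
    return loss / len(positive_ids)
```

Under these assumptions, sampling keeps the per-image cost proportional to the number of negatives rather than to the full vocabulary size, which matters when the tag vocabulary is as large as the one described above. In a joint setup, gradients of this loss would update both the image network and the tag embeddings.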
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
