Generalisation and Sharing in Triplet Convnets for Sketch based Visual Search
Tu Bui, Leonardo Ribeiro, Moacir Ponti, John Collomosse

TL;DR
This paper introduces triplet CNN architectures for sketch-based image retrieval, demonstrating improved generalization across categories and outperforming existing methods on key benchmarks.
Contribution
It presents novel triplet CNN models with strategies for weight sharing and data augmentation, enhancing generalization in sketch-photo similarity tasks.
Findings
Exceeded pre-existing techniques on the Flickr15k category-level SBIR benchmark by 18%
Exceeded pre-existing techniques on the TU-Berlin SBIR benchmark by ~10%
Effective use of limited training data and data augmentation
Abstract
We propose and evaluate several triplet CNN architectures for measuring the similarity between sketches and photographs, within the context of the sketch-based image retrieval (SBIR) task. In contrast to recent fine-grained SBIR work, we study the ability of our networks to generalise across diverse object categories from limited training data, and explore in detail strategies for weight sharing, pre-processing, data augmentation and dimensionality reduction. We exceed the performance of pre-existing techniques on both the Flickr15k category-level SBIR benchmark by 18%, and the TU-Berlin SBIR benchmark by ~10%, when trained on the 250-category TU-Berlin classification dataset augmented with 25k corresponding photographs harvested from the Internet.
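To make the triplet idea concrete, here is a minimal sketch of a triplet margin loss over embedding vectors. It assumes the common formulation max(0, d(a, p) - d(a, n) + margin) with Euclidean distance. The abstract does not state the exact loss or margin the authors use, so the function names, the margin value, and the toy 2-D embeddings below are illustrative assumptions rather than the paper's implementation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Common triplet margin loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings: a sketch (anchor), a photo of the same
# category (positive), and a photo of a different category (negative).
sketch = [0.0, 0.0]
photo_same = [3.0, 4.0]    # distance 5 from the sketch
photo_other = [6.0, 8.0]   # distance 10 from the sketch

well_ordered = triplet_margin_loss(sketch, photo_same, photo_other)
violating = triplet_margin_loss(sketch, photo_other, photo_same)
```

In this toy case the well-ordered triplet incurs no loss, while swapping the positive and negative photos produces a positive loss that training would push down. In a triplet network the three embeddings would come from CNN branches whose weights may be fully, partially, or not at all shared, which is the design question the abstract says the paper explores.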
