DeepFirearm: Learning Discriminative Feature Representation for Fine-grained Firearm Retrieval
Jiedong Hao, Jing Dong, Wei Wang, Tieniu Tan

TL;DR
DeepFirearm introduces a large firearm image dataset and a novel double margin contrastive loss with a two-stage training strategy, significantly improving fine-grained firearm retrieval accuracy over existing methods.
Contribution
The paper presents a new dataset, Firearm 14k, and a double margin contrastive loss, trained with a two-stage strategy, to improve firearm image retrieval.
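To make the idea of a double margin contrastive loss concrete, here is a minimal sketch of one common formulation. It is an assumption for illustration, not the authors' implementation: matching pairs are penalized only when their distance exceeds a positive margin, and non-matching pairs only when their distance falls below a negative margin. The function name, the parameter names m_pos and m_neg, their default values, and the 1/2 scaling are all hypothetical; the paper's exact formula and margin settings may differ.

```python
import math


def double_margin_contrastive_loss(pairs, m_pos=0.5, m_neg=1.0):
    """Average double margin contrastive loss over a list of pairs.

    Each element of `pairs` is (emb_a, emb_b, is_match), where emb_a and
    emb_b are equal-length sequences of floats and is_match is a bool.
    m_pos and m_neg are hypothetical margin names and values.
    """
    total = 0.0
    for emb_a, emb_b, is_match in pairs:
        d = math.dist(emb_a, emb_b)  # Euclidean distance between embeddings
        if is_match:
            # Matching pair: no penalty once it is closer than m_pos.
            total += max(0.0, d - m_pos) ** 2
        else:
            # Non-matching pair: no penalty once it is farther than m_neg.
            total += max(0.0, m_neg - d) ** 2
    return 0.5 * total / len(pairs)


if __name__ == "__main__":
    batch = [
        ((0.0, 0.0), (1.0, 0.0), True),   # matching pair, too far apart
        ((0.0, 0.0), (0.4, 0.0), False),  # non-matching pair, too close
    ]
    print(double_margin_contrastive_loss(batch))
```

In this formulation, setting m_pos to 0 recovers the conventional single margin contrastive loss, in which every matching pair with nonzero distance contributes to the loss.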
Findings
Outperforms single margin contrastive loss by up to 88.5%
Surpasses triplet-loss-based approaches in accuracy
Demonstrates effectiveness of two-stage training for domain adaptation
Abstract
There is great demand for automatically regulating the inappropriate appearance of shocking firearm images on social media and for identifying firearm types in forensics. Image retrieval techniques have great potential to solve these problems. To facilitate research in this area, we introduce Firearm 14k, a large dataset consisting of over 14,000 images in 167 categories. It can be used for both fine-grained recognition and retrieval of firearm images. Recent advances in image retrieval are mainly driven by fine-tuning state-of-the-art convolutional neural networks for the retrieval task. The conventional single margin contrastive loss, known for its simplicity and good performance, has been widely used. We find that it performs poorly on the Firearm 14k dataset because: (1) the loss contributed by positive and negative image pairs is unbalanced during the training process; (2) a huge domain gap exists…
Taxonomy
Topics: Advanced Neural Network Applications · Geophysical Methods and Applications · Domain Adaptation and Few-Shot Learning
