DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell

TL;DR
This paper demonstrates that features from a deep convolutional network trained on large-scale object recognition can be effectively repurposed for various new visual recognition tasks, outperforming previous methods.
Contribution
It introduces DeCAF, a set of deep convolutional activation features, and shows their effectiveness across multiple visual recognition challenges.
Findings
DeCAF features outperform the state of the art on several vision challenges.
Features from different network levels have varying effectiveness.
Open-source implementation facilitates further research.
Abstract
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation…
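The workflow the abstract describes (freeze a trained network, read out activations at a chosen level, and train only a simple classifier on the new task) can be sketched roughly as follows. This is a minimal illustration, not the DeCAF implementation: the two-layer network with random weights is a hypothetical stand-in for a pretrained convolutional model, the toy data is synthetic, and ridge regression onto one-hot targets is swapped in as a convenient linear classifier. The names `extract_features`, `fit_linear_classifier`, and `predict` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained network: two frozen dense layers.
# In practice these weights would come from large-scale supervised training.
W1 = rng.standard_normal((20, 64)) / np.sqrt(20)
W2 = rng.standard_normal((64, 32)) / np.sqrt(64)


def extract_features(x, level):
    """Return ReLU activations of the frozen network at level 1 or 2."""
    h1 = np.maximum(x @ W1, 0.0)
    if level == 1:
        return h1
    if level == 2:
        return np.maximum(h1 @ W2, 0.0)
    raise ValueError("level must be 1 or 2")


def fit_linear_classifier(feats, labels, n_classes, reg=1e-2):
    """Ridge regression onto one-hot targets; only these weights are learned."""
    targets = np.eye(n_classes)[labels]
    gram = feats.T @ feats + reg * np.eye(feats.shape[1])
    return np.linalg.solve(gram, feats.T @ targets)


def predict(feats, weights):
    """Pick the class with the highest linear score."""
    return np.argmax(feats @ weights, axis=1)


# Synthetic target task with few labels: 3 classes, 30 examples each.
centers = rng.standard_normal((3, 20)) * 3.0
labels = np.repeat(np.arange(3), 30)
inputs = centers[labels] + rng.standard_normal((90, 20))

# Compare fixed features taken from two different network levels.
for level in (1, 2):
    feats = extract_features(inputs, level)
    weights = fit_linear_classifier(feats, labels, n_classes=3)
    acc = np.mean(predict(feats, weights) == labels)
    print(f"level {level}: feature dim {feats.shape[1]}, train accuracy {acc:.2f}")
```

The design point is that `W1` and `W2` are never updated: only the small linear map on top is fit, which is why the approach remains usable when the new task has too little labeled data to train or adapt a deep architecture.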
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
