Mining Mid-level Visual Patterns with Deep CNN Activations
Yao Li, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

TL;DR
This paper introduces a method to discover mid-level visual patterns using CNN activations combined with pattern mining techniques, leading to effective image representations for classification tasks.
Contribution
It proposes integrating CNN activations with association rule mining to efficiently find semantically meaningful visual patterns and mid-level elements.
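As a rough illustration of how activations and association rule mining could fit together, the sketch below turns each patch activation into a "transaction" and scores rules of the form itemset -> class label by support and confidence. It makes several assumptions that this summary does not state: each activation is encoded as the indices of its k largest values, a brute-force enumeration stands in for an efficient miner such as Apriori, and the names `to_transaction` and `mine_rules` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def to_transaction(activation, k, label):
    """Encode an activation as the set of indices of its k largest values.

    Assumption: this top-k index encoding is illustrative only.
    """
    order = sorted(range(len(activation)), key=lambda i: activation[i], reverse=True)
    return frozenset(order[:k]), label


def mine_rules(transactions, target, size, min_support, min_confidence):
    """Brute-force search for rules `itemset -> target` over itemsets of a fixed size."""
    n = len(transactions)
    total = Counter()        # transactions containing the itemset
    with_target = Counter()  # ... that also carry the target label
    for items, label in transactions:
        for itemset in combinations(sorted(items), size):
            total[itemset] += 1
            if label == target:
                with_target[itemset] += 1
    rules = []
    for itemset, hits in with_target.items():
        support = hits / n                  # fraction of all transactions
        confidence = hits / total[itemset]  # P(target | itemset)
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, support, confidence))
    # Most confident rules first, ties broken by support, then by itemset.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))


# Toy 4-dimensional "activations" with made-up labels (not real CNN outputs).
patches = [
    ([0.9, 0.8, 0.1, 0.0], "pos"),
    ([0.7, 0.9, 0.2, 0.1], "pos"),
    ([0.1, 0.0, 0.9, 0.8], "neg"),
    ([0.8, 0.1, 0.9, 0.0], "neg"),
]
transactions = [to_transaction(a, k=2, label=y) for a, y in patches]
rules = mine_rules(transactions, target="pos", size=2,
                   min_support=0.5, min_confidence=0.9)
print(rules)
```

In this toy setting a rule is kept only when its itemset is both frequent (support) and concentrated in the target class (confidence), which mirrors the "representative and discriminative" criterion in spirit. A real pipeline would need a scalable miner and thresholds chosen for the data.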
Findings
Outperforms recent CNN-based methods in scene and object classification
Effective discovery of semantically consistent visual patterns
Two novel image representation methods based on patterns and elements
Abstract
The purpose of mid-level visual element discovery is to find clusters of image patches that are both representative and discriminative. Here we study this problem from the perspective of pattern mining while relying on the recently popularized Convolutional Neural Networks (CNNs). We observe that a fully-connected CNN activation extracted from an image patch typically possesses two appealing properties that enable its seamless integration with pattern mining techniques. The marriage between CNN activations and association rule mining, a well-known pattern mining technique in the literature, leads to fast and effective discovery of representative and discriminative patterns from a huge number of image patches. When we retrieve and visualize image patches with the same pattern, surprisingly, they are not only visually similar but also semantically consistent, and thus give rise to a…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
