Capturing spatial interdependence in image features: the counting grid, an epitomic representation for bags of features
Alessandro Perina, Nebojsa Jojic

TL;DR
This paper introduces the counting grid model, a novel generative approach that captures spatial interdependence in image features more effectively than traditional bag-of-features models, especially for scene recognition.
Contribution
The paper proposes the counting grid, a generative model for bags of image features that accounts for the constraints placed on feature counts by the larger scene and by the size and location of the window into it, improving scene recognition accuracy.
Findings
Better representation of feature count variations in scenes.
Improved scene recognition across different images and categories.
More accurate modeling of spatial feature distributions.
Abstract
In recent scene recognition research, images or large image regions are often represented as disorganized "bags" of features, which can then be analyzed using models originally developed to capture co-variation of word counts in text. However, image feature counts are likely to be constrained in different ways than word counts in text. For example, as a camera pans upwards from a building entrance over its first few floors and then further up into the sky (Fig. 1), some feature counts in the image drop while others rise, only to drop again, giving way to features found more often at higher elevations. The space of all possible feature count combinations is constrained both by the properties of the larger scene and by the size and location of the window into it. To capture such variation, in this paper we propose the use of the counting grid model. This generative model is based on a grid…
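The panning-camera example in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the abstract is truncated before the model is specified, so the sketch simply assumes that each grid cell holds a distribution over feature types and that a bag of features is drawn from the average of the cells inside a window. The names `window_distribution` and `sample_bag`, the 6x4 grid, and the three feature types are all hypothetical.

```python
import numpy as np

def window_distribution(grid, top, left, height, width):
    """Average the per-cell feature distributions inside one window.

    Assumption: grid has shape (rows, cols, n_feature_types) and each
    cell sums to 1 over the last axis.
    """
    patch = grid[top:top + height, left:left + width]
    return patch.reshape(-1, grid.shape[-1]).mean(axis=0)

def sample_bag(grid, top, left, height, width, n_features, rng):
    """Draw feature counts for one window (a bag of n_features features)."""
    probs = window_distribution(grid, top, left, height, width)
    return rng.multinomial(n_features, probs)

# Hypothetical toy scene: row 0 is the top of the scene, row 5 the bottom.
# Feature types: 0 = entrance, 1 = facade, 2 = sky.
grid = np.zeros((6, 4, 3))
grid[0:2, :, 2] = 1.0  # sky rows
grid[2:4, :, 1] = 1.0  # facade rows
grid[4:6, :, 0] = 1.0  # entrance rows

rng = np.random.default_rng(0)

# Pan a 2x4 window upward from the entrance to the sky.
pan = [window_distribution(grid, top, 0, 2, 4) for top in (4, 3, 2, 1, 0)]
bags = [sample_bag(grid, top, 0, 2, 4, 100, rng) for top in (4, 3, 2, 1, 0)]

for top, dist, bag in zip((4, 3, 2, 1, 0), pan, bags):
    print(top, dist, bag)
```

In this toy scene the facade share of the window goes 0, 0.5, 1, 0.5, 0 as the window moves up: it rises and then drops again as sky features take over, which is the kind of constrained count variation the abstract describes. Because neighbouring windows overlap, their feature counts cannot vary independently.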
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Video Analysis and Summarization
