Natural Scene Image Annotation Using Local Semantic Concepts and Spatial Bag of Visual Words
Yousef Alqasrawi

TL;DR
This paper presents a framework for automatically annotating natural scene images with semantic labels using local features, a bag of visual words model, and machine learning, emphasizing the importance of local semantic concepts.
Contribution
It introduces a novel approach that leverages local semantic concepts and spatial BOW to improve image annotation accuracy, including the use of image halves for vocabulary generation.
Findings
BOW effectively represents semantic information in image regions.
Using image halves for vocabulary improves annotation performance.
The approach achieves promising results on natural scene datasets.
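The spatial BOW idea in these findings can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that "image halves" means the upper and lower half of each image, that each half gets its own vocabulary, and that the two per-half histograms are concatenated. The function names (`kmeans`, `nearest`, `bow_histogram`, `split_halves`, `spatial_bow`) and the plain k-means clustering are hypothetical choices made for illustration, and keypoints are taken as `(y, descriptor)` pairs.

```python
import math
import random


def nearest(p, centroids):
    """Index of the centroid closest to descriptor p (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Tiny Lloyd's k-means; the k centroids play the role of visual words."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def bow_histogram(descriptors, vocabulary):
    """L1-normalised histogram of visual-word counts for one region."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def split_halves(keypoints, image_height):
    """Split (y, descriptor) pairs into upper-half and lower-half descriptors.

    Assumption: 'halves' are the top and bottom of the image.
    """
    upper = [d for y, d in keypoints if y < image_height / 2]
    lower = [d for y, d in keypoints if y >= image_height / 2]
    return upper, lower


def spatial_bow(keypoints, image_height, vocab_upper, vocab_lower):
    """Concatenate the two per-half histograms into one image signature."""
    upper, lower = split_halves(keypoints, image_height)
    return bow_histogram(upper, vocab_upper) + bow_histogram(lower, vocab_lower)
```

In a full pipeline under these assumptions, `kmeans` would be run once on upper-half descriptors and once on lower-half descriptors pooled from the training images, and the concatenated signature from `spatial_bow` would be passed to a classifier such as the support vector machine listed under Methods.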
Abstract
The bag of visual words (BOW) model, which represents images by local invariant features computed at interest point locations, has become a standard choice for many computer vision tasks. Visual vocabularies generated from image feature vectors are expected to produce discriminative visual words that improve the performance of image annotation systems. Most techniques that adopt the BOW model for image annotation overlook useful information that can be mined from image categories to build discriminative visual vocabularies. To this end, this paper introduces a detailed framework for automatically annotating natural scene images with local semantic labels from a predefined vocabulary. The framework rests on the hypothesis that, in natural scenes, intermediate semantic concepts are correlated with the local keypoints. Based on this hypothesis, image regions…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
Methods: Support Vector Machine
