Mid-level Representation for Visual Recognition

Moin Nabi

arXiv:1512.07314 · cs.CV · December 24, 2015 · 1 citation

Open Access

TL;DR

This thesis explores the use of mid-level visual representations, such as parts and attributes, to improve high-level visual recognition tasks such as object detection in images and video understanding.

Contribution

It introduces a subcategory-aware, webly-supervised approach for discovering discriminative mid-level patches to enhance object recognition and address dataset bias.

Findings

01. Discovered effective mid-level patches for object recognition.

02. Improved recognition accuracy using subcategory-based models.

03. Addressed dataset bias through subcategory-aware modeling.

Abstract

Visual recognition is one of the fundamental challenges in AI, where the goal is to understand the semantics of visual data. Employing mid-level representations, in particular, has shifted the paradigm in visual recognition. A mid-level image/video representation involves discovering and training a set of mid-level visual patterns (e.g., parts and attributes) and representing a given image/video using them. The mid-level patterns can be extracted from images and videos using the motion and appearance information of visual phenomena. This thesis targets employing mid-level representations for different high-level visual recognition tasks, namely (i) image understanding and (ii) video understanding. In the case of image understanding, we focus on the object detection/recognition task. We investigate discovering and learning a set of mid-level patches to be used for representing the images…

Taxonomy

Topics: Neural Networks and Applications · Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications