Feature and Region Selection for Visual Learning
Ji Zhao, Liantao Wang, Ricardo Cabral, Fernando De la Torre

TL;DR
This paper introduces a feature and region selection method for the bag-of-words model in visual learning, enabling better understanding and visualization of discriminative features in images and videos.
Contribution
It proposes a joint optimization approach for feature and region selection that is compatible with non-linear kernels and handles both images and videos in a unified framework.
Findings
Improved visualization of important features and regions.
Enhanced classification performance on benchmark datasets.
Connections established with multiple kernel learning and multiple instance learning.
Abstract
Visual learning problems such as object classification and action recognition are typically approached using extensions of the popular bag-of-words (BoW) model. Despite its great success, it is unclear what visual features the BoW model is learning: Which regions in the image or video are used to discriminate among classes? Which are the most discriminative visual words? Answering these questions is fundamental for understanding existing BoW models and inspiring better models for visual recognition. To answer these questions, this paper presents a method for feature selection and region selection in the visual BoW model. This allows for an intermediate visualization of the features and regions that are important for visual learning. The main idea is to assign latent weights to the features or regions, and jointly optimize these latent variables with the parameters of a classifier…
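The main idea in the abstract, latent weights on features optimized jointly with a classifier, can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a linear classifier, a logistic loss, an L1 penalty on the weights, and plain projected gradient descent, all chosen here for brevity. The names `fit_weighted_bow`, `predict`, `s`, `lr`, and `l1` are hypothetical.

```python
import numpy as np

def fit_weighted_bow(X, y, n_iters=200, lr=0.1, l1=0.01, seed=0):
    """Jointly learn per-word weights s in [0, 1] and a linear classifier (w, b).

    X: (n_samples, n_words) bag-of-words histograms; y: labels in {-1, +1}.
    Illustrative assumption: logistic loss + L1 penalty on s, solved by
    projected gradient descent. The paper's actual objective may differ.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=0.01, size=d)  # classifier weights
    b = 0.0                             # classifier bias
    s = np.ones(d)                      # latent feature weights, start "all on"
    for _ in range(n_iters):
        score = (X * s) @ w + b
        margin = y * score
        # Derivative of log(1 + exp(-margin)) with respect to the score.
        g = -y / (1.0 + np.exp(margin))
        grad_w = (X * s).T @ g / n
        grad_s = (X * w).T @ g / n + l1  # s >= 0, so the L1 subgradient is l1
        grad_b = g.mean()
        w -= lr * grad_w
        b -= lr * grad_b
        # Projection step keeps every latent weight inside [0, 1].
        s = np.clip(s - lr * grad_s, 0.0, 1.0)
    return s, w, b

def predict(X, s, w, b):
    """Return labels in {-1, +1} for reweighted histograms."""
    return np.where((X * s) @ w + b >= 0, 1.0, -1.0)
```

After fitting, the vector `s` can be inspected directly: words whose weight stays near 1 are the ones the classifier relies on, which is the kind of intermediate visualization the abstract describes. Region selection would follow the same pattern with one weight per region instead of one per visual word.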
