Understanding the Fisher Vector: a multimodal part model

David Novotný; Diane Larlus; Florent Perronnin; Andrea Vedaldi

arXiv:1504.04763 · cs.CV · April 21, 2015 · 5 cites

Open Access

TL;DR

This paper interprets Fisher Vector (FV) object detectors as part-based models. The interpretation helps explain their strong performance, shows that they work well with only a small subset of input patches and visual words, and clarifies how they relate to Deformable Part Models (DPM).

Contribution

The paper proposes interpreting Fisher-Vector-based object detectors as part-based models, uses this view to explain their performance and properties, and compares them with DPM detectors.

Findings

01. FV detectors work well using only a small subset of input patches and visual words.

02. FV detectors can be interpreted as part-based models.

03. A comparison with DPM detectors reveals both commonalities and differences.

Abstract

Fisher Vectors and related orderless visual statistics have demonstrated excellent performance in object detection, sometimes superior to established approaches such as the Deformable Part Models. However, it remains unclear how these models can capture complex appearance variations using visual codebooks of limited sizes and coarse geometric information. In this work, we propose to interpret Fisher-Vector-based object detectors as part-based models. Through the use of several visualizations and experiments, we show that this is a useful insight to explain the good performance of the model. Furthermore, we reveal for the first time several interesting properties of the FV, including its ability to work well using only a small subset of input patches and visual words. Finally, we discuss the relation of the FV and DPM detectors, pointing out differences and commonalities between them.
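One way to see why a part-based reading is plausible: before normalization, an FV is an average of per-patch terms, so a linear detector score splits into a sum of per-patch scores. The sketch below illustrates this with NumPy under simplifying assumptions that are not taken from the paper: only the gradients with respect to the GMM means are used, the power and L2 normalizations commonly applied in practice are omitted (with them the decomposition is no longer exact), and the descriptors, GMM parameters, and classifier weights are random placeholders. The function names `posteriors` and `per_patch_fv` are hypothetical.

```python
import numpy as np

def posteriors(X, weights, means, sigmas):
    """Soft-assign each descriptor to each Gaussian (visual word)."""
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # (N, K, D)
    # Log-density of a diagonal Gaussian, up to a constant shared by all components.
    log_p = (-0.5 * (diff ** 2).sum(axis=2)
             - np.log(sigmas).sum(axis=1)
             + np.log(weights))
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)  # (N, K), rows sum to 1

def per_patch_fv(X, weights, means, sigmas):
    """Per-patch contributions to the mean-gradient FV, shape (N, K*D)."""
    gamma = posteriors(X, weights, means, sigmas)  # (N, K)
    diff = (X[:, None, :] - means[None, :, :]) / sigmas[None, :, :]  # (N, K, D)
    contrib = gamma[:, :, None] * diff / np.sqrt(weights)[None, :, None]
    return contrib.reshape(len(X), -1)

# Placeholder data (assumption): random descriptors, GMM, and classifier weights.
rng = np.random.default_rng(0)
N, K, D = 50, 4, 8
X = rng.normal(size=(N, D))        # N local patch descriptors
weights = np.full(K, 1.0 / K)      # GMM mixing weights
means = rng.normal(size=(K, D))    # GMM means (visual words)
sigmas = np.ones((K, D))           # GMM standard deviations
w = rng.normal(size=K * D)         # linear classifier on the FV

phi_patches = per_patch_fv(X, weights, means, sigmas)
fv = phi_patches.mean(axis=0)         # unnormalized FV of the whole window
patch_scores = phi_patches @ w / N    # one score per patch
image_score = fv @ w                  # detector score on the pooled FV

# The window score equals the sum of the per-patch scores.
print(np.isclose(patch_scores.sum(), image_score))
```

Sorting `patch_scores` would show which patches contribute most to the decision, which is the kind of per-patch view a part-based interpretation relies on.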

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection