# Generalized orderless pooling performs implicit salient matching

**Authors:** Marcel Simon, Yang Gao, Trevor Darrell, Joachim Denzler, Erik Rodner

arXiv: 1705.00487 · 2017-07-21

## TL;DR

This paper introduces alpha-pooling, a learnable pooling method that generalizes average and bilinear pooling and improves fine-grained recognition accuracy. It also presents a visualization technique that makes the resulting model decisions interpretable.

## Contribution

The paper proposes alpha-pooling, a novel learnable pooling strategy, and a visualization method for understanding model decisions in fine-grained recognition tasks.
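To make the idea of a pooling strategy that interpolates between average and bilinear pooling concrete, here is a minimal NumPy sketch. This summary does not state the pooling formula, so the signed element-wise power form used below is an assumption; check it against the full paper. The function name `alpha_pool` and the toy feature shapes are hypothetical, and in a real model `alpha` would be a trainable parameter rather than a fixed float.

```python
import numpy as np


def alpha_pool(features: np.ndarray, alpha: float) -> np.ndarray:
    """Pool n local descriptors (rows of `features`, shape (n, d)) into a d x d matrix.

    Assumed form (not quoted from the paper):
        (1/n) * sum_i (sign(y_i) * |y_i|**(alpha - 1)) y_i^T
    The sketch expects alpha >= 1; smaller values would divide by zero
    wherever a feature is exactly 0.
    """
    n = features.shape[0]
    powered = np.sign(features) * np.abs(features) ** (alpha - 1.0)
    # (d, n) @ (n, d) sums the per-location outer products.
    return powered.T @ features / n


# Toy input: 49 spatial locations with 8 strictly positive channels
# (stand-in for post-ReLU CNN activations).
rng = np.random.default_rng(0)
feats = rng.uniform(0.1, 1.0, size=(49, 8))

avg_like = alpha_pool(feats, 1.0)   # every row equals the average-pooled vector
bilinear = alpha_pool(feats, 2.0)   # mean of outer products y_i y_i^T
```

Under this assumed form, `alpha = 1` reproduces average pooling (repeated across rows) for positive features, `alpha = 2` reproduces bilinear pooling, and values in between give intermediate encodings.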

## Key findings

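- In the authors' experiments, alpha-pooling can outperform both average and bilinear pooling across a variety of standard datasets.
- The visualization method identifies the parts of training images that have the highest influence on the prediction for a given test image.
- The higher-capacity VGG16 model focuses much more on the bird's head than the lower-capacity VGG-M model when recognizing fine-grained bird categories.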
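One way to see why comparing orderless pooled features can be read as implicit matching between image parts is the following identity for plain bilinear pooling: the inner product of two pooled matrices equals the sum of squared inner products over all pairs of local descriptors. The sketch below checks this numerically. It illustrates the general identity, not the paper's exact visualization procedure, and the function names and random toy features are hypothetical.

```python
import numpy as np


def bilinear_pool(features: np.ndarray) -> np.ndarray:
    """Un-normalized bilinear pooling: sum of outer products of the rows."""
    return features.T @ features


def pairwise_match_scores(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Squared inner product between every local descriptor in a and in b."""
    return (a @ b.T) ** 2  # shape (n_a, n_b)


rng = np.random.default_rng(0)
test_feats = rng.normal(size=(5, 4))    # 5 locations in a test image
train_feats = rng.normal(size=(7, 4))   # 7 locations in a training image

# Linear kernel between the two pooled representations.
kernel = np.sum(bilinear_pool(test_feats) * bilinear_pool(train_feats))

# The same value, decomposed into contributions of location pairs.
scores = pairwise_match_scores(test_feats, train_feats)

# Indices (test location, training location) of the strongest pair.
best_pair = np.unravel_index(scores.argmax(), scores.shape)
```

Because `kernel` equals `scores.sum()`, each entry of `scores` can be interpreted as the contribution of one pair of locations to the similarity of the two images, and the largest entries point to the most influential regions.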

## Abstract

Most recent CNN architectures use average pooling as a final feature encoding step. In the field of fine-grained recognition, however, recent global representations like bilinear pooling offer improved performance. In this paper, we generalize average and bilinear pooling to "alpha-pooling", allowing for learning the pooling strategy during training. In addition, we present a novel way to visualize decisions made by these approaches. We identify parts of training images having the highest influence on the prediction of a given test image. It allows for justifying decisions to users and also for analyzing the influence of semantic parts. For example, we can show that the higher capacity VGG16 model focuses much more on the bird's head than, e.g., the lower-capacity VGG-M model when recognizing fine-grained bird categories. Both contributions allow us to analyze the difference when moving between average and bilinear pooling. In addition, experiments show that our generalized approach can outperform both across a variety of standard datasets.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.00487/full.md

## Figures

34 figures with captions in the complete paper: https://tomesphere.com/paper/1705.00487/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/1705.00487/full.md

---
Source: https://tomesphere.com/paper/1705.00487