Improved Bilinear Pooling with CNNs

Tsung-Yu Lin; Subhransu Maji

arXiv:1707.06772·cs.CV·July 24, 2017·31 cites

TL;DR

This paper improves bilinear pooling of CNN features by applying matrix square-root normalization, which raises accuracy on fine-grained recognition tasks, and it proposes efficient methods for computing the gradients needed during training.

Contribution

The paper demonstrates that matrix square-root normalization significantly improves bilinear pooling performance, and it introduces faster yet accurate gradient computation techniques for network training.

Findings

01

Matrix square-root normalization improves recognition accuracy by 2-3% on a range of fine-grained recognition datasets.

02

A matrix square root computed approximately with Newton iterations is faster to obtain and equally effective.

03

Numerical inaccuracies in SVD gradients have negligible impact on final accuracy.
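
To make the second finding concrete, the sketch below approximates a matrix square root with the coupled Newton-Schulz iteration, one common Newton-type scheme. It is an illustration only: the function name `newton_schulz_sqrt`, the iteration count, and the choice of Frobenius-norm scaling are assumptions made here, and the paper's exact iteration may differ.

```python
import numpy as np


def newton_schulz_sqrt(A, num_iters=20):
    """Approximate the square root of a symmetric positive definite matrix.

    Hypothetical helper for illustration. It uses only matrix products,
    which is why iterations of this kind suit GPU implementations.
    """
    n = A.shape[0]
    identity = np.eye(n)
    # Scale A so its eigenvalues lie in (0, 1]; the iteration needs this
    # to converge. np.linalg.norm on a matrix defaults to the Frobenius norm.
    norm = np.linalg.norm(A)
    Y = A / norm          # converges to sqrt(A / norm)
    Z = identity.copy()   # converges to the inverse of sqrt(A / norm)
    for _ in range(num_iters):
        T = 0.5 * (3.0 * identity - Z @ Y)
        Y = Y @ T
        Z = T @ Z
    # Undo the initial scaling.
    return Y * np.sqrt(norm)


# Usage: build a small, well-conditioned covariance-like matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8))
A = X.T @ X / X.shape[0] + 0.1 * np.eye(8)
S = newton_schulz_sqrt(A)
residual = np.abs(S @ S - A).max()
```

Convergence slows as the smallest eigenvalue of the scaled matrix approaches zero, so a small multiple of the identity is often added to the input first; that regularization is also an assumption of this sketch.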

Abstract

Bilinear pooling of Convolutional Neural Network (CNN) features [22, 23], and their compact variants [10], have been shown to be effective at fine-grained recognition, scene categorization, texture recognition, and visual question-answering tasks among others. The resulting representation captures second-order statistics of convolutional features in a translationally invariant manner. In this paper we investigate various ways of normalizing these statistics to improve their representation power. In particular we find that the matrix square-root normalization offers significant improvements and outperforms alternative schemes such as the matrix logarithm normalization when combined with elementwise square-root and l2 normalization. This improves the accuracy by 2-3% on a range of fine-grained recognition datasets leading to a new state of the art. We also investigate how the accuracy of…
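
The normalization pipeline named in the abstract (matrix square root, then elementwise square root, then l2 normalization) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `improved_bilinear_descriptor`, the averaging over locations, the `eps` regularizer, the signed form of the elementwise square root, and the use of an eigendecomposition for the matrix square root are all choices made here.

```python
import numpy as np


def improved_bilinear_descriptor(features, eps=1e-5):
    """Hypothetical sketch: pooled second-order descriptor with normalization.

    features: array of shape (num_locations, channels), i.e. one CNN
    feature vector per spatial location.
    """
    num_locations, channels = features.shape
    # Bilinear pooling: average the outer products of the feature vectors.
    # Pooling over locations discards their order and position.
    A = features.T @ features / num_locations
    # Assumed regularizer to keep A strictly positive definite.
    A = A + eps * np.eye(channels)
    # Matrix square root via eigendecomposition of the symmetric matrix A.
    eigvals, eigvecs = np.linalg.eigh(A)
    eigvals = np.clip(eigvals, 0.0, None)
    sqrt_A = (eigvecs * np.sqrt(eigvals)) @ eigvecs.T
    # Elementwise signed square root, then l2 normalization.
    z = sqrt_A.reshape(-1)
    z = np.sign(z) * np.sqrt(np.abs(z))
    return z / (np.linalg.norm(z) + 1e-12)


# Usage: 49 spatial locations (e.g. a 7x7 grid) with 16 channels each.
rng = np.random.default_rng(0)
features = rng.standard_normal((49, 16))
descriptor = improved_bilinear_descriptor(features)
# Shuffling the locations leaves the descriptor unchanged (up to rounding).
shuffled = improved_bilinear_descriptor(features[rng.permutation(49)])
```

The descriptor has `channels * channels` entries, which is the reason compact variants of bilinear pooling exist.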

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning