Compact Bilinear Pooling
Yang Gao, Oscar Beijbom, Ning Zhang, Trevor Darrell

TL;DR
This paper introduces two compact bilinear pooling methods that retain the discriminative power of full bilinear features while significantly reducing dimensionality, enabling efficient end-to-end training for visual recognition tasks.
Contribution
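The paper derives two compact bilinear pooling representations from a kernelized analysis of bilinear pooling, cutting feature dimensionality from hundreds of thousands or more down to a few thousand while keeping the discriminative power of the full bilinear representation.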
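The kernelized view can be sketched as follows, assuming bilinear pooling is taken to be the sum of outer products of local descriptors. The symbols \mathcal{X}, x_s, \phi and C are illustrative notation, not quoted from this summary.

```latex
% Full bilinear pooling over a set of local descriptors x_s \in \mathbb{R}^c
B(\mathcal{X}) = \sum_{s} x_s x_s^{\top} \in \mathbb{R}^{c \times c}

% Comparing two images with a linear kernel on bilinear features
% reduces to a second-order polynomial kernel between descriptors
\langle B(\mathcal{X}), B(\mathcal{Y}) \rangle
  = \sum_{s} \sum_{u} \langle x_s x_s^{\top},\, y_u y_u^{\top} \rangle
  = \sum_{s} \sum_{u} \langle x_s, y_u \rangle^{2}

% Any low-dimensional map \phi : \mathbb{R}^c \to \mathbb{R}^d with
% \langle \phi(x), \phi(y) \rangle \approx \langle x, y \rangle^{2}
% then yields a compact pooled feature
C(\mathcal{X}) = \sum_{s} \phi(x_s), \qquad d \ll c^{2}
```

Under this reading, the design question becomes which approximation \phi of the polynomial kernel to use, which is why the analysis is described as a platform for further compact pooling methods.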
Findings
Achieves comparable performance to full bilinear features with only a few thousand dimensions.
Enables end-to-end training through back-propagation of classification errors.
Effective for image classification and few-shot learning across multiple datasets.
Abstract
Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition and face recognition. However, bilinear features are high dimensional, typically on the order of hundreds of thousands to a few million, which makes them impractical for subsequent analysis. We propose two compact bilinear representations with the same discriminative power as the full bilinear representation but with only a few thousand dimensions. Our compact representations allow back-propagation of classification errors, enabling an end-to-end optimization of the visual recognition system. The compact bilinear representations are derived through a novel kernelized analysis of bilinear pooling, which provides insights into the discriminative power of bilinear pooling and a platform for further research in compact pooling methods.…
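To make the dimensionality argument concrete, here is a minimal NumPy sketch that contrasts full bilinear pooling with a compact projection built using Tensor Sketch (count sketches combined by FFT-based circular convolution). Tensor Sketch is one standard way to approximate a second-order polynomial kernel; this summary does not name the paper's two methods, so treat the code as an illustration under that assumption rather than the authors' implementation. All function names and the sizes c and d are hypothetical.

```python
import numpy as np

def make_sketch_params(c, d, rng):
    """Draw two independent hash functions [c] -> [d] and two sign functions [c] -> {-1, +1}."""
    h = rng.integers(0, d, size=(2, c))
    s = rng.choice([-1.0, 1.0], size=(2, c))
    return h, s

def sketch_matrix(h, s, d):
    """Dense (c, d) count-sketch matrix: row i has a single entry s[i] in column h[i]."""
    c = h.shape[0]
    M = np.zeros((c, d))
    M[np.arange(c), h] = s
    return M

def full_bilinear(X):
    """Sum of outer products over the n local descriptors in X (n, c); output has c*c dims."""
    return (X.T @ X).ravel()

def compact_bilinear(X, M1, M2, d):
    """Tensor Sketch of each descriptor, sum-pooled over locations; output has d dims."""
    # A product in the frequency domain is a circular convolution of the two count sketches.
    F1 = np.fft.rfft(X @ M1, axis=1)
    F2 = np.fft.rfft(X @ M2, axis=1)
    phi = np.fft.irfft(F1 * F2, n=d, axis=1)
    return phi.sum(axis=0)

# Illustrative sizes only: 512 channels gives a 262,144-dimensional full bilinear feature.
rng = np.random.default_rng(0)
c, d = 512, 4096
h, s = make_sketch_params(c, d, rng)
M1, M2 = sketch_matrix(h[0], s[0], d), sketch_matrix(h[1], s[1], d)
X = rng.standard_normal((49, c))  # e.g. a 7x7 grid of local descriptors
print(full_bilinear(X).shape, compact_bilinear(X, M1, M2, d).shape)
```

Every step in `compact_bilinear` is a linear map, an FFT, or an element-wise product, so gradients can flow through it, which is consistent with the end-to-end training claim above. In practice the dense sketch matrices would be replaced by sparse scatter operations.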
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
