Learning Deep Bilinear Transformation for Fine-grained Image Representation
Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, and Jiebo Luo

TL;DR
This paper introduces a deep bilinear transformation (DBT) block that learns fine-grained image representations efficiently. It restricts pairwise channel interactions to within channel groups, which cuts computational cost, and it achieves state-of-the-art results on CUB-Bird, Stanford-Car, and FGVC-Aircraft.
Contribution
The paper proposes a DBT block that can be stacked deeply in CNNs for fine-grained recognition at reduced computational cost, outperforming previous methods.
Findings
Achieves state-of-the-art accuracy on the CUB-Bird, Stanford-Car, and FGVC-Aircraft datasets.
Reduces the computational cost of bilinear transformations via group-wise interactions.
Demonstrates effectiveness of deep bilinear blocks in CNN architectures.
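To make the cost reduction concrete, the short sketch below counts ordered channel pairs with and without grouping. The function name `pairwise_terms` and the sizes (512 channels, 16 groups) are illustrative assumptions, not figures from the paper. It only counts interaction terms and does not model the paper's full FLOP cost.

```python
def pairwise_terms(channels: int, groups: int = 1) -> int:
    """Count ordered channel pairs when interactions stay inside each group.

    With one group this is the full bilinear case (channels ** 2).
    With G equal groups it is G * (channels / G) ** 2 = channels ** 2 / G.
    """
    if channels % groups != 0:
        raise ValueError("channels must divide evenly into groups")
    per_group = channels // groups
    return groups * per_group * per_group


# Hypothetical sizes chosen only for illustration.
full = pairwise_terms(512)             # all-pairs interactions
grouped = pairwise_terms(512, groups=16)  # interactions restricted to groups
reduction = full // grouped            # equals the number of groups
```

Under these assumed sizes, grouping shrinks the number of pairwise terms by a factor equal to the number of groups.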
Abstract
Bilinear feature transformation has shown state-of-the-art performance in learning fine-grained image representations. However, the computational cost of learning pairwise interactions between deep feature channels is prohibitively expensive, which prevents this powerful transformation from being used in deep neural networks. In this paper, we propose a deep bilinear transformation (DBT) block, which can be deeply stacked in convolutional neural networks to learn fine-grained image representations. The DBT block uniformly divides input channels into several semantic groups. As the bilinear transformation can be represented by calculating pairwise interactions within each group, the computational cost is greatly reduced. The output of each block is then obtained by aggregating intra-group bilinear features, with residuals from the entire input features. We found that the proposed…
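The grouping, intra-group interaction, aggregation, and residual steps described in the abstract can be sketched for a single feature vector as follows. This is a minimal NumPy illustration, not the authors' implementation: the function name `dbt_like_block` is hypothetical, groups are formed by simple reshaping rather than learned semantic grouping, aggregation is assumed to be a sum over groups, and the residual add assumes the aggregated size (C/G)^2 equals the input size C.

```python
import numpy as np


def dbt_like_block(x: np.ndarray, groups: int) -> np.ndarray:
    """Group-wise bilinear sketch for one feature vector x of shape (C,).

    Assumptions (not from the paper): contiguous channel groups, sum
    aggregation over groups, and (C / groups) ** 2 == C so that the
    residual from the whole input can be added directly.
    """
    c = x.shape[0]
    if c % groups != 0:
        raise ValueError("channels must divide evenly into groups")
    m = c // groups

    # Split the channels into `groups` groups of m channels each.
    xg = x.reshape(groups, m)

    # Pairwise (outer-product) interactions within each group: (G, m, m).
    outer = np.einsum("gi,gj->gij", xg, xg)

    # Aggregate intra-group bilinear features across groups, then flatten.
    aggregated = outer.sum(axis=0).reshape(-1)

    if aggregated.shape != x.shape:
        raise ValueError("this sketch needs (C / groups) ** 2 == C for the residual")

    # Residual connection from the entire input feature.
    return aggregated + x


# Example with 16 channels in 4 groups, so (16 / 4) ** 2 == 16.
out = dbt_like_block(np.arange(16.0), groups=4)
```

In a real network the same operation would run at every spatial position of a feature map, and the grouping would be learned rather than fixed.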
Taxonomy
Topics: Advanced Neural Network Applications · Medical Imaging and Analysis · Domain Adaptation and Few-Shot Learning
