RecConv: Efficient Recursive Convolutions for Multi-Frequency Representations
Mingshu Zhao, Yi Luo, Yong Ouyang

TL;DR
RecConv is a recursive convolution strategy that expands the receptive field with minimal parameter growth and constant FLOPs, enabling more efficient vision models.
Contribution
The paper proposes RecConv, a recursive decomposition method for multi-frequency representations that keeps FLOPs constant while significantly enlarging the receptive field.
Findings
RecConv's parameter count grows linearly with the number of decomposition levels.
RecConv maintains constant FLOPs regardless of receptive field expansion.
RecNeXt-M3 outperforms comparable models with similar FLOPs on COCO.
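The first two findings can be illustrated with a back-of-the-envelope cost model. The sketch below is not the paper's accounting. It assumes one depthwise convolution with the base kernel per level, each level running at half the spatial resolution of the previous one, and it ignores any resampling or fusion layers. The function names and the counting convention (one multiply-accumulate per weight per output pixel) are illustrative choices. Under these assumptions the parameter count grows linearly with the number of levels and total FLOPs stay below a fixed multiple of the single-level cost, whereas a single depthwise kernel with the same nominal span grows by a factor of four per level.

```python
def large_kernel_cost(channels, height, width, base_kernel, levels):
    """Cost of ONE depthwise conv whose kernel side is base_kernel * 2**levels."""
    k = base_kernel * 2 ** levels
    params = channels * k * k
    flops = params * height * width  # one MAC per weight per output pixel
    return params, flops


def recursive_cost(channels, height, width, base_kernel, levels):
    """Cost of a hypothetical recursive stack: one depthwise base_kernel conv
    per level, with level i running at 1 / 2**i of the input resolution.
    Resampling and fusion layers are deliberately left out (assumption)."""
    params = 0
    flops = 0
    for i in range(levels + 1):
        level_params = channels * base_kernel * base_kernel
        params += level_params
        flops += level_params * (height // 2 ** i) * (width // 2 ** i)
    return params, flops


if __name__ == "__main__":
    C, H, W, K = 64, 56, 56, 5  # arbitrary example sizes
    for levels in range(4):
        print(levels,
              large_kernel_cost(C, H, W, K, levels),
              recursive_cost(C, H, W, K, levels))
```

In this model the recursive FLOPs form a geometric series with ratio 1/4, so they never exceed 4/3 of the single-level cost no matter how many levels are added.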
Abstract
Recent advances in vision transformers (ViTs) have demonstrated the advantage of global modeling capabilities, prompting widespread integration of large-kernel convolutions to enlarge the effective receptive field (ERF). However, the quadratic scaling of parameter count and computational complexity (FLOPs) with respect to kernel size poses significant efficiency and optimization challenges. This paper introduces RecConv, a recursive decomposition strategy that efficiently constructs multi-frequency representations using small-kernel convolutions. RecConv establishes a linear relationship between parameter growth and the number of decomposition levels, which together with the base kernel size determines the effective receptive field, while maintaining constant FLOPs regardless of ERF expansion. Specifically, RecConv achieves a parameter expansion of only…
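To make the link between decomposition levels and receptive field concrete, here is a toy one-dimensional sketch of a recursive decomposition. It is an illustration under loud assumptions, not the RecConv operator itself: downsampling is plain average pooling by 2, upsampling is nearest-neighbour repetition, branches are merged by addition, and all helper names are hypothetical. Each level owns its own small kernel, so parameters grow by one kernel per level, and the span of input that influences an output doubles with every added level.

```python
def conv1d_same(x, kernel):
    """Zero-padded 'same' 1D convolution with an odd-length kernel."""
    r = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - r
            if 0 <= idx < n:
                s += w * x[idx]
        out.append(s)
    return out


def downsample(x):
    """Average-pool by 2 (assumed; expects an even-length input)."""
    return [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]


def upsample(x):
    """Nearest-neighbour upsampling by 2 (assumed)."""
    return [v for v in x for _ in range(2)]


def rec_conv1d(x, kernels):
    """Apply kernels[0] at full resolution, recurse on the downsampled
    signal with the remaining kernels, then add the upsampled result."""
    fine = conv1d_same(x, kernels[0])
    if len(kernels) == 1:
        return fine
    coarse = rec_conv1d(downsample(x), kernels[1:])
    return [f + c for f, c in zip(fine, upsample(coarse))]


def impulse_support(length, kernel, levels):
    """Count outputs affected by a single centred impulse."""
    x = [0.0] * length
    x[length // 2] = 1.0
    y = rec_conv1d(x, [kernel] * (levels + 1))
    return sum(1 for v in y if v != 0.0)
```

Calling `impulse_support(64, [1.0, 1.0, 1.0], levels)` for increasing `levels` shows the affected span doubling each time while every convolution still uses a 3-tap kernel.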
Code & Models
- 🤗 suous/recnext_t.base_300e_in1k · model · 22 downloads
- 🤗 suous/recnext_t.dist_300e_in1k · model · 7 downloads
- 🤗 suous/recnext_s.base_300e_in1k · model · 10 downloads
- 🤗 suous/recnext_b.base_300e_in1k · model · 16 downloads
- 🤗 suous/recnext_s.dist_300e_in1k · model · 5 downloads
- 🤗 suous/recnext_b.dist_300e_in1k · model · 7 downloads
- 🤗 suous/recnext_a0.base_300e_in1k · model · 11 downloads
- 🤗 suous/recnext_a1.base_300e_in1k · model · 11 downloads
- 🤗 suous/recnext_a2.base_300e_in1k · model · 6 downloads
- 🤗 suous/recnext_a3.base_300e_in1k · model · 10 downloads
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Neural Networks and Applications
Methods: Balanced Selection
