Efficiency Follows Global-Local Decoupling
Zhenyu Yang, Gensheng Pei, Tao Chen, Yichao Zhou, Tianfei Zhou, Yazhou Yao, Fumin Shen

TL;DR
This paper proposes ConvNeur, a two-branch architecture that decouples global reasoning from local feature extraction. A lightweight neural memory branch aggregates global context over a compact set of tokens, while a locality-preserving branch retains fine detail, which reduces overhead relative to fully global attention.
Contribution
ConvNeur, a global-local decoupled architecture: a neural memory branch handles global context, a locality-preserving branch extracts fine structure, and a learned gate lets global cues modulate local features without entangling their objectives.
Findings
On standard classification, detection, and segmentation benchmarks, ConvNeur matches or surpasses comparable alternatives.
Its cost scales subquadratically with image size, with less overhead than fully global attention.
ConvNeur offers favorable accuracy-latency trade-offs.
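As a rough illustration of the subquadratic-scaling finding, the sketch below counts pairwise token interactions for full self-attention and for a scheme in which N image tokens exchange information with a fixed set of M memory tokens. The function names, the 16-pixel patch size, and M = 64 are illustrative assumptions, not values taken from the paper, and the count ignores the local branch and all constant factors.

```python
def num_tokens(image_size: int, patch_size: int = 16) -> int:
    """Number of patch tokens for a square image (assumed patch tokenization)."""
    side = image_size // patch_size
    return side * side


def full_attention_pairs(n: int) -> int:
    """Every token attends to every token: quadratic in n."""
    return n * n


def compact_memory_pairs(n: int, m: int = 64) -> int:
    """Tokens write to and read from m memory tokens: linear in n for fixed m.

    The write-then-read pattern is an assumption made for this sketch.
    """
    return 2 * n * m


if __name__ == "__main__":
    for size in (224, 448, 896):
        n = num_tokens(size)
        print(size, n, full_attention_pairs(n), compact_memory_pairs(n))
```

Under these assumptions, a 224-pixel input gives 196 tokens, 38,416 pairs for full attention, and 25,088 for the memory scheme. Doubling the image side quadruples the token count, so the full-attention count grows 16-fold while the memory count grows only 4-fold.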
Abstract
Modern vision models must capture image-level context without sacrificing local detail while remaining computationally affordable. We revisit this tradeoff and advance a simple principle: decouple the roles of global reasoning and local representation. To operationalize this principle, we introduce ConvNeur, a two-branch architecture in which a lightweight neural memory branch aggregates global context on a compact set of tokens, and a locality-preserving branch extracts fine structure. A learned gate lets global cues modulate local features without entangling their objectives. This separation yields subquadratic scaling with image size, retains inductive priors associated with local processing, and reduces overhead relative to fully global attention. On standard classification, detection, and segmentation benchmarks, ConvNeur matches or surpasses comparable alternatives at similar or…
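The abstract does not specify the form of the learned gate. As a minimal sketch of one common choice, the code below applies a channel-wise sigmoid gate, computed from a pooled global context vector, to the local features at every spatial position. The function `gated_modulation`, its parameters, and the sigmoid form are all assumptions made for illustration and may differ from ConvNeur's actual design.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_modulation(
    local_feats: List[List[float]],  # [positions][channels], from the local branch
    global_ctx: List[float],         # [channels], pooled from the global branch
    gate_weight: List[float],        # [channels], hypothetical learned scale
    gate_bias: List[float],          # [channels], hypothetical learned bias
) -> List[List[float]]:
    """Scale each local channel by a gate derived from the global context."""
    # One gate value per channel, shared across all spatial positions.
    gates = [
        sigmoid(w * g + b)
        for w, g, b in zip(gate_weight, global_ctx, gate_bias)
    ]
    # Global information enters only through this multiplicative gate.
    return [[f * gt for f, gt in zip(pos, gates)] for pos in local_feats]
```

With zero weights and biases, every gate equals 0.5 and the local features are simply halved. In this sketch the global context influences the output only through the gate, which is one way to keep the two branches' roles separate.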
Taxonomy
Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
