OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels

Meng Lou; Yizhou Yu

arXiv:2502.20087 · cs.CV · October 13, 2025

TL;DR

OverLoCK is a ConvNet backbone inspired by top-down attention in human vision. A branched design first forms a coarse overview of the image, then uses that overview to guide finer-grained perception through context-mixing dynamic convolutions. The authors report higher accuracy than comparable backbones such as ConvNeXt-B, MogaNet-B, and UniRepLKNet-T.

Contribution

The authors present what they describe as the first pure ConvNet backbone that explicitly incorporates a top-down attention mechanism, pairing the overview-first, look-closely-next principle with a new context-mixing dynamic convolution.

Findings

1. OverLoCK-T achieves 84.2% Top-1 accuracy, surpassing ConvNeXt-B.

2. OverLoCK-S outperforms MogaNet-B by 1% in box AP (AP^b) on object detection.

3. OverLoCK-T improves on UniRepLKNet-T by 1.7% in mIoU on semantic segmentation.

Abstract

Top-down attention plays a crucial role in the human vision system, wherein the brain initially obtains a rough overview of a scene to discover salient cues (i.e., overview first), followed by a more careful finer-grained examination (i.e., look closely next). However, modern ConvNets remain confined to a pyramid structure that successively downsamples the feature map for receptive field expansion, neglecting this crucial biomimetic principle. We present OverLoCK, the first pure ConvNet backbone architecture that explicitly incorporates a top-down attention mechanism. Unlike pyramid backbone networks, our design features a branched architecture with three synergistic sub-networks: 1) a Base-Net that encodes low/mid-level features; 2) a lightweight Overview-Net that generates dynamic top-down attention through coarse global context modeling (i.e., overview first); and 3) a robust…
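The overview-first, look-closely-next idea can be illustrated with a toy example. The sketch below is not the authors' implementation: it replaces the learned Overview-Net with plain average pooling plus a sigmoid, and it does not model the context-mixing dynamic kernels at all. The function names (`avg_pool`, `upsample_nearest`, `top_down_gate`) are hypothetical and exist only for this illustration.

```python
import math


def avg_pool(feat, k):
    """Downsample a 2D map by averaging non-overlapping k x k blocks.

    Assumes the height and width of `feat` are divisible by k.
    """
    h, w = len(feat), len(feat[0])
    return [
        [
            sum(feat[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def upsample_nearest(feat, k):
    """Upsample a 2D map by repeating each value over a k x k block."""
    h, w = len(feat), len(feat[0])
    return [[feat[i // k][j // k] for j in range(w * k)] for i in range(h * k)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def top_down_gate(base, k=2):
    """Gate full-resolution features with a coarse overview of themselves.

    Overview first: pool `base` to a coarse map and squash it into (0, 1).
    Look closely next: upsample those weights and rescale `base` with them.
    """
    overview = avg_pool(base, k)
    attn = [[sigmoid(v) for v in row] for row in overview]
    attn_full = upsample_nearest(attn, k)
    return [
        [b * a for b, a in zip(base_row, attn_row)]
        for base_row, attn_row in zip(base, attn_full)
    ]


# A 4 x 4 single-channel feature map made of four constant 2 x 2 regions.
base = [
    [0.0, 0.0, 4.0, 4.0],
    [0.0, 0.0, 4.0, 4.0],
    [1.0, 1.0, -2.0, -2.0],
    [1.0, 1.0, -2.0, -2.0],
]
gated = top_down_gate(base, k=2)
```

Regions whose coarse average is large keep most of their magnitude, while regions with a low average are suppressed. In a real backbone the gating signal would come from learned layers rather than a fixed pooling rule.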

Code & Models

Repositories

lmmmeng/overlock (PyTorch, official)

Taxonomy

Methods: Attention Is All You Need · Region Proposal Network · RoIAlign · Softmax · Mask R-CNN · Cascade Mask R-CNN · Convolution