Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer
Md Ashiqur Rahman, Chiao-An Yang, Michael N. Cheng, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh

TL;DR
This paper introduces a deep equilibrium canonicalizer (DEC) that improves a model's local scale equivariance, raising both performance and local scale consistency on ImageNet across four pre-trained architectures.
Contribution
The paper proposes DEC, a novel method to improve local scale equivariance, which can be integrated into existing models and pre-trained networks.
Findings
DEC improves model performance on ImageNet.
DEC enhances local scale consistency across different architectures.
DEC can be adapted to pre-trained models, including ViT, DeiT, Swin, and BEiT.
Abstract
Scale variation is a fundamental challenge in computer vision. Objects of the same class can have different sizes, and their perceived size is further affected by their distance from the camera. These variations are local to the objects, i.e., the sizes of different objects may change differently within the same image. To handle scale variations effectively, we present a deep equilibrium canonicalizer (DEC) to improve the local scale equivariance of a model. DEC can be easily incorporated into existing network architectures and can be adapted to a pre-trained model. Notably, we show that on the competitive ImageNet benchmark, DEC improves both model performance and local scale consistency across four popular pre-trained deep-nets: ViT, DeiT, Swin, and BEiT. Our code is available at https://github.com/ashiq24/local-scale-equivariance.
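The abstract does not spell out how DEC works internally. As background only, "deep equilibrium" conventionally refers to models whose output is a fixed point z* = f(z*, x) of a layer f, found by iterating to convergence. The following is a minimal sketch of that general idea, not the authors' implementation: the scalar `layer`, the weight `w`, and the helper `fixed_point` are hypothetical names chosen for illustration.

```python
import math


def fixed_point(f, z0, tol=1e-8, max_iter=500):
    """Iterate z <- f(z) until successive iterates differ by less than tol."""
    z = z0
    for _ in range(max_iter):
        z_next = f(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def layer(z, x, w=0.5):
    # Hypothetical scalar "layer" (an assumption for illustration).
    # Because |w| < 1 and tanh is 1-Lipschitz, this map is a contraction in z,
    # so the iteration converges to a unique fixed point.
    return math.tanh(w * z + x)


x = 0.3
z_star = fixed_point(lambda z: layer(z, x), z0=0.0)

# At equilibrium, applying the layer again leaves z_star (almost) unchanged.
residual = abs(z_star - layer(z_star, x))
print(f"z* = {z_star:.6f}, residual = {residual:.2e}")
```

In a real network, z and x would be feature tensors and f a learned layer. How DEC uses such an equilibrium to canonicalize local scale is described in the paper and the linked repository, not in this sketch.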
Taxonomy
Topics: Image Processing and 3D Reconstruction · Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis
