Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors

Yuanyi Zhong; Anand Bhattad; Yu-Xiong Wang; David Forsyth

arXiv:2309.16646·cs.CV·October 18, 2023

TL;DR

This paper identifies that current state-of-the-art depth and normal predictors lack cropping-and-resizing equivariance and proposes a regularization method to explicitly enforce this property, improving their accuracy and robustness.

Contribution

The authors introduce an equivariant regularization technique that enhances cropping-and-resizing equivariance in depth and normal predictors across CNN and Transformer models.

Findings

01. Improved equivariance in depth and normal predictions.

02. Enhanced accuracy on Taskonomy and NYU-v2 datasets.

03. Applicable to both supervised and semi-supervised learning.

Abstract

Dense depth and surface normal predictors should possess the equivariant property to cropping-and-resizing -- cropping the input image should result in cropping the same output image. However, we find that state-of-the-art depth and normal predictors, despite having strong performances, surprisingly do not respect equivariance. The problem exists even when crop-and-resize data augmentation is employed during training. To remedy this, we propose an equivariant regularization technique, consisting of an averaging procedure and a self-consistency loss, to explicitly promote cropping-and-resizing equivariance in depth and normal networks. Our approach can be applied to both CNN and Transformer architectures, does not incur extra cost during testing, and notably improves the supervised and semi-supervised learning performance of dense predictors on Taskonomy tasks. Finally, finetuning with…
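The equivariance property described in the abstract can be illustrated with a small numerical check. The sketch below is not the authors' implementation: the function names, the nearest-neighbour resize, and the two toy "predictors" are all assumptions made for illustration. It measures the gap between predicting on a cropped-and-resized input and cropping-and-resizing the prediction made on the full input, which is the kind of discrepancy a self-consistency term would aim to reduce.

```python
import numpy as np


def crop_and_resize(img, top, left, size, out_size):
    """Crop a square window and resize it with nearest-neighbour sampling.

    Hypothetical helper: nearest-neighbour is chosen only to keep the
    example exact; a real pipeline would likely use bilinear resizing.
    """
    crop = img[top:top + size, left:left + size]
    # Map every output pixel back to a source pixel inside the crop.
    idx = np.arange(out_size) * size // out_size
    return crop[np.ix_(idx, idx)]


def equivariance_error(predict, img, top, left, size):
    """Mean absolute gap between predict(crop(x)) and crop(predict(x))."""
    out_size = img.shape[0]
    pred_of_crop = predict(crop_and_resize(img, top, left, size, out_size))
    crop_of_pred = crop_and_resize(predict(img), top, left, size, out_size)
    return float(np.mean(np.abs(pred_of_crop - crop_of_pred)))


def pointwise_predictor(img):
    """Toy predictor that looks at each pixel in isolation."""
    return 2.0 * img


def context_predictor(img):
    """Toy predictor whose output depends on the whole image (its mean)."""
    return img - img.mean()


# Deterministic 8x8 ramp image so the result is reproducible.
image = np.arange(64, dtype=float).reshape(8, 8)

err_pointwise = equivariance_error(pointwise_predictor, image, 0, 0, 4)
err_context = equivariance_error(context_predictor, image, 0, 0, 4)

print(err_pointwise)  # 0.0: the pointwise toy model commutes with the crop
print(err_context)    # 18.0: the global-context toy model does not
```

The toy predictor that uses global image statistics changes its output when the field of view changes, so its error is nonzero. This is a simplified analogue of the behaviour the abstract reports for real depth and normal networks, not a reproduction of it.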

Code & Models

Repositories

mikuhatsune/equivariance (PyTorch, official)

Videos

Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors · YouTube

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Human Pose and Action Recognition

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Absolute Position Encodings · Dense Connections · Layer Normalization · Byte Pair Encoding