NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation
Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan

TL;DR
This paper introduces neural window fully-connected CRFs (FC-CRFs) for monocular depth estimation. FC-CRF optimization runs inside local windows, with multi-head attention computing the potential function, which keeps the computation tractable while improving accuracy.
Contribution
The paper combines a neural window FC-CRFs module, whose potentials are computed with multi-head attention, with a transformer-based encoder-decoder structure for depth estimation.
Findings
Significant performance improvements on KITTI and NYUv2 datasets.
Effective application to panorama images, outperforming previous methods.
Restricting the CRF optimization to windows reduces computational complexity enough to make fully-connected CRFs feasible.
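The complexity reduction in the last finding can be illustrated with a small counting exercise. The sketch below is not from the paper: the function names and the 8 x 8 grid with 4 x 4 windows are hypothetical values chosen for illustration. It counts how many pairwise connections a fully-connected graph needs over the whole grid versus inside non-overlapping windows.

```python
def pairwise_edges_full(height, width):
    """Number of node pairs when every pixel connects to every other pixel."""
    n = height * width
    return n * (n - 1) // 2


def pairwise_edges_windowed(height, width, win):
    """Number of node pairs when pixels connect only within win x win windows.

    Assumes non-overlapping windows that tile the grid exactly
    (an illustrative simplification).
    """
    assert height % win == 0 and width % win == 0, "grid must tile evenly"
    num_windows = (height // win) * (width // win)
    per_window = win * win
    return num_windows * (per_window * (per_window - 1) // 2)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to keep the numbers readable.
    print(pairwise_edges_full(8, 8))         # 2016
    print(pairwise_edges_windowed(8, 8, 4))  # 480
```

With a fixed window size, the windowed count grows linearly with the number of pixels, while the full count grows quadratically.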
Abstract
Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed. While recent works design increasingly complicated and powerful networks to directly regress the depth map, we take the path of CRFs optimization. Due to the expensive computation, CRFs are usually performed between neighborhoods rather than the whole graph. To leverage the potential of fully-connected CRFs, we split the input into windows and perform the FC-CRFs optimization within each window, which reduces the computation complexity and makes FC-CRFs feasible. To better capture the relationships between nodes in the graph, we exploit the multi-head attention mechanism to compute a multi-head potential function, which is fed to the networks to output an optimized depth map. Then we build a bottom-up-top-down structure, where this neural window FC-CRFs module serves as the…
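The window-splitting and multi-head attention steps described in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, there are no learned projections or position terms, each node's feature vector serves as query, key, and value, and windows are non-overlapping.

```python
import math


def partition_windows(grid, win):
    """Split an H x W grid of feature vectors into non-overlapping windows.

    Each window is returned as a flat list of the feature vectors it contains.
    """
    h, w = len(grid), len(grid[0])
    assert h % win == 0 and w % win == 0, "grid must tile evenly"
    windows = []
    for top in range(0, h, win):
        for left in range(0, w, win):
            windows.append([grid[r][c]
                            for r in range(top, top + win)
                            for c in range(left, left + win)])
    return windows


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_window_attention(nodes, num_heads):
    """Attend between all nodes of ONE window, separately per head.

    Assumption: channels are split evenly across heads, and the raw
    features act as query, key, and value (no learned weights).
    """
    dim = len(nodes[0])
    assert dim % num_heads == 0, "channels must divide evenly across heads"
    head_dim = dim // num_heads
    out = [[] for _ in nodes]
    for head in range(num_heads):
        lo, hi = head * head_dim, (head + 1) * head_dim
        slices = [node[lo:hi] for node in nodes]
        for i, query in enumerate(slices):
            # Pairwise scores between this node and every node in the window.
            scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(head_dim)
                      for key in slices]
            weights = softmax(scores)
            # Weighted sum of the other nodes' features (the "message").
            mixed = [sum(wt * val[d] for wt, val in zip(weights, slices))
                     for d in range(head_dim)]
            out[i].extend(mixed)
    return out


if __name__ == "__main__":
    # Hypothetical 4 x 4 grid with 2-channel features, 2 x 2 windows, 2 heads.
    grid = [[[float(r), float(c)] for c in range(4)] for r in range(4)]
    for window in partition_windows(grid, 2):
        print(multi_head_window_attention(window, num_heads=2))
```

Because attention is computed only among the nodes of a single window, the cost per window depends on the window size rather than on the full image size.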
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Residual Connection · Layer Normalization · Dense Connections · Vision Transformer
