Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation

Yijun Cao, Fuya Luo, and Yongjie Li

arXiv:2506.04758·cs.CV·June 6, 2025

TL;DR

This paper introduces a new form of SSIM loss for unsupervised monocular depth estimation. It combines the luminance, contrast, and structure components by addition instead of multiplication, which yields smoother gradients and better depth estimation performance.

Contribution

It proposes a novel SSIM formulation that enhances training stability and accuracy in unsupervised depth learning, with extensive parameter optimization and empirical validation.

Findings

01

The new SSIM form outperforms the conventional form on unsupervised depth estimation

02

Optimized parameters significantly improve KITTI-2015 results

03

The additive form produces smoother gradients during training

Abstract

Unsupervised monocular depth learning generally relies on the photometric relation among temporally adjacent images. Most previous works use both the mean absolute error (MAE) and the structural similarity index measure (SSIM) in its conventional form as the training loss. However, they ignore the effect that the different components of the SSIM function, and the corresponding hyperparameters, have on training. To address these issues, this work proposes a new form of SSIM. Compared with the original SSIM function, the proposed form uses addition rather than multiplication to combine the luminance, contrast, and structural similarity components of SSIM. The loss function constructed with this scheme yields smoother gradients and achieves higher performance on unsupervised depth estimation. We conduct extensive experiments to determine the relatively optimal combination of parameters for our…
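The difference between the two ways of combining the SSIM components can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the exact additive formula or its parameters, so the equal weights in `ssim_add`, the use of global (not windowed) image statistics, the stabilizing constants, and the `alpha = 0.85` mix in `photometric_loss` are all assumptions chosen for illustration.

```python
import numpy as np

# Stabilizing constants for images scaled to [0, 1] (conventional choices,
# assumed here; the paper tunes its own hyperparameters).
C1 = 0.01 ** 2
C2 = 0.03 ** 2
C3 = C2 / 2


def ssim_components(x, y):
    """Luminance, contrast, and structure terms from global image statistics."""
    mu_x, mu_y = x.mean(), y.mean()
    sig_x, sig_y = x.std(), y.std()
    sig_xy = ((x - mu_x) * (y - mu_y)).mean()
    lum = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    con = (2 * sig_x * sig_y + C2) / (sig_x ** 2 + sig_y ** 2 + C2)
    struct = (sig_xy + C3) / (sig_x * sig_y + C3)
    return lum, con, struct


def ssim_mul(x, y):
    """Conventional form: the three components are multiplied."""
    lum, con, struct = ssim_components(x, y)
    return lum * con * struct


def ssim_add(x, y, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Additive form: a weighted sum of the components.

    The equal weights are a hypothetical placeholder, not the paper's values.
    """
    lum, con, struct = ssim_components(x, y)
    return weights[0] * lum + weights[1] * con + weights[2] * struct


def photometric_loss(x, y, ssim_fn=ssim_add, alpha=0.85):
    """Mix of an SSIM term and MAE; alpha = 0.85 is an assumed default."""
    mae = np.abs(x - y).mean()
    return alpha * (1.0 - ssim_fn(x, y)) / 2.0 + (1.0 - alpha) * mae


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((32, 32))
    warped = np.clip(target + 0.1 * rng.standard_normal((32, 32)), 0.0, 1.0)
    print("multiplicative SSIM:", ssim_mul(target, warped))
    print("additive SSIM:      ", ssim_add(target, warped))
    print("loss (additive):    ", photometric_loss(target, warped))
```

Both forms equal 1 for identical images, so either can be plugged into the same photometric loss. In the product, the gradient of each component is scaled by the other two components, whereas in the sum each component contributes its gradient independently; this is one plausible reading of why the additive form would give smoother gradients.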
