LiftFormer: Lifting and Frame Theory Based Monocular Depth Estimation Using Depth and Edge Oriented Subspace Representation

Shuai Li; Huibin Bai; Yanbo Gao; Chong Lv; Hui Yuan; Chuankun Li; Wei Hua; Tian Xie

arXiv:2604.06576 · cs.CV · April 9, 2026

TL;DR

LiftFormer is a subspace-based approach to monocular depth estimation. It uses lifting theory to build intermediate subspaces that improve depth prediction, especially around edges, and reports state-of-the-art results.

Contribution

The paper proposes LiftFormer, a model that uses lifting theory to construct depth-oriented and edge-oriented subspaces, improving geometric depth prediction from monocular images.

Findings

01 Achieves state-of-the-art performance on standard datasets.

02 Validates the effectiveness of the depth and edge subspaces through an ablation study.

03 Transforms image features into a robust geometric subspace for depth estimation.

Abstract

Monocular depth estimation (MDE) has attracted increasing interest in the past few years, owing to its important role in 3D vision. MDE estimates a depth map from a monocular image or video to represent the 3D structure of a scene, which is a highly ill-posed problem. To address this problem, we propose LiftFormer, based on lifting theory topology, which constructs an intermediate subspace that bridges the image color features and depth values, and a subspace that enhances the depth prediction around edges. MDE is formulated by transforming the depth value prediction problem into a depth-oriented geometric representation (DGR) subspace feature representation, thus bridging the learning from color values to geometric depth values. A DGR subspace is constructed based on frame theory by using linearly dependent vectors in accordance with depth bins to provide a…
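
The abstract's mention of frame theory, "linearly dependent vectors in accordance with depth bins", can be illustrated with a small toy example. The sketch below is not the paper's architecture. It only shows what a frame is: a redundant, linearly dependent set of vectors that still represents every feature exactly. It then pairs each frame vector with a depth bin, which is an assumption made here for illustration. The function names, the 2-D feature, the three-vector frame, the bin centers, and the softmax readout are all hypothetical.

```python
import math


def mercedes_benz_frame():
    """Return three unit vectors in R^2 spaced 120 degrees apart.

    Three vectors in a 2-D space are necessarily linearly dependent,
    and this particular set is a tight frame with frame bound 3/2.
    """
    angles = [math.pi / 2 + k * 2 * math.pi / 3 for k in range(3)]
    return [(math.cos(a), math.sin(a)) for a in angles]


def analyze(feature, frame):
    """Frame analysis: inner product of the feature with every frame vector."""
    return [sum(f * v for f, v in zip(feature, vec)) for vec in frame]


def synthesize(coeffs, frame, frame_bound):
    """Tight-frame synthesis: (1 / A) * sum_k c_k * v_k recovers the feature."""
    dim = len(frame[0])
    return [
        sum(c * vec[i] for c, vec in zip(coeffs, frame)) / frame_bound
        for i in range(dim)
    ]


def expected_depth(coeffs, bin_centers):
    """ASSUMPTION (not from the paper): one frame vector per depth bin.

    A softmax over the coefficients gives per-bin weights, and the
    depth estimate is the weighted mean of the bin centers.
    """
    peak = max(coeffs)  # subtract the max for numerical stability
    weights = [math.exp(c - peak) for c in coeffs]
    total = sum(weights)
    return sum(w / total * d for w, d in zip(weights, bin_centers))


frame = mercedes_benz_frame()
feature = (0.3, -1.2)            # hypothetical 2-D pixel feature
bin_centers = [1.0, 5.0, 9.0]    # hypothetical depth bin centers, in meters

coeffs = analyze(feature, frame)
recovered = synthesize(coeffs, frame, frame_bound=1.5)
depth = expected_depth(coeffs, bin_centers)
```

Here `recovered` matches `feature` up to floating-point error even though the three frame vectors sum to zero, which is the redundancy that distinguishes a frame from a basis. How LiftFormer actually builds and learns its DGR subspace is not recoverable from the truncated abstract.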
