EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity
Zijie Jiang, Masatoshi Okutomi

TL;DR
This paper introduces EMR-MSF, a self-supervised monocular scene flow model that leverages ego-motion rigidity and geometric constraints to significantly improve accuracy, rivaling supervised methods.
Contribution
The paper proposes a novel ego-motion aggregation module with a rigidity soft mask and new loss functions, enhancing self-supervised scene flow estimation from monocular images.
Findings
Outperforms previous self-supervised methods by 44% on the KITTI benchmark
Achieves performance comparable to supervised approaches
Improves depth and visual odometry tasks significantly
Abstract
Self-supervised monocular scene flow estimation, aiming to understand both 3D structures and 3D motions from two temporally consecutive monocular images, has received increasing attention for its simple and economical sensor setup. However, the accuracy of current methods suffers from the bottleneck of less-efficient network architecture and lack of motion rigidity for regularization. In this paper, we propose a superior model named EMR-MSF by borrowing the advantages of network architecture design under the scope of supervised learning. We further impose explicit and robust geometric constraints with an elaborately constructed ego-motion aggregation module where a rigidity soft mask is proposed to filter out dynamic regions for stable ego-motion estimation using static regions. Moreover, we propose a motion consistency loss along with a mask regularization loss to fully exploit static…
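The idea of estimating ego-motion from static regions only can be illustrated with a small sketch. This is not the paper's learned ego-motion aggregation module: it swaps in a classical weighted Kabsch (weighted rigid alignment) fit, and the function name `weighted_rigid_fit`, the synthetic points, and the hand-set mask values are all assumptions made for illustration. In the paper the mask is produced by the model; here it is simply set to 0 on points known to be dynamic and 1 elsewhere.

```python
import numpy as np


def weighted_rigid_fit(points_src, points_dst, weights):
    """Weighted Kabsch fit (illustrative stand-in, not the paper's module).

    Finds R, t minimizing sum_i w_i * ||R @ p_i + t - q_i||^2.
    """
    w = weights / weights.sum()
    mu_src = (w[:, None] * points_src).sum(axis=0)
    mu_dst = (w[:, None] * points_dst).sum(axis=0)
    src_c = points_src - mu_src
    dst_c = points_dst - mu_dst
    # 3x3 weighted cross-covariance between centered point sets.
    H = (w[:, None] * src_c).T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_dst - R @ mu_src
    return R, t


# Synthetic scene: 200 3D points seen from two camera poses (assumed data).
rng = np.random.default_rng(0)
points = rng.uniform(-5.0, 5.0, size=(200, 3))

angle = 0.05  # small rotation about the y axis, in radians
R_true = np.array([
    [np.cos(angle), 0.0, np.sin(angle)],
    [0.0, 1.0, 0.0],
    [-np.sin(angle), 0.0, np.cos(angle)],
])
t_true = np.array([0.1, 0.0, 1.0])

# Static points move rigidly; the first 40 also have independent motion.
moved = points @ R_true.T + t_true
moved[:40] += np.array([2.0, 0.0, 0.5])

# Hand-set mask in [0, 1]: 0 on dynamic points, 1 on static points.
soft_mask = np.ones(200)
soft_mask[:40] = 0.0

R_masked, t_masked = weighted_rigid_fit(points, moved, soft_mask)
R_naive, t_naive = weighted_rigid_fit(points, moved, np.ones(200))
```

With the dynamic points down-weighted, the fit recovers the rigid motion of the static scene, whereas uniform weights let the independently moving points bias the estimate. A predicted mask would typically take intermediate values rather than exactly 0 and 1.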
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Robotics and Sensor-Based Localization
