StairNetV3: Depth-aware Stair Modeling using Deep Learning
Chen Wang, Zhongcai Pei, Shuang Qiu, Yachun Wang, Zhiyong Tang

TL;DR
This paper introduces a depth-aware monocular vision method for accurate stair modeling in autonomous robots, combining CNN-based geometric feature extraction with depth information for improved reconstruction and real-time performance.
Contribution
The paper proposes a joint CNN framework that learns stair geometric features and depth prediction together, improving the accuracy and speed of monocular stair modeling.
Findings
Achieved a 3.4% increase in IoU (intersection over union) over previous methods.
Developed a lightweight model suitable for real-time applications.
Provided a new dataset for stair perception research.
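For readers unfamiliar with the IoU metric cited in the findings, the following is a minimal sketch of how intersection over union is commonly computed for two binary masks. The function name `mask_iou` and the nested-list mask format are illustrative assumptions; this summary does not describe the paper's actual evaluation code.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0  # pixel set in both masks
            union += 1 if (p or t) else 0          # pixel set in either mask

    # Two empty masks agree perfectly; define IoU as 1.0 in that case.
    return 1.0 if union == 0 else intersection / union


# Toy example: 2 pixels overlap, 4 pixels are covered by either mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5
```

Whether the reported 3.4% gain is an absolute or a relative improvement is not stated in this summary.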
Abstract
Vision-based stair perception can help autonomous mobile robots deal with the challenge of climbing stairs, especially in unfamiliar environments. Current monocular vision methods struggle to model stairs accurately without depth information; to address this, this paper proposes a depth-aware stair modeling method for monocular vision. Specifically, we treat the extraction of stair geometric features and the prediction of depth images as joint tasks in a convolutional neural network (CNN). With the designed information propagation architecture, depth information provides effective supervision for stair geometric feature learning. In addition, to complete the stair modeling, we take the convex lines, concave lines, tread surfaces and riser surfaces as stair geometric features and apply Gaussian kernels to enable the network to predict contextual information within…
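The abstract is truncated before it explains how the Gaussian kernels are applied. As a rough illustration of the general idea, the sketch below renders a line segment (such as a convex or concave stair edge) as a soft heatmap whose value decays with distance from the line. The function names, the per-pixel distance formulation, and the `sigma` parameter are all assumptions made for this example, not the paper's formulation.

```python
import math


def point_segment_distance(px, py, x0, y0, x1, y1):
    """Euclidean distance from point (px, py) to the segment (x0, y0)-(x1, y1)."""
    dx, dy = x1 - x0, y1 - y0
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - x0, py - y0)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - x0) * dx + (py - y0) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))


def gaussian_line_heatmap(height, width, start, end, sigma=1.0):
    """Soft label map: 1.0 on the segment, decaying as exp(-d^2 / (2 sigma^2)).

    Hypothetical helper; `start` and `end` are (x, y) pixel coordinates.
    """
    (x0, y0), (x1, y1) = start, end
    return [
        [
            math.exp(
                -point_segment_distance(x, y, x0, y0, x1, y1) ** 2
                / (2 * sigma ** 2)
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


# A horizontal edge on row 2 of a 5x7 grid.
heatmap = gaussian_line_heatmap(5, 7, (1, 2), (5, 2), sigma=1.0)
print([round(row[3], 3) for row in heatmap])
```

A soft target of this kind lets pixels near an edge carry partial signal instead of a hard 0/1 label, which is one common motivation for Gaussian-smoothed labels in line and keypoint detection.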
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Prosthetics and Rehabilitation Robotics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
