SANPO: A Scene Understanding, Accessibility and Human Navigation Dataset
Sagar M. Waghmare, Kimberly Wilber, Dave Hawkey, Xuan Yang, Matthew Wilson, Stephanie Debats, Cattalyya Nuengsigkapian, Astuti Sharma, Lars Pandikow, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko

TL;DR
SANPO is a comprehensive outdoor egocentric video dataset with dense annotations, designed to advance assistive navigation technologies for visually impaired individuals by providing real and synthetic data for training and evaluation.
Contribution
We introduce SANPO, a large-scale, annotated egocentric video dataset for outdoor human navigation, filling a critical gap in datasets for assistive vision technologies.
Findings
SANPO contains 701 real-world stereo videos with dense panoptic segmentation.
The dataset includes 1961 synthetic videos with high-quality annotations.
SANPO is already aiding mobile models for assistive navigation applications.
Abstract
Vision is essential for human navigation. The World Health Organization (WHO) estimates that 43.3 million people were blind in 2020, and this number is projected to reach 61 million by 2050. Modern scene understanding models could empower these people by assisting them with navigation, obstacle avoidance and visual recognition capabilities. The research community needs high quality datasets for both training and evaluation to build these systems. While datasets for autonomous vehicles are abundant, there is a critical gap in datasets tailored for outdoor human navigation. This gap poses a major obstacle to the development of computer vision based Assistive Technologies. To overcome this obstacle, we present SANPO, a large-scale egocentric video dataset designed for dense prediction in outdoor human navigation environments. SANPO contains 701 stereo videos of 30+ seconds captured in…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Human Pose and Action Recognition
