Real-Time Hybrid Mapping of Populated Indoor Scenes using a Low-Cost Monocular UAV
Stuart Golodetz, Madhu Vankadari, Aluna Everitt, Sangyun Shin, Andrew Markham, Niki Trigoni

TL;DR
This paper introduces the first real-time system that combines dense indoor scene mapping with multi-person 3D human pose estimation using a low-cost monocular UAV, enabling navigation and interaction in tight, populated indoor environments.
Contribution
It presents a novel system that loosely couples monocular depth and pose estimation for simultaneous indoor mapping and multi-person 3D pose estimation from a UAV.
Findings
Successful real-time hybrid mapping and pose estimation demonstrated
Validated component choices on large-scale datasets
Constructed a new dataset for indoor hybrid mapping
Abstract
Unmanned aerial vehicles (UAVs) have been used for many applications in recent years, from urban search and rescue, to agricultural surveying, to autonomous underground mine exploration. However, deploying UAVs in tight, indoor spaces, especially close to humans, remains a challenge. One solution, when limited payload is required, is to use micro-UAVs, which pose less risk to humans and typically cost less to replace after a crash. However, micro-UAVs can only carry a limited sensor suite, e.g. a monocular camera instead of a stereo pair or LiDAR, complicating tasks like dense mapping and markerless multi-person 3D human pose estimation, which are needed to operate in tight environments around people. Monocular approaches to such tasks exist, and dense monocular mapping approaches have been successfully deployed for UAV applications. However, despite many recent works on both…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Advanced Vision and Imaging
Methods: Attentive Walk-Aggregating Graph Neural Network
