TL;DR
This paper introduces a novel approach that combines natural language descriptions and image data to estimate scene danger levels, enabling safer and more effective multi-robot search and rescue planning.
Contribution
It presents a new method for danger estimation using visio-linguistic deep learning and integrates it into a risk-aware multi-robot path planning framework for SaR missions.
Findings
The framework improves safety by avoiding high-risk areas.
The approach achieves high success rates in simulated disaster scenarios.
It demonstrates effective integration of natural language and image data for risk assessment.
Abstract
The ability to develop a high-level understanding of a scene, such as perceiving danger levels, can prove valuable in planning multi-robot search and rescue (SaR) missions. In this work, we propose to leverage natural language descriptions from the mission commander-in-chief together with image data captured by robots to estimate scene danger. Given a description and an image, a state-of-the-art deep neural network is used to compute a corresponding similarity score, which is then converted into a probability distribution over danger levels. Because commonly used visio-linguistic datasets do not represent SaR missions well, we collect a large-scale image-description dataset from synthetic images taken from realistic disaster scenes and use it to train our machine learning model. A risk-aware variant of the Multi-robot Efficient Search Path Planning (MESPP) problem is then formulated to…
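The abstract does not say how a similarity score becomes a distribution over danger levels. The sketch below is one plausible reading, not the authors' method: it assumes the image is scored against one description per danger level, and that a temperature-scaled softmax turns those scores into probabilities. The function names, the three-level scale, and the temperature value are all hypothetical.

```python
import math


def danger_distribution(similarities, temperature=0.1):
    """Turn per-level similarity scores into a probability distribution.

    ASSUMPTION: similarities[i] is the image-description similarity for
    danger level i, and a softmax is used for the conversion. The source
    only states that a score is converted into a distribution.
    """
    if not similarities:
        raise ValueError("need at least one similarity score")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_danger(probs):
    """Expected danger level, treating level indices 0..n-1 as numeric."""
    return sum(level * p for level, p in enumerate(probs))


# Hypothetical scores for three levels: low, medium, high.
scores = [0.2, 0.5, 0.9]
probs = danger_distribution(scores)
risk = expected_danger(probs)
```

A planner could use a scalar such as `risk` as a per-location cost, while the full distribution retains the uncertainty that a risk-aware formulation may need.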
