BridgeEQA: Virtual Embodied Agents for Real Bridge Inspections
Subin Varghese, Joshua Gao, Asad Ur Rahman, Vedhus Hoskere

TL;DR
BridgeEQA introduces a new benchmark and evaluation metric for embodied agents performing real-world bridge inspections, emphasizing multi-scale reasoning and long-range spatial understanding.
Contribution
The paper presents BridgeEQA, a benchmark of open-vocabulary question-answer pairs grounded in professional inspection reports; Image Citation Relevance, a new EQA metric; and EMVR, a method that improves embodied question answering in infrastructure inspection scenarios.
Findings
State-of-the-art models show significant performance gaps on BridgeEQA.
EMVR outperforms baseline models in the inspection EQA task.
BridgeEQA provides a standardized platform for evaluating embodied agents in real-world inspections.
Abstract
Deploying embodied agents that can answer questions about their surroundings in realistic real-world settings remains difficult, partly due to the scarcity of benchmarks for episodic memory Embodied Question Answering (EQA). Inspired by the challenges of infrastructure inspections, we propose Inspection EQA as a compelling problem class for advancing episodic memory EQA. It demands multi-scale reasoning and long-range spatial understanding, while offering standardized evaluation, professional inspection reports as grounding, and egocentric imagery. We introduce BridgeEQA, a benchmark of 2,200 open-vocabulary question-answer pairs (in the style of OpenEQA) grounded in professional inspection reports across 200 real-world bridge scenes with 47.93 images on average per scene. We further propose a new EQA metric, Image Citation Relevance, to evaluate the ability of a model to cite relevant…
