Foundation models on the bridge: Semantic hazard detection and safety maneuvers for maritime autonomy with vision-language models
Kim Alexander Christensen, Andreas Gudahl Tufte, Alexey Gusev, Rohan Sinha, Milan Ganai, Ole Andreas Alsos, Marco Pavone, Martin Steinert

TL;DR
This paper presents Semantic Lookout, a fallback system for maritime vessels built on vision-language models. It adds semantic awareness and selects safety maneuvers within practical latency constraints, in line with the fallback requirements of the draft IMO MASS Code.
Contribution
It introduces a novel semantic fallback maneuver selector using vision-language models for maritime autonomy, demonstrating improved safety and compliance in harbor scenarios.
Findings
Models retain most of their semantic awareness at sub-10 s latency
Outperforms geometry-only baselines in hazard detection
Field run verifies end-to-end operational feasibility
Abstract
The draft IMO MASS Code requires autonomous and remotely supervised maritime vessels to detect departures from their operational design domain, enter a predefined fallback that notifies the operator, permit immediate human override, and avoid changing the voyage plan without approval. Meeting these obligations in the alert-to-takeover gap calls for a short-horizon, human-overridable fallback maneuver. Classical maritime autonomy stacks struggle when the correct action depends on meaning (e.g., diver-down flag means people in the water, fire close by means hazard). We argue (i) that vision-language models (VLMs) provide semantic awareness for such out-of-distribution situations, and (ii) that a fast-slow anomaly pipeline with a short-horizon, human-overridable fallback maneuver makes this practical in the handover window. We introduce Semantic Lookout, a camera-only,…
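The fast-slow idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the maneuver names, the threshold value, and the toy scoring and selection functions are all hypothetical stand-ins, and a text string stands in for a camera frame. The sketch only shows the control flow: a cheap check gates a slower semantic selector, the operator is notified, a human command always takes precedence, and the voyage plan itself is never modified.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical candidate set; the source does not list the actual maneuvers.
CANDIDATE_MANEUVERS: Tuple[str, ...] = (
    "hold_station", "slow_down", "turn_port", "turn_starboard",
)


@dataclass
class FallbackController:
    # Cheap per-frame anomaly score in [0, 1] (stand-in for a fast detector).
    fast_score: Callable[[str], float]
    # Slower semantic selector (stand-in for a VLM call) that names a maneuver.
    slow_select: Callable[[str, Tuple[str, ...]], str]
    threshold: float = 0.5  # assumed value, not taken from the paper
    notifications: List[str] = field(default_factory=list)

    def step(self, frame: str, operator_command: Optional[str] = None) -> str:
        # A human command always overrides the automated fallback.
        if operator_command is not None:
            return operator_command
        # Fast path: nothing unusual, keep following the approved plan.
        if self.fast_score(frame) < self.threshold:
            return "continue_plan"
        # Anomaly: notify the operator, then ask the slow selector.
        self.notifications.append(f"anomaly: {frame}")
        choice = self.slow_select(frame, CANDIDATE_MANEUVERS)
        # Guard against a selector answer outside the allowed set.
        return choice if choice in CANDIDATE_MANEUVERS else "hold_station"


def toy_fast_score(frame: str) -> float:
    # Toy rule: treat frames mentioning a flag or fire as anomalous.
    return 0.9 if ("flag" in frame or "fire" in frame) else 0.1


def toy_slow_select(frame: str, candidates: Tuple[str, ...]) -> str:
    # Arbitrary toy mapping, chosen only to make the example deterministic.
    return "hold_station" if "flag" in frame else "slow_down"


controller = FallbackController(toy_fast_score, toy_slow_select)
```

Calling `controller.step("open water")` stays on the plan, while `controller.step("diver-down flag ahead")` records a notification and returns a maneuver from the fixed candidate set. Restricting the selector to a small, predefined set keeps the fallback short-horizon and easy for an operator to anticipate and override.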
Taxonomy
Topics: Maritime Navigation and Safety · Multimodal Machine Learning Applications · Human-Automation Interaction and Safety
