Commonsense Visual Sensemaking for Autonomous Driving: On Generalised Neurosymbolic Online Abduction Integrating Vision and Semantics
Jakob Suchan, Mehul Bhatt, Srikrishna Varadarajan

TL;DR
This paper presents a neurosymbolic framework combining vision and semantics for real-time visual sensemaking in autonomous driving, emphasizing explainability and human-centered AI considerations.
Contribution
It introduces a formalized, modular neurosymbolic method using answer set programming for online visual sensemaking, applicable to autonomous driving and other cognitive interaction domains.
Findings
Integration of vision and semantics evaluated and demonstrated on the KITTIMOD, MOT-2017, and MOT-2020 benchmarks
Framework supports explainability and question-answering in safety-critical scenarios
Domain-independent approach adaptable to diverse AI applications
Abstract
We demonstrate the need for and potential of systematically integrated vision and semantics solutions for visual sensemaking against the backdrop of autonomous driving. A general neurosymbolic method for online visual sensemaking using answer set programming (ASP) is systematically formalised and fully implemented. The method integrates the state of the art in visual computing, and is developed as a modular framework that is generally usable within hybrid architectures for real-time perception and control. We evaluate and demonstrate with the community-established benchmarks KITTIMOD, MOT-2017, and MOT-2020. As a use case, we focus on the significance of human-centred visual sensemaking (e.g., involving semantic representation and explainability, question-answering, commonsense interpolation) in safety-critical autonomous driving situations. The developed neurosymbolic framework is domain-independent,…
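To make the idea of abducing explanations from visual observations concrete, here is a toy sketch in Python. It is not the authors' implementation and does not use answer set programming: it replaces the search over hypotheses with a fixed rule ordering. All names (`Box`, `explain_missing`, the hypothesis labels `hidden_by`, `left_field_of_view`, `missed_detection`) and the thresholds are assumptions made for illustration only. Given the tracked boxes in two consecutive frames, it proposes one explanation for each track that disappears.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def explain_missing(prev, curr, frame_w, margin=5.0, min_iou=0.1):
    """Propose one explanation per track present in `prev` but not in `curr`.

    prev, curr: dict mapping track id -> Box.
    The rule order (occlusion, then leaving the image, then detector
    failure) and the thresholds are illustrative assumptions.
    """
    hypotheses = {}
    for tid, box in prev.items():
        if tid in curr:
            continue  # still observed, nothing to explain
        # Objects in the current frame that overlap the last known position.
        occluders = [oid for oid, obox in curr.items() if iou(box, obox) >= min_iou]
        if occluders:
            best = max(occluders, key=lambda oid: iou(box, curr[oid]))
            hypotheses[tid] = ("hidden_by", best)
        elif box.x1 <= margin or box.x2 >= frame_w - margin:
            hypotheses[tid] = ("left_field_of_view",)
        else:
            hypotheses[tid] = ("missed_detection",)
    return hypotheses


# Example: three tracks vanish between two frames of a 1242-pixel-wide image.
prev = {
    "car1": Box(100, 100, 200, 200),
    "ped1": Box(0, 50, 20, 120),
    "car2": Box(400, 100, 500, 200),
}
curr = {"truck": Box(120, 90, 260, 210)}
result = explain_missing(prev, curr, frame_w=1242)
```

A declarative formulation would instead state the candidate hypotheses and consistency constraints and let a solver select among them, which is what makes the resulting explanations queryable; the greedy ordering here only illustrates the kind of output such reasoning produces.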
