A Dynamic Data Driven Approach for Explainable Scene Understanding
Zachary A Daniels, Dimitris Metaxas

TL;DR
This paper presents ACUMEN, a dynamic, explanation-driven framework that enables an active agent to classify and understand scenes efficiently by adaptively adjusting its sensors, handling unknown categories, and updating its knowledge base in real time.
Contribution
The paper introduces ACUMEN, a novel framework for active, explanation-driven scene classification that incorporates sensor adjustment, unknown scene handling, and real-time learning.
Findings
Demonstrated effectiveness in indoor scene classification with a robotic agent.
Showed the framework's ability to handle unknown scene categories.
Validated the approach through a case study with vision-based sensors.
Abstract
Scene understanding is an important topic in the area of computer vision, and illustrates computational challenges with applications to a wide range of domains including remote sensing, surveillance, smart agriculture, robotics, autonomous driving, and smart cities. We consider the active explanation-driven understanding and classification of scenes. Suppose that an agent utilizing one or more sensors is placed in an unknown environment, and based on its sensory input, the agent needs to assign some label to the perceived scene. The agent can adjust its sensor(s) to capture additional details about the scene, but there is a cost associated with sensor manipulation, and as such, it is important for the agent to understand the scene in a fast and efficient manner. It is also important that the agent understand not only the global state of a scene (e.g., the category of the scene or the…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging · Multimodal Machine Learning Applications
Methods: Balanced Selection
