Visual Semantic Planning using Deep Successor Representations
Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi

TL;DR
This paper introduces a deep model based on successor representations for visual semantic planning. The model lets agents predict action sequences from visual inputs to reach goals in dynamic environments, with cross-task generalization and near-optimal performance in the THOR environment.
Contribution
It presents a novel deep predictive model combining reinforcement and imitation learning for visual planning, emphasizing cross-task generalization in dynamic environments.
Findings
Achieves near-optimal performance in THOR environment tasks
Demonstrates effective cross-task generalization
Integrates reinforcement and imitation learning successfully
Abstract
A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual semantic planning: the task of predicting a sequence of actions from visual observations that transform a dynamic environment from an initial state to a goal state. Doing so entails knowledge about objects and their affordances, as well as actions and their preconditions and effects. We propose learning these through interacting with a visual and dynamic environment. Our proposed solution involves bootstrapping reinforcement learning with imitation learning. To ensure cross-task generalization, we develop a deep predictive model based on successor representations. Our experimental results show near-optimal results across a wide range of tasks in the challenging THOR environment.
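To illustrate why successor representations can help with cross-task generalization, here is a minimal tabular sketch. It is not the paper's deep model, which works from visual observations. It assumes a hypothetical three-state chain with a fixed policy; the names `successor_matrix` and `values`, the transition matrix `P`, the discount `gamma`, and the reward vectors are all illustrative assumptions. The successor matrix M (expected discounted state visits) depends only on the dynamics, so a new task only needs a new reward vector.

```python
# Tabular successor representation (SR) sketch. All names and numbers are
# illustrative assumptions, not taken from the paper.

def successor_matrix(P, gamma, iters=100):
    """Approximate M = sum_t gamma^t P^t via the fixed point M = I + gamma * P * M."""
    n = len(P)
    M = [[0.0] * n for _ in range(n)]
    for _ in range(iters):
        M = [
            [(1.0 if i == j else 0.0)
             + gamma * sum(P[i][k] * M[k][j] for k in range(n))
             for j in range(n)]
            for i in range(n)
        ]
    return M

def values(M, w):
    """State values for a task whose per-state reward vector is w: V = M w."""
    return [sum(m * r for m, r in zip(row, w)) for row in M]

# Hypothetical chain 0 -> 1 -> 2, where state 2 loops back to itself.
P = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
gamma = 0.5

M = successor_matrix(P, gamma)      # computed once from the dynamics

task_a = [0.0, 0.0, 1.0]            # task A rewards being in state 2
task_b = [0.0, 1.0, 0.0]            # task B rewards being in state 1

v_a = values(M, task_a)             # approximately [0.5, 1.0, 2.0]
v_b = values(M, task_b)             # approximately [0.5, 1.0, 0.0]
print([round(v, 3) for v in v_a])
print([round(v, 3) for v in v_b])
```

Both value vectors reuse the same `M`; only the reward vector changes between tasks. This separation of dynamics from reward is the property that successor-representation methods exploit when transferring across tasks.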
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications
