Learning Lifted Action Models from Unsupervised Visual Traces
Kai Xi, Stephen Gould, Sylvie Thiébaux

TL;DR
This paper presents a deep learning framework that learns lifted action models from sequences of state images without action labels, using MILP-based correction to improve consistency and accuracy.
Contribution
It combines deep learning with MILP-based correction to learn lifted action models from sequences of state images without action labels.
Findings
MILP correction improves model convergence to consistent solutions
The framework successfully learns action models from image sequences in multiple domains
Integrating MILP helps prevent prediction collapse and self-reinforcing errors
Abstract
Efficient construction of models capturing the preconditions and effects of actions is essential for applying AI planning in real-world domains. Extensive prior work has explored learning such models from high-level descriptions of state and/or action sequences. In this paper, we tackle a more challenging setting: learning lifted action models from sequences of state images, without action observation. We propose a deep learning framework that jointly learns state prediction, action prediction, and a lifted action model. We also introduce a mixed-integer linear program (MILP) to prevent prediction collapse and self-reinforcing errors among predictions. The MILP takes the predicted states, actions, and action model over a subset of traces and solves for logically consistent states, actions, and action model that are as close as possible to the original predictions. Pseudo-labels…
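The correction step described in the abstract can be illustrated with a toy sketch. This is not the authors' formulation: it assumes a single propositional fluent rather than a lifted model, and it replaces the MILP solver with brute-force enumeration, which is only feasible at this tiny scale. All names (`correct`, `consistent`, `cost`) and all probabilities are hypothetical. The idea it shows is the one stated above: among all logically consistent assignments of states, actions, and action model, pick the one closest to the network's predictions.

```python
from itertools import product

def cost(bits, probs):
    # L1 distance between a binary assignment and predicted probabilities.
    # For binary x, |x - p| is linear in x, so a MILP could use it as an objective.
    return sum(abs(b - p) for b, p in zip(bits, probs))

def consistent(states, actions, model):
    # model[a] = (pre, add, dele): does action a require, add, or delete the fluent?
    for pre, add, dele in model:
        if add and dele:
            return False
    for t, a in enumerate(actions):
        pre, add, dele = model[a]
        if pre and not states[t]:
            return False  # precondition violated
        expected = 1 if add else 0 if dele else states[t]
        if states[t + 1] != expected:
            return False  # effect does not match the next state
    return True

def correct(state_probs, action_probs, model_probs):
    # Exhaustive stand-in for the MILP: closest consistent assignment.
    n_actions = len(model_probs)
    best = None
    for states in product((0, 1), repeat=len(state_probs)):
        for actions in product(range(n_actions), repeat=len(action_probs)):
            for flat in product((0, 1), repeat=3 * n_actions):
                model = [flat[3 * i:3 * i + 3] for i in range(n_actions)]
                if not consistent(states, actions, model):
                    continue
                c = cost(states, state_probs)
                c += sum(1.0 - action_probs[t][a] for t, a in enumerate(actions))
                c += sum(cost(m, mp) for m, mp in zip(model, model_probs))
                if best is None or c < best[0]:
                    best = (c, states, actions, model)
    return best

# Hypothetical predictions for a 4-state trace with two action types.
state_probs = [0.1, 0.9, 0.4, 0.2]                    # P(fluent true) per state
action_probs = [[0.9, 0.1], [0.9, 0.1], [0.2, 0.8]]   # P(action) per step
model_probs = [(0.1, 0.9, 0.1), (0.8, 0.2, 0.9)]      # P(pre), P(add), P(del)

total, states, actions, model = correct(state_probs, action_probs, model_probs)
print(states, actions, model)
```

In this example, rounding the predictions gives states `(0, 1, 0, 0)`, which contradicts the predicted "add" action at step 1. The cheapest repair is to flip the low-confidence third state, and the corrected assignment could then serve as a training target in place of the raw prediction.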
