TL;DR
This paper introduces an unsupervised model combining self-organizing maps and causal sequence prediction that enables an agent to learn physical dynamics through exploration and perform one-shot imitation in a cartpole environment.
Contribution
The paper presents an unsupervised approach that models physical properties and causal sequences, enabling flexible one-shot imitation and planning, in contrast to the supervised pattern recognition used in most deep learning architectures.
Findings
Agent learns physical dynamics via exploration
Performs effective one-shot imitation tasks
Simulates future states for planning
Abstract
Human learning and intelligence work differently from the supervised pattern recognition approach adopted in most deep learning architectures. Humans seem to learn rich representations by exploration and imitation, build causal models of the world, and use both to flexibly solve new tasks. We suggest a simple but effective unsupervised model which develops such characteristics. The agent learns to represent the dynamical physical properties of its environment by intrinsically motivated exploration, and performs inference on this representation to reach goals. For this, a set of self-organizing maps which represent state-action pairs is combined with a causal model for sequence prediction. The proposed system is evaluated in the cartpole environment. After an initial phase of playful exploration, the agent can execute kinematic simulations of the environment's future, and use those for…
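To make the combination described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single 1-D self-organizing map over state-action vectors, and it swaps in a first-order transition count table as a stand-in for the causal sequence model, whose actual form is not given in this summary. All function names and parameters (`train_som`, `fit_transitions`, `predict_next`, `lr`, `sigma`) are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: math.dist(weights[i], x))


def som_update(weights, x, lr=0.5, sigma=1.0):
    """One SOM step on a 1-D chain: pull the winner and its neighbors toward x."""
    bmu = best_matching_unit(weights, x)
    for i, w in enumerate(weights):
        # Gaussian neighborhood over the distance along the chain of units
        h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
        weights[i] = [wj + lr * h * (xj - wj) for wj, xj in zip(w, x)]
    return bmu


def train_som(samples, n_units=8, epochs=20, seed=0):
    """Fit a 1-D SOM to state-action vectors (assumed schedule: linear decay)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)
        sigma = max(0.5, 2.0 * (1 - epoch / epochs))
        for x in rng.sample(samples, len(samples)):
            som_update(weights, x, lr, sigma)
    return weights


def fit_transitions(weights, trajectory):
    """Count unit-to-unit transitions along one explored trajectory.

    Assumption: a first-order count table stands in for the causal model.
    """
    counts = defaultdict(Counter)
    units = [best_matching_unit(weights, x) for x in trajectory]
    for a, b in zip(units, units[1:]):
        counts[a][b] += 1
    return counts


def predict_next(counts, unit):
    """Return the most frequently observed successor of a unit, or None."""
    if unit not in counts:
        return None
    return counts[unit].most_common(1)[0][0]
```

Under these assumptions, chaining `predict_next` from the current unit yields a rollout of likely future units, which is one simple way to picture simulating the environment's future before acting.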
