Discover Life Skills for Planning with Bandits via Observing and Learning How the World Works

Tin Lai

arXiv:2207.08130·cs.AI·July 19, 2022

TL;DR

This paper introduces a planning framework that learns high-level skills by observing state transitions, and that uses multi-armed bandits to evaluate and improve the resulting plans in complex, noisy environments.

Contribution

It presents a novel method combining skill learning with bandit-based evaluation, enabling autonomous high-level planning without explicit pre-condition knowledge.

Findings

01. Effective in high-dimensional state spaces

02. Automatically learns action pre-conditions

03. Robust performance in noisy environments

Abstract

We propose a novel approach for planning agents to compose abstract skills via observing and learning from historical interactions with the world. Our framework operates in a Markov state-space model via a set of actions under unknown pre-conditions. We formulate skills as high-level abstract policies that propose action plans based on the current state. Each policy learns new plans by observing the states' transitions while the agent interacts with the world. Such an approach automatically learns new plans to achieve specific intended effects, but the success of such plans is often dependent on the states in which they are applicable. Therefore, we formulate the evaluation of such plans as infinitely many multi-armed bandit problems, where we balance the allocation of resources on evaluating the success probability of existing arms and exploring new options. The result is a planner…
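
To make the bandit formulation in the abstract more concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each learned plan is treated as an arm with an unknown Bernoulli success probability, uses Thompson sampling with Beta posteriors to evaluate existing arms, and occasionally admits a new arm to mimic the "infinitely many arms" setting. The names `PlanArm`, `run_bandit`, `new_arm_rate`, and the exploration schedule are all hypothetical choices made for illustration.

```python
import random


class PlanArm:
    """One candidate plan with a Beta posterior over its success probability."""

    def __init__(self, name):
        self.name = name
        self.successes = 0
        self.failures = 0

    def sample(self, rng):
        # Draw a plausible success probability from Beta(1 + s, 1 + f).
        return rng.betavariate(1 + self.successes, 1 + self.failures)

    def update(self, succeeded):
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1

    def mean(self):
        # Posterior mean under a uniform Beta(1, 1) prior.
        return (1 + self.successes) / (2 + self.successes + self.failures)


def run_bandit(true_success_probs, rounds=2000, new_arm_rate=1.0, seed=0):
    """Thompson sampling over a growing pool of plans.

    true_success_probs: hypothetical hidden success probabilities of plans,
    listed in the order they would be discovered (an assumption of this sketch).
    """
    if not true_success_probs:
        raise ValueError("need at least one candidate plan")
    rng = random.Random(seed)
    undiscovered = list(true_success_probs)
    hidden = {}  # arm name -> true success probability (simulation only)
    arms = []
    for t in range(1, rounds + 1):
        # Assumed schedule: explore a new plan less often as the pool and time grow.
        explore = bool(undiscovered) and (
            not arms or rng.random() < new_arm_rate / (len(arms) + t ** 0.5)
        )
        if explore:
            arm = PlanArm(f"plan_{len(arms)}")
            hidden[arm.name] = undiscovered.pop(0)
            arms.append(arm)
        else:
            # Exploit: pick the arm whose sampled success probability is highest.
            arm = max(arms, key=lambda a: a.sample(rng))
        # Simulate executing the plan once and record success or failure.
        arm.update(rng.random() < hidden[arm.name])
    return arms


if __name__ == "__main__":
    for arm in run_bandit([0.2, 0.8, 0.5]):
        pulls = arm.successes + arm.failures
        print(arm.name, pulls, round(arm.mean(), 3))
```

In this sketch every round spends exactly one evaluation, either on a newly admitted plan or on an existing one, which is one simple way to express the trade-off between estimating the success of known arms and exploring new options.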

Taxonomy

Topics: Artificial Intelligence in Games · Machine Learning and Algorithms · Topic Modeling