Learning a Machine for the Decision in a Partially Observable Markov Universe

Frederic Dambreville (DGA/CTA/DT/GIP)

arXiv:math/0408146·math.GM·May 23, 2007·3 cites

Open Access

TL;DR

This paper introduces a method for learning optimal decision strategies in partially observable Markov environments by approximating strategic trees with parameterized hidden Markov models and optimizing them using the cross-entropy principle.

Contribution

It proposes a novel approach that directly approximates strategic decision trees with parameterized HMMs and introduces a cross-entropy based optimization method for these models.

Findings

01. Effective approximation of strategic trees in POMDPs

02. Successful application of cross-entropy optimization to HMM parameters

03. Improved decision-making performance in partially observable environments

Abstract

In this paper, we are interested in optimal decisions in a partially observable Markov universe. Our viewpoint departs from the dynamic programming viewpoint: we directly approximate an optimal strategic tree that depends on the observation. This approximation is made by means of a parameterized probabilistic law. A particular family of hidden Markov models (HMMs), with input and output, is considered as a learning framework. A method for optimizing the parameters of these HMMs is proposed and applied. This optimization method is based on the cross-entropy principle.
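To make the optimization idea concrete, here is a minimal sketch of a generic cross-entropy search in Python. It is not the paper's method: the parameterized law here is an assumed, much simpler family (an independent categorical distribution per decision step, with no dependence on observations) rather than an HMM with input and output. The names `cross_entropy_search`, `toy_reward`, and `TARGET`, along with all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def cross_entropy_search(reward, n_steps, n_actions, n_samples=200,
                         elite_frac=0.1, smoothing=0.7, n_iters=30, seed=0):
    """Cross-entropy search over a product of categorical laws (one per step).

    Assumption: decisions are independent across steps. The paper's HMM
    family is richer; this only illustrates the sample/select/update loop.
    """
    rng = random.Random(seed)
    # probs[t][a] is the probability of choosing action a at step t.
    probs = [[1.0 / n_actions] * n_actions for _ in range(n_steps)]
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(n_iters):
        # 1. Sample candidate decision sequences from the current law.
        samples = [
            [rng.choices(range(n_actions), weights=probs[t])[0]
             for t in range(n_steps)]
            for _ in range(n_samples)
        ]
        # 2. Keep the best-scoring fraction (the elite set).
        samples.sort(key=reward, reverse=True)
        elite = samples[:n_elite]
        # 3. Move the law toward the elite empirical frequencies, which is
        #    the cross-entropy-minimizing update for a categorical family.
        for t in range(n_steps):
            for a in range(n_actions):
                freq = sum(1 for s in elite if s[t] == a) / n_elite
                probs[t][a] = smoothing * freq + (1 - smoothing) * probs[t][a]
    return probs


# Hypothetical toy reward: number of steps that match a hidden target.
TARGET = [1, 0, 2, 1, 0]


def toy_reward(seq):
    return sum(1 for s, g in zip(seq, TARGET) if s == g)


probs = cross_entropy_search(toy_reward, n_steps=len(TARGET), n_actions=3)
# Most probable action at each step under the learned law.
best = [max(range(3), key=lambda a: p[a]) for p in probs]
```

The smoothing factor keeps every action probability strictly positive, so the law cannot collapse onto a single sequence after one unlucky batch. Replacing the per-step categorical law with a model whose decisions depend on past observations would bring the sketch closer to the setting the abstract describes.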

Taxonomy

Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Reinforcement Learning in Robotics