A two step algorithm for learning from unspecific reinforcement

Reimer Kuehn; Ion-Olimpiu Stamatescu

arXiv:cond-mat/9902354·cond-mat.stat-mech·October 31, 2009

TL;DR

This paper studies a two-step learning algorithm based on the Hebb rule that copes with delayed, unspecific reinforcement. It converges to asymptotically perfect generalization despite the ambiguous feedback, although for a certain range of parameter settings the outcome depends on initial conditions.

Contribution

The paper proposes a two-step algorithm for learning from unspecific reinforcement and analyzes its convergence properties and their dependence on the learning parameters.

Findings

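01 Convergence to asymptotically perfect generalization is observed despite the unspecific feedback.

02 The convergence rate depends in a non-universal way on the learning parameters; it can be as fast as that of Hebbian learning but may be slower.

03 For a certain range of parameter settings, initial conditions determine whether the system reaches asymptotically perfect generalization or approaches a stationary state of poor generalization.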

Abstract

We study a simple learning model based on the Hebb rule to cope with "delayed", unspecific reinforcement. In spite of the unspecific nature of the information-feedback, convergence to asymptotically perfect generalization is observed, with a rate depending, however, in a non-universal way on learning parameters. Asymptotic convergence can be as fast as that of Hebbian learning, but may be slower. Moreover, for a certain range of parameter settings, it depends on initial conditions whether the system can reach the regime of asymptotically perfect generalization, or rather approaches a stationary state of poor generalization.
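
The abstract does not spell out the update rule, so the following is only a minimal sketch of how a two-step Hebbian scheme with delayed, unspecific feedback might look. Everything in it is an assumption made for illustration: the teacher-student perceptron setup, the function name `run_two_step_hebb`, the parameters `a1`, `a2`, and `path_len`, and the exact form of both steps. In step 1 the student reinforces its own guesses without any feedback; in step 2 it receives only the error fraction for the whole episode and partially undoes that episode's updates in proportion to it.

```python
import numpy as np

def run_two_step_hebb(n=50, episodes=200, path_len=5, a1=1.0, a2=2.0, seed=0):
    """Illustrative two-step Hebbian learner (assumed form, not the paper's exact rule).

    Returns the normalized student-teacher overlap after each episode.
    """
    rng = np.random.default_rng(seed)
    teacher = rng.standard_normal(n)   # hypothetical teacher perceptron
    student = rng.standard_normal(n)   # random initial student weights
    overlaps = []
    for _ in range(episodes):
        patterns = rng.choice([-1.0, 1.0], size=(path_len, n))
        hebb_sum = np.zeros(n)         # accumulated Hebbian terms of this episode
        errors = 0
        # Step 1: blind association, using the student's own outputs.
        for xi in patterns:
            s = 1.0 if student @ xi >= 0 else -1.0
            t = 1.0 if teacher @ xi >= 0 else -1.0
            errors += int(s != t)      # counted, but not revealed per pattern
            update = s * xi / np.sqrt(n)
            student = student + a1 * update
            hebb_sum += update
        # Step 2: unspecific reinforcement. Only the episode's error fraction
        # is available, so all of the episode's updates are penalized equally.
        e = errors / path_len
        student = student - a2 * e * hebb_sum
        overlap = student @ teacher / (np.linalg.norm(student) * np.linalg.norm(teacher))
        overlaps.append(float(overlap))
    return np.array(overlaps)

if __name__ == "__main__":
    trace = run_two_step_hebb()
    print("final overlap with teacher:", trace[-1])
```

Varying `a1`, `a2`, and the initial student weights in this sketch is one way to explore the kind of parameter and initial-condition dependence the abstract describes; the sketch itself makes no claim about which settings converge.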
