Continuously evolving rewards in an open-ended environment
Richard M. Bailey

TL;DR
This paper introduces RULE, an algorithm for dynamically updating rewards in open-ended environments, which lets agents adapt their behaviors through endogenous reward modification during continuous learning.
Contribution
The paper presents RULE, a novel method for agents to autonomously update their reward functions in complex environments, improving adaptability and behavioral evolution.
Findings
Agents successfully abandoned detrimental behaviors.
Beneficial behaviors were amplified through reward updates.
Agents responded appropriately to environmental changes.
Abstract
Unambiguous identification of the rewards driving behaviours of entities operating in complex open-ended real-world environments is difficult, partly because goals and associated behaviours emerge endogenously and are dynamically updated as environments change. Reproducing such dynamics in models would be useful in many domains, particularly where fixed reward functions limit the adaptive capabilities of agents. The simulation experiments described here assess a candidate algorithm for the dynamic updating of rewards, RULE (Reward Updating through Learning and Expectation). The approach is tested in a simplified ecosystem-like setting where experiments challenge entities' survival, calling for significant behavioural change. The population of entities successfully demonstrates the abandonment of an initially rewarded but ultimately detrimental behaviour, amplification of beneficial behaviour, and…
Taxonomy
Topics: Evolutionary Algorithms and Applications · Educational Games and Gamification
