Shaping Proto-Value Functions via Rewards
Chandrashekar Lakshmi Narayanan, Raj Kumar Maity, Shalabh Bhatnagar

TL;DR
This paper introduces reward-dependent proto-value functions (RPVFs), which combine reward shaping with proto-value functions (PVFs) and improve learning, especially when the state space is symmetrical but the rewards are asymmetrical.
Contribution
The paper presents a novel method that combines reward shaping with PVFs to construct RPVFs, which capture reward asymmetries better than PVFs do.
Findings
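In the reported experiments, learning with an RPVF-based representation outperforms learning with reward shaping alone or with PVFs alone.
RPVFs capture asymmetries in the reward distribution better than PVFs.
Learning improves with RPVFs when the state space is symmetrical and the rewards are asymmetrical.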
Abstract
In this paper, we combine task-dependent reward shaping and task-independent proto-value functions to obtain reward-dependent proto-value functions (RPVFs). In constructing the RPVFs, we use the immediate rewards, which are available during the sampling phase but are not used in the PVF construction. We show via experiments that learning with an RPVF-based representation is better than learning with just reward shaping or PVFs. In particular, when the state space is symmetrical and the rewards are asymmetrical, the RPVFs capture the asymmetry better than the PVFs.
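To make the symmetry point concrete, the following is a minimal sketch, not the paper's construction. It computes standard PVFs as the smoothest eigenvectors of the graph Laplacian of a small chain, then repeats the computation on a graph whose edges are rescaled by the rewards. The reward weighting in `reward_weighted_adjacency`, the parameter `beta`, and all function names are hypothetical and chosen only for illustration; the summary here does not specify how the authors build RPVFs.

```python
import numpy as np


def chain_adjacency(n):
    """Adjacency matrix of an undirected n-state chain (a symmetrical state space)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A


def laplacian_basis(W, k):
    """Return the k smoothest eigenvectors of the combinatorial Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return eigvecs[:, :k]


def reward_weighted_adjacency(A, rewards, beta=1.0):
    """HYPOTHETICAL weighting (an assumption, not the paper's method):
    scale edge (i, j) by exp(beta * (r_i + r_j) / 2)."""
    r = np.asarray(rewards, dtype=float)
    scale = np.exp(beta * 0.5 * (r[:, None] + r[None, :]))
    return A * scale


n = 7
A = chain_adjacency(n)

# Asymmetrical rewards on a symmetrical chain: reward only at the right end.
rewards = np.zeros(n)
rewards[-1] = 1.0

pvf = laplacian_basis(A, 3)  # task-independent basis
rpvf_like = laplacian_basis(reward_weighted_adjacency(A, rewards), 3)  # illustrative only

print("PVF, 2nd basis vector:           ", np.round(pvf[:, 1], 3))
print("Reward-weighted 2nd basis vector:", np.round(rpvf_like[:, 1], 3))
```

On the unweighted chain the second basis vector is antisymmetric about the middle state, so it cannot distinguish the rewarded end from the unrewarded one except by sign. Once the edge weights depend on the rewards, that mirror symmetry is broken, which is the kind of effect the abstract attributes to RPVFs.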
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Neural Networks and Applications
