Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making
Andrew Butcher, Michael Bradley Johanson, Elnaz Davoodi, Dylan J. A. Brenneis, Leslie Acker, Adam S. R. Parker, Adam White, Joseph Modayil, Patrick M. Pilarski

TL;DR
This paper explores Pavlovian signalling as a mechanism for adaptive communication between learning agents, demonstrating its effectiveness in a novel decision-making domain and analyzing its impact on coordination and timing.
Contribution
It introduces Pavlovian signalling as a bridge between fixed signals and adaptive communication, showing how to build it from prediction learning with minimal constraints.
Findings
Pavlovian signalling accelerates learning in agent interactions.
Temporal representations influence coordination but not the speed of learning.
Temporal aliasing affects human-agent and agent-agent interactions differently.
Abstract
In this paper, we contribute a multi-faceted study into Pavlovian signalling -- a process by which learned, temporally extended predictions made by one agent inform decision-making by another agent. Signalling is intimately connected to time and timing. In service of generating and receiving signals, humans and other animals are known to represent time, determine time since past events, predict the time until a future stimulus, and both recognize and generate patterns that unfold in time. We investigate how different temporal processes impact coordination and signalling between learning agents by introducing a partially observable decision-making domain we call the Frost Hollow. In this domain, a prediction learning agent and a reinforcement learning agent are coupled into a two-part decision-making system that works to acquire sparse reward while avoiding time-conditional hazards. We…
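As a rough illustration of the idea described in the abstract, the sketch below learns a temporally extended prediction of an upcoming hazard with tabular TD(0) and maps that prediction to a binary token through a fixed threshold. It is a minimal sketch under stated assumptions, not the authors' implementation: the periodic hazard, the one-weight-per-time-step representation, and the names `train_hazard_prediction` and `pavlovian_signal` are all hypothetical, as are the values of `period`, `gamma`, `alpha`, and `threshold`.

```python
# Hypothetical toy setup (not the paper's Frost Hollow domain): a hazard
# arrives every `period` steps, and the prediction agent keeps one weight
# per time step since the last hazard (a tabular representation of time).

def train_hazard_prediction(period=10, gamma=0.8, alpha=0.1, sweeps=2000):
    """Learn a discounted prediction of the next hazard with tabular TD(0)."""
    w = [0.0] * period
    for _ in range(sweeps):
        for t in range(period):
            nxt = (t + 1) % period
            hazard_next = (nxt == 0)               # hazard arrives when the clock wraps
            cumulant = 1.0 if hazard_next else 0.0
            # The prediction terminates at the hazard, so discounting is cut there.
            bootstrap = 0.0 if hazard_next else gamma * w[nxt]
            delta = cumulant + bootstrap - w[t]    # TD error
            w[t] += alpha * delta
    return w


def pavlovian_signal(prediction, threshold=0.5):
    """Fixed rule mapping a learned prediction to a token (1 = hazard is near)."""
    return 1 if prediction > threshold else 0


weights = train_hazard_prediction()
signals = [pavlovian_signal(v) for v in weights]
print(signals)
```

In this sketch the learned prediction at each time step approaches `gamma` raised to the number of steps remaining before the hazard, so the token switches on a few steps ahead of it. A separate control agent could then condition its decisions on the token instead of having to model the hazard timing itself.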
Taxonomy
Topics: Neural Networks and Reservoir Computing
