Q-learning with temporal memory to navigate turbulence
Marco Rando, Martin James, Alessandro Verri, Lorenzo Rosasco, Agnese Seminara

TL;DR
This paper develops a reinforcement learning approach with temporal memory for odor-based navigation in turbulent environments, enabling agents to learn robust strategies similar to insect behavior.
Contribution
Introduces a novel RL algorithm with interpretable olfactory states and temporal memory for navigation without spatial perception in turbulent odor plumes.
Findings
Temporal memory improves navigation performance.
Optimal strategies involve cross-wind casting.
The learned strategies are robust to environmental changes.
Abstract
We consider the problem of olfactory searches in a turbulent environment. We focus on agents that respond solely to odor stimuli, with no access to spatial perception nor prior information about the odor. We ask whether navigation to a target can be learned robustly within a sequential decision making framework. We develop a reinforcement learning algorithm using a small set of interpretable olfactory states and train it with realistic turbulent odor cues. By introducing a temporal memory, we demonstrate that two salient features of odor traces, discretized in few olfactory states, are sufficient to learn navigation in a realistic odor plume. Performance is dictated by the sparse nature of turbulent odors. An optimal memory exists which ignores blanks within the plume and activates a recovery strategy outside the plume. We obtain the best performance by letting agents learn their…
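The abstract's ingredients (a few discrete olfactory states built from two features of the recent odor trace, a temporal memory, and a separate state for "no odor within memory") can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation. The choice of features (mean intensity of detections and intermittency, i.e. the fraction of the memory window with odor above threshold), the bin count `N_BINS`, the action count `N_ACTIONS`, and the names `olfactory_state` and `q_update` are all assumptions made for illustration.

```python
# Hypothetical discretization: N_BINS levels per feature, plus one "void" state.
N_BINS = 3
VOID = N_BINS * N_BINS          # index of the state "no odor detected within memory"
N_STATES = VOID + 1
N_ACTIONS = 4                   # assumed action set, e.g. upwind, downwind, two crosswind moves


def olfactory_state(odor_history, memory, threshold=0.0, max_intensity=1.0):
    """Map the last `memory` odor readings (memory >= 1) to a discrete state index.

    Assumed features: mean intensity of the detections in the window and
    intermittency (fraction of the window with odor above `threshold`).
    """
    window = odor_history[-memory:]
    hits = [c for c in window if c > threshold]
    if not hits:
        # Nothing detected for `memory` steps: the agent is treated as outside the plume.
        return VOID
    intermittency = len(hits) / len(window)
    intensity = (sum(hits) / len(hits)) / max_intensity
    # Clip so that values at or above the upper edge fall in the last bin.
    i_bin = min(int(intensity * N_BINS), N_BINS - 1)
    t_bin = min(int(intermittency * N_BINS), N_BINS - 1)
    return i_bin * N_BINS + t_bin


def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """Standard one-step Q-learning update on a table stored as a list of lists."""
    td_target = reward + gamma * max(Q[s_next])
    Q[s][a] += alpha * (td_target - Q[s][a])
    return Q


# Usage: the same trace maps to different states depending on the memory length.
trace = [0.9, 0.0, 0.0, 0.0, 0.0]
short_memory_state = olfactory_state(trace, memory=3)   # the detection has been forgotten
long_memory_state = olfactory_state(trace, memory=5)    # the blank is bridged by memory

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
q_update(Q, s=long_memory_state, a=1, reward=1.0, s_next=short_memory_state)
```

In this sketch the memory length decides when a run of blanks is read as "still inside the plume" and when it switches the agent to the void state, where a recovery behavior would be learned. That mirrors the trade-off the abstract describes, under the assumptions listed above.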
Taxonomy
Methods: Sparse Evolutionary Training · Focus
