The Role of Target Update Frequencies in Q-Learning
Simon Weissmann, Tilman Aach, Benedikt Wille, Sebastian Kassing, Leif Döring

TL;DR
This paper provides a theoretical analysis of target update frequencies in Q-learning, revealing how adaptive schedules outperform constant ones by reducing sample complexity and improving convergence.
Contribution
It introduces a principled, finite-time convergence analysis of target update frequencies, showing how to optimally set and adapt this hyperparameter during learning.
Findings
Optimal target update frequency increases geometrically during training.
Constant update schedules incur unnecessary logarithmic sample complexity overhead.
Adaptive schedules significantly improve sample efficiency and convergence.
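As a rough illustration of the two kinds of schedule contrasted in the findings, the sketch below generates a constant and a geometrically growing sequence of target update periods. It reads "frequency" as the number of inner steps between two target updates, which is an assumption about the summary's wording. The function names, the starting period of 8, and the growth factor of 2 are hypothetical and not taken from the paper.

```python
import math


def constant_schedule(period, num_outer):
    """Same number of inner steps between every pair of target updates."""
    return [period] * num_outer


def geometric_schedule(initial_period, growth, num_outer):
    """Inner steps per outer iteration grow by a fixed factor each time."""
    return [math.ceil(initial_period * growth ** n) for n in range(num_outer)]


# Hypothetical values, chosen only to make the shapes visible.
const = constant_schedule(8, 5)
geom = geometric_schedule(8, 2.0, 5)
print(const)
print(geom)
```

The geometric schedule spends few samples on early, coarse targets and more on later ones. Which growth factor is optimal is the subject of the paper's analysis and is not reproduced here.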
Abstract
The target network update frequency (TUF) is a central stabilization mechanism in (deep) Q-learning. However, its selection remains poorly understood and is often treated merely as another tunable hyperparameter rather than as a principled design decision. This work provides a theoretical analysis of target fixing in tabular Q-learning through the lens of approximate dynamic programming. We formulate periodic target updates as a nested optimization scheme in which each outer iteration applies an inexact Bellman optimality operator, approximated by a generic inner loop optimizer. Rigorous theory yields a finite-time convergence analysis for the asynchronous sampling setting, specializing to stochastic gradient descent in the inner loop. Our results deliver an explicit characterization of the bias-variance trade-off induced by the target update period, showing how to optimally set this…
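The nested scheme described in the abstract can be sketched for a small tabular problem as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function name `q_learning_with_target`, the uniform sampling of state-action pairs, the fixed step size, and the toy two-state MDP are all hypothetical choices made for the example.

```python
import numpy as np


def q_learning_with_target(P, R, gamma, periods, lr=0.5, seed=0):
    """Tabular Q-learning with a frozen target table.

    P: (S, A, S) transition probabilities, R: (S, A) rewards.
    periods: inner-step counts, one per outer iteration (hypothetical schedule).
    """
    rng = np.random.default_rng(seed)
    S, A = R.shape
    Q = np.zeros((S, A))
    for K in periods:                      # outer loop: one target update each
        target = Q.copy()                  # freeze the target table
        for _ in range(K):                 # inner loop: approximate T(target)
            s = rng.integers(S)            # asynchronous: one entry per step
            a = rng.integers(A)            # (uniform sampling is an assumption)
            s_next = rng.choice(S, p=P[s, a])
            y = R[s, a] + gamma * target[s_next].max()
            Q[s, a] += lr * (y - Q[s, a])  # SGD step on 0.5 * (Q[s, a] - y)**2
    return Q


# Toy deterministic MDP with 2 states and 2 actions (made up for illustration).
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0   # state 0, action 0 -> stay in state 0
P[0, 1, 1] = 1.0   # state 0, action 1 -> move to state 1
P[1, 0, 0] = 1.0   # state 1, action 0 -> move to state 0
P[1, 1, 1] = 1.0   # state 1, action 1 -> stay in state 1
R = np.array([[0.0, 1.0], [0.5, 0.0]])
gamma = 0.5

Q = q_learning_with_target(P, R, gamma, periods=[40] * 60)
print(Q)
```

Each outer iteration freezes a copy of the table and the inner loop takes stochastic gradient steps toward the Bellman optimality backup of that frozen copy. The length of the inner loop is the target update period: a short period leaves the backup only roughly approximated, while a long one spends more samples per outer iteration.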
Taxonomy
Topics: Age of Information Optimization · Stochastic Gradient Optimization Techniques · Reinforcement Learning in Robotics
