Probing Implicit Bias in Semi-gradient Q-learning: Visualizing the Effective Loss Landscapes via the Fokker-Planck Equation

Shuyu Yin; Fei Wen; Peilin Liu; Tao Luo

arXiv:2406.08148 · cs.LG · June 13, 2024

TL;DR

This paper uses the Fokker-Planck equation to construct and visualize the effective loss landscape of semi-gradient Q-learning, showing how global minima of the loss landscape can turn into saddle points of the effective loss landscape and exposing the implicit bias of the semi-gradient method.

Contribution

The paper develops a new approach to probing implicit bias in semi-gradient Q-learning: visualizing effective loss landscapes via the Fokker-Planck equation.
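For readers unfamiliar with the connection, the textbook link between a noisy parameter update and a landscape can be sketched as follows. This is a generic illustration, not the paper's own construction: the drift $f$, the constant isotropic noise level $D$, and the definition of $L_{\text{eff}}$ through the stationary density $p_\infty$ are all assumptions made here for exposition, and real stochastic-gradient noise is generally state-dependent.

```latex
\begin{align}
  d\theta_t &= f(\theta_t)\,dt + \sqrt{2D}\,dW_t
    && \text{(noisy update, continuous-time limit)} \\
  \partial_t p(\theta,t) &= -\nabla\cdot\big(f(\theta)\,p(\theta,t)\big)
    + D\,\nabla^2 p(\theta,t)
    && \text{(Fokker-Planck equation)} \\
  f = -\nabla L \;&\Longrightarrow\; p_\infty(\theta) \propto \exp\!\big(-L(\theta)/D\big)
    && \text{(gradient drift)} \\
  L_{\text{eff}}(\theta) &:= -D\,\log p_\infty(\theta) + \text{const}
    && \text{(one possible effective loss for general } f\text{)}
\end{align}
```

When the drift is an exact gradient, $L_{\text{eff}}$ coincides with $L$ up to a constant. When it is not, as with a semi-gradient update, the two can differ, which is the kind of gap an effective loss landscape is meant to expose.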

Findings

01 Global minima of the loss landscape can become saddle points in the effective loss landscape.

02 Saddle points originating from global minima of the loss landscape persist in high-dimensional parameter spaces and neural network settings.

03 The visualization exposes the implicit bias of the semi-gradient method.

Abstract

Semi-gradient Q-learning is applied in many fields, but due to the absence of an explicit loss function, studying its dynamics and implicit bias in the parameter space is challenging. This paper introduces the Fokker-Planck equation and employs partial data obtained through sampling to construct and visualize the effective loss landscape within a two-dimensional parameter space. This visualization reveals how the global minima in the loss landscape can transform into saddle points in the effective loss landscape, as well as the implicit bias of the semi-gradient method. Additionally, we demonstrate that saddle points, originating from the global minima in the loss landscape, still exist in the effective loss landscape under high-dimensional parameter spaces and neural network settings. This paper develops a novel approach for probing implicit bias in semi-gradient Q-learning.
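The abstract's point that semi-gradient Q-learning has no explicit loss function can be illustrated with a minimal sketch. It assumes a linear Q-function and a random toy batch; every name below (`phi`, `phi_next`, `semi_gradient_field`, and so on) is hypothetical and is not taken from the paper or from the dayhost/fpe repository. The sketch compares the mean semi-gradient update direction with the negative gradient of the mean squared Bellman error, and checks the symmetry of each field's Jacobian, since a differentiable vector field can only be the gradient of a scalar function where its Jacobian is symmetric.

```python
import numpy as np

GAMMA = 0.9
N, K, D = 64, 3, 2  # samples, actions per next state, parameters

# Hypothetical toy batch (random features, not data from the paper).
rng = np.random.default_rng(0)
phi = rng.normal(size=(N, D))          # features of sampled (s, a)
rewards = rng.normal(size=N)
phi_next = rng.normal(size=(N, K, D))  # features of (s', a') for each a'

def td_error(theta):
    """TD error r + gamma * max_a' Q(s', a') - Q(s, a) for linear Q."""
    q = phi @ theta
    q_next = phi_next @ theta          # shape (N, K)
    return rewards + GAMMA * q_next.max(axis=1) - q

def semi_gradient_field(theta):
    """Mean semi-gradient update direction: the target is held fixed."""
    delta = td_error(theta)
    return (delta[:, None] * phi).mean(axis=0)

def residual_gradient_field(theta):
    """Negative gradient of 0.5 * mean(delta**2): differentiates the target too."""
    delta = td_error(theta)
    best = (phi_next @ theta).argmax(axis=1)
    phi_best = phi_next[np.arange(N), best]
    return (delta[:, None] * (phi - GAMMA * phi_best)).mean(axis=0)

def jacobian(field, theta, eps=1e-6):
    """Central finite-difference Jacobian of a vector field at theta."""
    jac = np.zeros((D, D))
    for j in range(D):
        step = np.zeros(D)
        step[j] = eps
        jac[:, j] = (field(theta + step) - field(theta - step)) / (2 * eps)
    return jac

theta = np.array([0.3, -0.2])
J_semi = jacobian(semi_gradient_field, theta)
J_res = jacobian(residual_gradient_field, theta)

# Off-diagonal mismatch: nonzero means the field is not a gradient there.
print("semi-gradient asymmetry:    ", abs(J_semi[0, 1] - J_semi[1, 0]))
print("residual-gradient asymmetry:", abs(J_res[0, 1] - J_res[1, 0]))
```

In this toy setting the residual-gradient Jacobian is symmetric up to finite-difference error, while the semi-gradient Jacobian is not, so no scalar loss has the semi-gradient field as its negative gradient. That is the gap an effective loss landscape is meant to fill.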

Code & Models

Repositories

dayhost/fpe (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Q-Learning