Graph-attention-based Casual Discovery with Trust Region-navigated Clipping Policy Optimization

Shixuan Liu; Yanghe Feng; Keyu Wu; Guangquan Cheng; Jincai Huang; Zhong Liu

arXiv:2412.19578·cs.LG·December 30, 2024

TL;DR

This paper introduces a novel reinforcement learning approach with trust region navigation and a refined graph attention encoder for more robust and efficient causal structure discovery, outperforming previous methods on synthetic and benchmark datasets.

Contribution

The paper proposes a trust region-navigated clipping policy optimization method and a new SDGAT encoder to improve the performance and stability of causal discovery.

Findings

01

Outperforms previous RL methods on synthetic datasets.

02

Achieves better robustness and efficiency in causal structure learning.

03

Demonstrates superior results on benchmark datasets.

Abstract

In many domains of the empirical sciences, discovering the causal structure among variables remains an indispensable task. Recently, to tackle the unoriented edges or latent assumption violations suffered by conventional methods, researchers formulated a reinforcement learning (RL) procedure for causal discovery and used the REINFORCE algorithm to search for the best-rewarded directed acyclic graph. The two keys to the overall performance of the procedure are the robustness of the RL method and the efficient encoding of variables. However, on the one hand, REINFORCE is prone to local convergence and unstable performance during training. Neither trust region policy optimization, being computationally expensive, nor proximal policy optimization (PPO), suffering from aggregate constraint deviation, is a decent alternative for combinatorial optimization problems with considerable individual…
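To make the contrast between the baselines named in the abstract concrete, the sketch below compares a plain REINFORCE surrogate loss with the textbook PPO clipped surrogate. It is only an illustration of those two standard objectives, not the paper's trust region-navigated clipping method; the function names, the `baseline` argument, and the `clip_eps=0.2` default are assumptions introduced here.

```python
import numpy as np


def reinforce_loss(log_probs, rewards, baseline=0.0):
    """REINFORCE surrogate: minimise -(reward - baseline) * log pi(action)."""
    log_probs = np.asarray(log_probs, dtype=float)
    advantages = np.asarray(rewards, dtype=float) - baseline
    return float(-np.mean(advantages * log_probs))


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Textbook PPO clipped surrogate, negated so that lower is better."""
    new_log_probs = np.asarray(new_log_probs, dtype=float)
    old_log_probs = np.asarray(old_log_probs, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    # Probability ratio between the updated policy and the sampling policy.
    ratio = np.exp(new_log_probs - old_log_probs)
    # Each sample's ratio is clipped independently to [1 - eps, 1 + eps].
    clipped_ratio = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Pessimistic (element-wise minimum) of the unclipped and clipped terms.
    surrogate = np.minimum(ratio * advantages, clipped_ratio * advantages)
    return float(-np.mean(surrogate))
```

In this sketch the clip acts on each sample's ratio separately, so a positive-advantage sample whose ratio has grown to 2.0 contributes as if the ratio were 1.2. The clip bounds individual ratios only; it does not directly bound a divergence between the old and new policies as a whole.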

Taxonomy

Methods: Softmax · Attention Is All You Need · REINFORCE · Entropy Regularization · Proximal Policy Optimization