Multi-Hop Knowledge Graph Reasoning with Reward Shaping
Xi Victoria Lin, Richard Socher, Caiming Xiong

TL;DR
This paper introduces a reinforcement learning approach to multi-hop knowledge graph reasoning that mitigates the effects of false negatives and spurious paths, improving query-answering performance on benchmark datasets.
Contribution
The paper proposes two techniques: a pretrained one-hop embedding model estimates the reward for facts not observed in the training data, and random edge masks push the agent to explore a more diverse set of paths.
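The random edge mask idea can be illustrated with a minimal sketch. It assumes the policy has already produced a probability for each outgoing edge at the current node. The function names, the `keep_rate` parameter, and the fallback when every edge is masked are illustrative assumptions, not details taken from the paper.

```python
import random

def mask_edges(probs, keep_rate, rng):
    """Randomly mask outgoing edges, then renormalize the surviving probabilities.

    probs     -- policy probabilities over outgoing edges (assumed to sum to 1)
    keep_rate -- hypothetical parameter: chance that each edge survives the mask
    rng       -- a random.Random instance, passed in for reproducibility
    """
    if not probs:
        return []
    mask = [1.0 if rng.random() < keep_rate else 0.0 for _ in probs]
    if sum(mask) == 0.0:
        # Assumption: if every edge was masked, fall back to keeping all of them.
        mask = [1.0] * len(probs)
    masked = [p * m for p, m in zip(probs, mask)]
    total = sum(masked)
    if total == 0.0:
        # Surviving edges all had zero probability; return the input unchanged.
        return list(probs)
    return [p / total for p in masked]

def sample_edge(probs, keep_rate, rng):
    """Sample the index of the next edge from the masked distribution."""
    perturbed = mask_edges(probs, keep_rate, rng)
    return rng.choices(range(len(probs)), weights=perturbed, k=1)[0]
```

In this sketch the mask only perturbs the distribution used for sampling, so an edge the policy currently favors is occasionally unavailable and the agent has to try an alternative path.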
Findings
Significant improvements over existing path-based models on several benchmark datasets.
Performance comparable or superior to embedding-based models.
Greater robustness to false negatives and spurious paths.
Abstract
Multi-hop reasoning is an effective approach for query answering (QA) over incomplete knowledge graphs (KGs). The problem can be formulated in a reinforcement learning (RL) setup, where a policy-based agent sequentially extends its inference path until it reaches a target. However, in an incomplete KG environment, the agent receives low-quality rewards corrupted by false negatives in the training data, which harms generalization at test time. Furthermore, since no golden action sequence is used for training, the agent can be misled by spurious search trajectories that incidentally lead to the correct answer. We propose two modeling advances to address both issues: (1) we reduce the impact of false negative supervision by adopting a pretrained one-hop embedding model to estimate the reward of unobserved facts; (2) we counter the sensitivity to spurious paths of on-policy RL by forcing…
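The first idea in the abstract, estimating the reward of unobserved facts with a pretrained one-hop embedding model, can be sketched as follows. This is a minimal illustration under stated assumptions: the agent receives a reward of 1 when the reached entity forms an observed fact, and otherwise receives a soft score in (0, 1) from the embedding model. The DistMult-style scorer, the toy embeddings, and all function names are hypothetical stand-ins, not the authors' implementation.

```python
import math

def distmult_score(head_vec, rel_vec, tail_vec):
    """Stand-in for a pretrained one-hop model: sigmoid of a trilinear product."""
    raw = sum(h * r * t for h, r, t in zip(head_vec, rel_vec, tail_vec))
    return 1.0 / (1.0 + math.exp(-raw))

def shaped_reward(source, relation, reached, known_facts, entity_emb, relation_emb):
    """Terminal reward for an agent that stopped at `reached`.

    Observed facts get the full reward of 1. Unobserved facts are not
    treated as wrong outright (they may be false negatives); they get
    the embedding model's plausibility estimate instead.
    """
    if (source, relation, reached) in known_facts:
        return 1.0
    return distmult_score(
        entity_emb[source], relation_emb[relation], entity_emb[reached]
    )

# Toy data, purely illustrative.
entity_emb = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [-1.0, 0.0]}
relation_emb = {"r": [2.0, 0.0]}
known_facts = {("a", "r", "b")}

observed = shaped_reward("a", "r", "b", known_facts, entity_emb, relation_emb)
plausible = shaped_reward("b", "r", "a", known_facts, entity_emb, relation_emb)
implausible = shaped_reward("a", "r", "c", known_facts, entity_emb, relation_emb)
```

With a plain binary reward, both unobserved triples above would score 0. The shaped version still separates the triple the embedding model finds plausible from the one it does not.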
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Multimodal Machine Learning Applications
