Improving Parametric Knowledge Access in Reasoning Language Models
Melody Ma, John Hewitt

TL;DR
This paper explores how reasoning prompts and reinforcement learning can enhance language models' ability to access and recall world knowledge stored in their parameters, improving performance on knowledge-based tasks.
Contribution
It demonstrates that simple reasoning cues improve knowledge recall and introduces a reinforcement learning approach to train models for better parametric knowledge access.
Findings
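Adding a simple "think step-by-step" cue yields a statistically significant improvement in knowledge recall, but not in math.
Reinforcement learning on TriviaQA improves TriviaQA performance (+9.9%) and also improves performance on other datasets.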
Models can be easily trained to reason better for world knowledge access.
Abstract
We study reasoning for accessing world knowledge stored in a language model's parameters. For example, recalling that Canberra is Australia's capital may benefit from thinking through major cities and the concept of purpose-built capitals. While reasoning language models are trained via reinforcement learning to produce reasoning traces on tasks such as mathematics, they may not reason well for accessing their own world knowledge. We first find that models do not generate their best world knowledge reasoning by default: adding a simple "think step-by-step" cue demonstrates statistically significant improvement in knowledge recall but not math. Motivated by this, we propose training models to reason over their parametric knowledge using world-knowledge question answering as a verifiable reward. After reinforcement learning on TriviaQA (+9.9%), performance also improves on Natural…
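The abstract's idea of using world-knowledge question answering as a verifiable reward can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`normalize_answer`, `extract_final_answer`, `knowledge_reward`), the `Answer:` marker convention, and the binary exact-match scoring against a list of accepted answer aliases. The paper's actual reward may differ.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def extract_final_answer(completion: str, marker: str = "Answer:") -> str:
    """Return the first line after the last marker, or "" if the marker is absent.

    The marker convention is a hypothetical output format, not the paper's.
    """
    _, found, tail = completion.rpartition(marker)
    if not found:
        return ""
    tail = tail.strip()
    return tail.splitlines()[0].strip() if tail else ""


def knowledge_reward(completion: str, gold_aliases: list[str]) -> float:
    """Binary reward: 1.0 if the normalized final answer matches any gold alias."""
    predicted = normalize_answer(extract_final_answer(completion))
    if not predicted:
        return 0.0
    gold = {normalize_answer(alias) for alias in gold_aliases}
    return 1.0 if predicted in gold else 0.0


if __name__ == "__main__":
    trace = (
        "Sydney and Melbourne are the largest cities, but the capital "
        "was purpose-built.\nAnswer: Canberra."
    )
    print(knowledge_reward(trace, ["Canberra"]))
```

Because the reward checks only the final answer, the reasoning trace before the marker is left unconstrained, which is what lets reinforcement learning shape how the model reasons toward a recalled fact.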
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Graph Neural Networks
