Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

Mansi Sakarvadia; Aswathy Ajith; Arham Khan; Daniel Grzenda; Nathaniel Hudson; André Bauer; Kyle Chard; Ian Foster

arXiv:2309.05605·cs.CL·March 1, 2024

Open Access · 1 Repo · 1 Dataset

TL;DR

This paper introduces a method to improve multi-hop reasoning in transformer-based language models by injecting targeted memories into attention layers, significantly enhancing their reasoning accuracy during inference.

Contribution

It presents a novel memory injection technique that corrects reasoning failures in LLMs by adding prompt-specific information at critical attention points.

Findings

01

Memory injections can increase the predicted probability of the correct next token by up to 424%.

02

Targeted memory injections improve multi-hop reasoning performance.

03

Analysis of GPT-2 activations guides effective memory placement.

Abstract

Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By thus enabling the LLM to incorporate additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can often increase the probability of the…
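
The injection idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the vocabulary, the matrices `UNEMBED` and `MEMORY_EMBED`, the helper names, and the scaling factor `tau` are all hypothetical, and the sketch assumes the memory is added to a single attention layer's output as a scaled sum of token embeddings. It only shows how adding a relevant memory ("France") to a hidden state could shift next-token probability toward the desired completion ("Paris").

```python
import math

# Toy, hand-picked numbers: nothing here comes from GPT-2 or the paper's data.
VOCAB = ["Paris", "Berlin", "the"]

# Hypothetical unembedding matrix: one row per vocabulary token.
UNEMBED = [
    [1.0, 1.0, 0.0],  # "Paris" reads the "France" direction (dimension 1)
    [1.0, 0.0, 1.0],  # "Berlin"
    [0.5, 0.0, 0.0],  # "the"
]

# Hypothetical embeddings for tokens that may be injected as a memory.
MEMORY_EMBED = {"France": [0.0, 1.0, 0.0]}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def inject_memory(attn_out, memory_tokens, tau):
    """Assumed injection rule: attn_out + tau * (sum of memory embeddings)."""
    injected = list(attn_out)
    for tok in memory_tokens:
        for i, v in enumerate(MEMORY_EMBED[tok]):
            injected[i] += tau * v
    return injected


def next_token_probs(hidden):
    """Project a hidden state onto the toy vocabulary and normalize."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in UNEMBED]
    return dict(zip(VOCAB, softmax(logits)))


# Stand-in for an attention layer's output on a multi-hop prompt.
attn_out = [1.0, 0.0, 0.2]

before = next_token_probs(attn_out)
after = next_token_probs(inject_memory(attn_out, ["France"], tau=2.0))

# Relative change in the probability of the desired token, in percent.
pct = 100.0 * (after["Paris"] - before["Paris"]) / before["Paris"]
print(f"P(Paris) before={before['Paris']:.3f} "
      f"after={after['Paris']:.3f} change={pct:.0f}%")
```

In this toy setup the un-injected state favors the wrong token, and the injection flips the top prediction. In a real model the choice of layer and of `tau` would have to be determined empirically.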

Code & Models

Repositories

msakarvadia/memory_injections
jax · Official

Datasets

msakarvadia/handwritten_multihop_reasoning_data
dataset · 11 downloads

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Linear Layer · Dense Connections · Attention Dropout · Residual Connection · Discriminative Fine-Tuning · Adam