Garden-Path Traversal in GPT-2
William Jurayj, William Rudman, Carsten Eickhoff

TL;DR
This paper introduces methods for analyzing GPT-2's hidden states, using the model's navigation of garden path sentences as a case study. The analysis shows how the model handles ambiguity and where traditional surprisal measures fall short.
Contribution
The paper presents analysis techniques for transformer decoder hidden states, compiles the largest currently available dataset of garden path sentences, and applies the techniques to that dataset.
Findings
Manhattan distances and cosine similarities between hidden states provide more reliable insights than surprisal computed from next-token probabilities.
Negating tokens have minimal impact on the model's representations for unambiguous forms of sentences.
Hidden state analysis reveals ambiguity periods that surprisal misses.
Abstract
In recent years, large-scale transformer decoders such as the GPT-x family of models have become increasingly popular. Studies examining the behavior of these models tend to focus only on the output of the language modeling head and avoid analysis of the internal states of the transformer decoder. In this study, we present a collection of methods to analyze the hidden states of GPT-2 and use the model's navigation of garden path sentences as a case study. To enable this, we compile the largest currently available dataset of garden path sentences. We show that Manhattan distances and cosine similarities provide more reliable insights compared to established surprisal methods that analyze next-token probabilities computed by a language modeling head. Using these methods, we find that negating tokens have minimal impacts on the model's representations for unambiguous forms of sentences…
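The three quantities named in the abstract can be illustrated with a minimal sketch. It assumes hidden states are available as plain NumPy vectors. The toy vectors, the variable names, and the choice to compare one state from an ambiguous sentence against one from its unambiguous counterpart are hypothetical illustrations, not the authors' exact procedure.

```python
import numpy as np

def manhattan_distance(a, b):
    """L1 distance: sum of absolute element-wise differences."""
    return float(np.sum(np.abs(a - b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def surprisal(prob, base=2.0):
    """Surprisal of a token with probability `prob`, in bits when base=2."""
    return float(-np.log(prob) / np.log(base))

# Hypothetical 3-dimensional "hidden states" (real GPT-2 states are much larger).
h_ambiguous = np.array([1.0, 2.0, 3.0])
h_unambiguous = np.array([1.0, 0.0, 5.0])

print(manhattan_distance(h_ambiguous, h_unambiguous))  # 4.0
print(cosine_similarity(h_ambiguous, h_unambiguous))
print(surprisal(0.25))  # 2.0 bits
```

The distance and similarity functions compare internal representations directly, whereas surprisal depends only on the probability that the language modeling head assigns to the next token.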
Taxonomy
Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Layer Normalization · Byte Pair Encoding · Weight Decay · Dense Connections · Dropout · Cosine Annealing
