Identifying a Circuit for Verb Conjugation in GPT-2
David Demitri Africa

TL;DR
This paper identifies and interprets a specific sub-network within GPT-2 Small that is responsible for subject-verb agreement, revealing how a small part of the model can perform this grammatical task effectively.
Contribution
The study introduces a method to isolate and analyze the circuit responsible for verb conjugation in GPT-2 Small, advancing understanding of model interpretability.
Findings
A small sub-network significantly influences verb conjugation.
Near-model performance is achieved on the base task with only a small fraction of the network's component-token pairs.
More complex tasks require larger portions of the network.
Abstract
I implement a procedure to isolate and interpret the sub-network (or "circuit") responsible for subject-verb agreement in GPT-2 Small. In this study, the model is given prompts where the subject is either singular (e.g. "Alice") or plural (e.g. "Alice and Bob"), and the task is to correctly predict the appropriate verb form ("walks" for singular subjects, "walk" for plural subjects). Using a series of techniques, including performance verification, automatic circuit discovery via direct path patching, and direct logit attribution, I isolate a candidate circuit that contributes significantly to the model's correct verb conjugation. The results suggest that only a small fraction of the network's component-token pairs is needed to achieve near-model performance on the base task, but that substantially more are needed in more complex settings.
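The task setup and the direct logit attribution step described in the abstract can be illustrated with a small toy sketch. This is not the author's implementation: the helper names (`make_prompt`, `logit_diff`, `direct_logit_attribution`), the prompt format, the three-dimensional "residual stream", and all numbers are hypothetical. It also assumes the final LayerNorm is ignored, so that the logits are a linear function of the summed component outputs.

```python
# Toy sketch only: names, prompt format, and numbers are hypothetical,
# and LayerNorm is ignored so the residual stream decomposes linearly.

def make_prompt(subjects):
    """Return (prompt, correct_verb, incorrect_verb) for a list of subject names."""
    prompt = " and ".join(subjects)
    if len(subjects) > 1:          # plural subject, e.g. "Alice and Bob"
        return prompt, " walk", " walks"
    return prompt, " walks", " walk"   # singular subject, e.g. "Alice"

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def logit_diff(logits, correct, incorrect):
    """Task metric: logit of the correct verb form minus the incorrect one."""
    return logits[correct] - logits[incorrect]

def direct_logit_attribution(component_outputs, unembed, correct, incorrect):
    """Project each component's output onto the (correct - incorrect) unembedding direction."""
    direction = [c - i for c, i in zip(unembed[correct], unembed[incorrect])]
    return {name: dot(vec, direction) for name, vec in component_outputs.items()}

# Hypothetical unembedding vectors for the two verb tokens (d_model = 3).
unembed = {" walks": [1.0, 0.0, 0.0], " walk": [0.0, 1.0, 0.0]}

# Hypothetical per-component contributions to the final residual stream.
component_outputs = {
    "head_a": [2.0, 0.5, 0.0],
    "head_b": [0.0, 0.0, 3.0],
    "mlp":    [0.5, 1.0, 1.0],
}

prompt, correct, incorrect = make_prompt(["Alice"])

# Full-model logits: unembed the sum of all component outputs.
residual = [sum(vals) for vals in zip(*component_outputs.values())]
logits = {tok: dot(residual, vec) for tok, vec in unembed.items()}

total = logit_diff(logits, correct, incorrect)
attributions = direct_logit_attribution(component_outputs, unembed, correct, incorrect)

print(prompt, "->", correct)
print("total logit difference:", total)
print("per-component attribution:", attributions)
```

Because the toy model is linear, the per-component attributions sum exactly to the total logit difference; in a real transformer this holds only up to the effect of the final normalization layer. A component with a large positive attribution (here the hypothetical `head_a`) would be a natural candidate for inclusion in a circuit.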
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
