Understanding or Memorizing? A Case Study of German Definite Articles in Language Models
Jonathan Drechsel, Erisa Bytyqi, Steffen Herbold

TL;DR
This study investigates whether language models handle German definite articles through abstract grammatical rules or through memorization, and finds evidence that they rely at least partly on memorized associations rather than a purely rule-based encoding.
Contribution
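The paper applies GRADIEND, a gradient-based interpretability method, to learn parameter update directions for gender-case specific article transitions. The analysis shows that models rely at least partly on memorized associations rather than a strictly rule-based encoding of German definite articles.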
Findings
Parameter updates learned for a specific gender-case article transition frequently affect unrelated gender-case settings.
Neurons involved in article transitions overlap across different gender-case settings.
Results argue against a strictly rule-based encoding and suggest that models rely at least partly on memorized associations.
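To make the neuron-overlap finding concrete, here is a minimal sketch of one way such an overlap could be quantified: take the k neurons with the largest absolute update for each setting and compare the resulting sets with the Jaccard index. This is an illustration, not the authors' procedure. The function names, the setting labels, the choice of k, and the toy update vectors are all hypothetical, and the paper may measure overlap differently.

```python
import itertools
import random


def top_k_neurons(update, k):
    """Return the indices of the k entries with the largest absolute value."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard overlap |a & b| / |a | b| of two index sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: two settings share an underlying direction plus noise,
# and a third vector is drawn independently as a baseline.
rng = random.Random(0)
shared = [rng.gauss(0, 1) for _ in range(200)]
updates = {
    "setting A (hypothetical)": [s + 0.3 * rng.gauss(0, 1) for s in shared],
    "setting B (hypothetical)": [s + 0.3 * rng.gauss(0, 1) for s in shared],
    "independent baseline": [rng.gauss(0, 1) for _ in range(200)],
}

top = {name: top_k_neurons(vec, k=20) for name, vec in updates.items()}
for (name_a, set_a), (name_b, set_b) in itertools.combinations(top.items(), 2):
    print(f"{name_a} vs {name_b}: Jaccard = {jaccard(set_a, set_b):.2f}")
```

Under a strictly rule-based encoding one would expect updates for different gender-case settings to touch largely separate neurons, so a high overlap relative to an independent baseline is the kind of signal the finding describes.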
Abstract
Language models perform well on grammatical agreement, but it is unclear whether this reflects rule-based generalization or memorization. We study this question for German definite singular articles, whose forms depend on gender and case. Using GRADIEND, a gradient-based interpretability method, we learn parameter update directions for gender-case specific article transitions. We find that updates learned for a specific gender-case article transition frequently affect unrelated gender-case settings, with substantial overlap among the most affected neurons across settings. These results argue against a strictly rule-based encoding of German definite articles, indicating that models at least partly rely on memorized associations rather than abstract grammatical rules.
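For readers unfamiliar with the paradigm, the sketch below encodes the standard German definite singular articles as a lookup from case and gender to form. The table is textbook German grammar rather than something taken from the paper, and the helper names are hypothetical. It shows why the setting is a useful test case: the twelve case-gender cells map to only six distinct forms, so the same surface form (for example "der") serves several unrelated cells.

```python
from collections import defaultdict

# Standard German definite singular articles, keyed by (case, gender).
ARTICLES = {
    ("nominative", "masculine"): "der",
    ("nominative", "feminine"): "die",
    ("nominative", "neuter"): "das",
    ("accusative", "masculine"): "den",
    ("accusative", "feminine"): "die",
    ("accusative", "neuter"): "das",
    ("dative", "masculine"): "dem",
    ("dative", "feminine"): "der",
    ("dative", "neuter"): "dem",
    ("genitive", "masculine"): "des",
    ("genitive", "feminine"): "der",
    ("genitive", "neuter"): "des",
}


def definite_article(case, gender):
    """Rule-style lookup: the form is fully determined by case and gender."""
    return ARTICLES[(case, gender)]


def cells_by_form():
    """Group the (case, gender) cells by the article form they share."""
    groups = defaultdict(list)
    for cell, form in ARTICLES.items():
        groups[form].append(cell)
    return dict(groups)


for form, cells in cells_by_form().items():
    print(form, "->", cells)
```

A transition such as nominative masculine to accusative masculine changes "der" to "den", while nominative feminine to dative feminine changes "die" to "der". A model that encoded the rule cleanly would, in principle, keep such cells separable even though they reuse the same word forms.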
