Derivational Probing: Unveiling the Layer-wise Derivation of Syntactic Structures in Neural Language Models
Taiga Someya, Ryo Yoshida, Hitomi Yanaka, Yohei Oseki

TL;DR
This paper introduces Derivational Probing, a method for analyzing how neural language models such as BERT build syntactic structures across layers. Experiments on BERT reveal a bottom-up derivation, and the timing of macro-syntactic structure construction turns out to matter for subject-verb number agreement.
Contribution
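The paper proposes a probing technique that traces how syntactic structures are derived layer by layer in neural language models, from micro-syntactic structures (e.g., subject noun phrases) to macro-syntactic structures (e.g., the relationship between root verbs and their direct dependents).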
Findings
Micro-syntactic structures emerge in lower layers.
Macro-syntactic structures are integrated in higher layers.
The timing of macro-syntactic structure construction is critical for downstream performance on subject-verb number agreement.
Abstract
Recent work has demonstrated that neural language models encode syntactic structures in their internal representations, yet the derivations by which these structures are constructed across layers remain poorly understood. In this paper, we propose Derivational Probing to investigate how micro-syntactic structures (e.g., subject noun phrases) and macro-syntactic structures (e.g., the relationship between the root verbs and their direct dependents) are constructed as word embeddings propagate upward across layers. Our experiments on BERT reveal a clear bottom-up derivation: micro-syntactic structures emerge in lower layers and are gradually integrated into a coherent macro-syntactic structure in higher layers. Furthermore, a targeted evaluation on subject-verb number agreement shows that the timing of constructing macro-syntactic structures is critical for downstream performance,…
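The general idea of probing layer by layer can be illustrated with a minimal sketch. This is not the paper's Derivational Probing procedure, whose details are not given in this summary; it is a generic layer-wise linear probe run on synthetic data. Every name here (`make_synthetic_layers`, `layerwise_probe_scores`, the scalar `targets`, the 0.5 threshold) is a hypothetical choice made for illustration, and the synthetic representations are constructed so that the target becomes more linearly decodable in higher layers.

```python
import numpy as np


def fit_linear_probe(X, y, ridge=1e-3):
    """Closed-form ridge regression; returns weights with a trailing bias term."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def probe_r2(w, X, y):
    """Coefficient of determination of the probe on held-out data."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    pred = Xb @ w
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def layerwise_probe_scores(hidden_states, targets, train_frac=0.8):
    """Fit one independent probe per layer and return held-out R^2 per layer.

    hidden_states: array of shape (n_layers, n_tokens, dim)
    targets: array of shape (n_tokens,), a hypothetical scalar syntactic label
    """
    n_layers, n_tokens, _ = hidden_states.shape
    split = int(train_frac * n_tokens)
    scores = []
    for layer in range(n_layers):
        X = hidden_states[layer]
        w = fit_linear_probe(X[:split], targets[:split])
        scores.append(probe_r2(w, X[split:], targets[split:]))
    return np.array(scores)


def make_synthetic_layers(n_layers=6, n_tokens=400, dim=16, seed=0):
    """Synthetic stand-in for model activations (assumption, not real BERT data).

    The target is written into one random direction with a strength that
    grows linearly from 0 at the bottom layer to 1 at the top layer.
    """
    rng = np.random.default_rng(seed)
    targets = rng.normal(size=n_tokens)
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    layers = []
    for layer in range(n_layers):
        strength = layer / (n_layers - 1)
        noise = rng.normal(size=(n_tokens, dim))
        layers.append(noise + 3.0 * strength * np.outer(targets, direction))
    return np.stack(layers), targets


hidden_states, targets = make_synthetic_layers()
scores = layerwise_probe_scores(hidden_states, targets)

# Index of the first layer whose probe clears an arbitrary threshold:
# a crude proxy for "when" the information becomes decodable.
first_layer_above = int(np.argmax(scores > 0.5))

for layer, score in enumerate(scores):
    print(f"layer {layer}: held-out R^2 = {score:.3f}")
print("first layer above threshold:", first_layer_above)
```

In a real setting, `hidden_states` would come from a model's per-layer outputs and `targets` from treebank annotations; both are replaced by synthetic data here so the sketch runs without external resources.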
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
