Relational Schemata in BERT Are Inducible, Not Emergent: A Study of Performance vs. Competence in Language Models
Cole Gawin

TL;DR
This paper investigates whether relational schemata in BERT emerge from pretraining or must be induced, and finds that they can be induced by task-specific fine-tuning but do not emerge from pretraining alone.
Contribution
The study demonstrates that relational schemata in BERT are not naturally emergent but can be induced through supervised fine-tuning, clarifying the nature of conceptual competence in language models.
Findings
Pretrained BERT supports high relation-classification accuracy, indicating latent relational signals.
Relational organization in embeddings appears only after fine-tuning.
Relational schemata are inducible, not inherently emergent from pretraining.
Abstract
While large language models like BERT demonstrate strong empirical performance on semantic tasks, whether this reflects true conceptual competence or surface-level statistical association remains unclear. I investigate whether BERT encodes abstract relational schemata by examining internal representations of concept pairs across taxonomic, mereological, and functional relations. I compare BERT's relational classification performance with representational structure in [CLS] token embeddings. Results reveal that pretrained BERT enables high classification accuracy, indicating latent relational signals. However, concept pairs organize by relation type in high-dimensional embedding space only after fine-tuning on supervised relation classification tasks. This indicates relational schemata are not emergent from pretraining alone but can be induced via task scaffolding. These findings…
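The contrast the abstract draws between classification performance and representational structure can be illustrated with a small sketch. This is not the paper's code or data. The embeddings are synthetic stand-ins for [CLS] vectors, and the least-squares probe and silhouette score are assumed choices of measurement rather than the author's stated method. The function names (`make_embeddings`, `linear_probe_accuracy`, `silhouette_score`) and all parameter values are hypothetical.

```python
import numpy as np

RELATIONS = ["taxonomic", "mereological", "functional"]


def make_embeddings(rng, labels, signal_scale, signal_noise, other_noise, dim=20):
    """Synthetic stand-in for [CLS] vectors (NOT real BERT output).

    The first len(RELATIONS) dimensions carry a one-hot relation signal;
    the remaining dimensions are pure Gaussian noise.
    """
    k = len(RELATIONS)
    X = rng.normal(0.0, other_noise, size=(len(labels), dim))
    signal = signal_scale * np.eye(k)[labels]
    X[:, :k] = signal + rng.normal(0.0, signal_noise, size=(len(labels), k))
    return X


def linear_probe_accuracy(X_train, y_train, X_test, y_test):
    """Held-out accuracy of a least-squares linear probe on one-hot targets."""
    k = len(RELATIONS)
    A_train = np.hstack([X_train, np.ones((len(X_train), 1))])  # add bias column
    W, *_ = np.linalg.lstsq(A_train, np.eye(k)[y_train], rcond=None)
    A_test = np.hstack([X_test, np.ones((len(X_test), 1))])
    return float(((A_test @ W).argmax(axis=1) == y_test).mean())


def silhouette_score(X, labels):
    """Mean silhouette coefficient (Euclidean): near 1 means tight, well
    separated relation clusters; near 0 means no cluster structure."""
    sq = np.sum(X ** 2, axis=1)
    D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    np.fill_diagonal(D, 0.0)
    classes = np.unique(labels)
    counts = np.array([(labels == c).sum() for c in classes])
    sums = np.stack([D[:, labels == c].sum(axis=1) for c in classes], axis=1)
    own = np.searchsorted(classes, labels)   # column index of each point's class
    idx = np.arange(len(labels))
    a = sums[idx, own] / (counts[own] - 1)   # mean distance within own class
    means = sums / counts
    means[idx, own] = np.inf                 # exclude own class when taking the min
    b = means.min(axis=1)                    # mean distance to nearest other class
    return float(((b - a) / np.maximum(a, b)).mean())


rng = np.random.default_rng(0)
labels = np.repeat(np.arange(len(RELATIONS)), 100)
train, test = np.arange(0, 300, 2), np.arange(1, 300, 2)

# Hypothetical settings: a weak signal buried in large noise versus a signal
# that dominates the geometry of the space.
configs = {
    "pretrained-like": dict(signal_scale=1.0, signal_noise=0.1, other_noise=5.0),
    "fine-tuned-like": dict(signal_scale=10.0, signal_noise=1.0, other_noise=1.0),
}
results = {}
for name, cfg in configs.items():
    X = make_embeddings(rng, labels, **cfg)
    acc = linear_probe_accuracy(X[train], labels[train], X[test], labels[test])
    sil = silhouette_score(X, labels)
    results[name] = (acc, sil)
    print(f"{name}: probe accuracy={acc:.2f}, silhouette={sil:.2f}")
```

In this toy setup both configurations are linearly decodable, but only the second shows clear clustering by relation type. That mirrors the distinction the abstract makes: a classifier can recover a latent signal even when that signal does not organize the embedding space.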
Taxonomy
Topics: Natural Language Processing Techniques
