ALL Dolphins Are Intelligent and SOME Are Friendly: Probing BERT for Nouns' Semantic Properties and their Prototypicality
Marianna Apidianaki, Aina Garí Soler

TL;DR
This paper investigates BERT's understanding of nouns' semantic properties and prototypicality, revealing limited knowledge in zero-shot settings but improved reasoning with fine-tuning on entailment tasks.
Contribution
It introduces a novel probing approach for noun properties using psycholinguistic datasets and demonstrates BERT's enhanced reasoning capabilities after fine-tuning.
Findings
BERT has limited zero-shot knowledge of noun semantic features.
Fine-tuning improves BERT's ability to reason about adjective-noun constructions.
Evaluation of semantic property knowledge is challenging due to dataset and task limitations.
Abstract
Large scale language models encode rich commonsense knowledge acquired through exposure to massive data during pre-training, but their understanding of entities and their semantic properties is unclear. We probe BERT (Devlin et al., 2019) for the properties of English nouns as expressed by adjectives that do not restrict the reference scope of the noun they modify (as in "red car"), but instead emphasise some inherent aspect ("red strawberry"). We base our study on psycholinguistics datasets that capture the association strength between nouns and their semantic features. We probe BERT using cloze tasks and in a classification setting, and show that the model has marginal knowledge of these features and their prevalence as expressed in these datasets. We discuss factors that make evaluation challenging and impede drawing general conclusions about the models' knowledge of noun properties.…
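The cloze-style probing mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the quantifier templates, the function names (`build_cloze_prompts`, `rank_candidates`, `precision_at_k`), and the toy scorer are assumptions made for illustration. In a real probe, the scorer would be the probability a masked language model such as BERT assigns to each adjective at the masked position.

```python
from typing import Callable, Dict, List, Set

# Assumed quantifiers for illustration; the empty string yields a bare-noun prompt.
QUANTIFIERS = ["", "all", "most", "some"]


def build_cloze_prompts(noun_plural: str, mask_token: str = "[MASK]") -> Dict[str, str]:
    """Build one cloze prompt per quantifier, e.g. 'all strawberries are [MASK].'"""
    prompts = {}
    for quantifier in QUANTIFIERS:
        words = ([quantifier] if quantifier else []) + [noun_plural, "are", mask_token]
        prompts[quantifier or "bare"] = " ".join(words) + "."
    return prompts


def rank_candidates(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[str]:
    """Order candidate adjectives from highest to lowest score for the prompt."""
    return sorted(candidates, key=lambda adj: score_fn(prompt, adj), reverse=True)


def precision_at_k(ranked: List[str], gold: Set[str], k: int) -> float:
    """Fraction of the top-k ranked adjectives that are gold properties."""
    return sum(1 for adj in ranked[:k] if adj in gold) / k


# Toy stand-in for a masked language model: fixed, made-up scores.
TOY_SCORES = {"red": 0.9, "sweet": 0.7, "loud": 0.1}


def toy_score(prompt: str, adjective: str) -> float:
    """Hypothetical scorer that ignores the prompt; a real probe would query the model."""
    return TOY_SCORES.get(adjective, 0.0)


prompts = build_cloze_prompts("strawberries")
ranked = rank_candidates(prompts["all"], ["loud", "sweet", "red"], toy_score)
```

Comparing the rankings obtained under different quantifiers ("all" versus "some") is one way to ask whether the model treats a property as prototypical rather than merely possible. The gold properties would come from a psycholinguistic feature-norm dataset of the kind the abstract describes.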
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · WordPiece · Adam · Dense Connections · Softmax · Dropout · Layer Normalization · Attention Dropout
