Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
Mark Anderson, Jose Camacho-Collados

TL;DR
This paper investigates how well distributional semantic models capture trait-based relational knowledge (e.g. bananas-yellow) in English and Spanish, and examines how co-occurrence patterns in the training data affect what these semantic spaces encode.
Contribution
It evaluates the extent to which semantic models encode trait-based relations and analyzes the impact of co-occurrence data on this encoding across two languages.
Findings
Semantic spaces partially encode trait-based relations.
Co-occurrence patterns significantly influence relational knowledge capture.
Differences are observed between the English and Spanish semantic representations.
Abstract
The increase in performance in NLP due to the prevalence of distributional models and deep learning has brought with it a reciprocal decrease in interpretability. This has spurred a focus on what neural networks learn about natural language with less of a focus on how. Some work has focused on the data used to develop data-driven models, but typically this line of work aims to highlight issues with the data, e.g. highlighting and offsetting harmful biases. This work contributes to the relatively untrodden path of what is required in data for models to capture meaningful representations of natural language. This entails evaluating how well English and Spanish semantic spaces capture a particular type of relational knowledge, namely the traits associated with concepts (e.g. bananas-yellow), and exploring the role of co-occurrences in this context.
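To make the notion of concept-trait co-occurrence concrete, the following is a minimal sketch of counting how often a concept and a trait (e.g. bananas-yellow) appear within a fixed token window of each other. It is not the authors' pipeline: the function name cooccurrence_counts, the whitespace tokenisation, the window size, and the toy corpus are all assumptions chosen for illustration.

```python
from collections import Counter


def cooccurrence_counts(sentences, pairs, window=5):
    """Count (concept, trait) co-occurrences within `window` tokens.

    Hypothetical helper for illustration only: tokenisation is a plain
    lowercase whitespace split, and sentences are treated independently.
    """
    counts = Counter({pair: 0 for pair in pairs})
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Map each token to the positions where it occurs in this sentence.
        positions = {}
        for i, tok in enumerate(tokens):
            positions.setdefault(tok, []).append(i)
        for concept, trait in pairs:
            for i in positions.get(concept, []):
                for j in positions.get(trait, []):
                    if i != j and abs(i - j) <= window:
                        counts[(concept, trait)] += 1
    return counts


# Toy corpus (invented for this example, not data from the paper).
toy_corpus = [
    "the bananas on the table were yellow",
    "she bought bananas and apples",
    "yellow bananas ripen quickly",
]
pairs = [("bananas", "yellow"), ("apples", "red")]
counts = cooccurrence_counts(toy_corpus, pairs, window=5)
print(counts)
```

Counts of this kind could then be compared against how well a semantic space relates each concept to its trait, which is the sort of relationship the abstract describes exploring.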
Taxonomy
Topics: Natural Language Processing Techniques · Biomedical Text Mining and Ontologies · Topic Modeling
