Measuring Context-Word Biases in Lexical Semantic Datasets
Qianchu Liu, Diana McCarthy, Anna Korhonen

TL;DR
This paper analyzes how current lexical semantic datasets may be biased towards either context or word cues, showing that these datasets often do not test models for true word-in-context understanding in the way they test humans, and proposes a framework to measure and visualize these biases.
Contribution
It introduces a novel quantitative framework to analyze and visualize context-word biases in lexical semantic datasets, highlighting differences between models and humans.
Findings
Models exhibit strong bias towards either context or word cues in datasets.
Humans perform better with both context and word available, showing less bias.
Proposed measures help understand and control dataset biases for better model evaluation.
Abstract
State-of-the-art pretrained contextualized models (PCM), e.g. BERT, use tasks such as WiC and WSD to evaluate their word-in-context representations. This inherently assumes that performance in these tasks reflects how well a model represents the coupled word and context semantics. We question this assumption by presenting the first quantitative analysis on the context-word interaction being tested in major contextual lexical semantic tasks. To achieve this, we run probing baselines on masked input, and propose measures to calculate and visualize the degree of context or word biases in existing datasets. The analysis was performed on both models and humans. Our findings demonstrate that models are usually not being tested for word-in-context semantics in the same way as humans are in these tasks, which helps us better understand the model-human gap. Specifically, to PCMs, most existing…
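The abstract mentions probing baselines on masked input. As a rough illustration only, the sketch below builds two masked variants of a tokenized sentence: one that hides the target word (context only) and one that hides everything except the target word (word only). The function names, the `[MASK]` placeholder, and the example sentence are assumptions made for this sketch. They are not taken from the paper, and the paper's actual masking procedure and bias measures may differ.

```python
from typing import List

# Assumed placeholder token; a real model's tokenizer defines its own mask token.
MASK = "[MASK]"


def context_only(tokens: List[str], target_index: int, mask: str = MASK) -> List[str]:
    """Hide the target word and keep the surrounding context (hypothetical helper)."""
    return [mask if i == target_index else tok for i, tok in enumerate(tokens)]


def word_only(tokens: List[str], target_index: int, mask: str = MASK) -> List[str]:
    """Keep the target word and hide every context token (hypothetical helper)."""
    return [tok if i == target_index else mask for i, tok in enumerate(tokens)]


# Illustrative sentence (not from the paper); the target word is "bank" at index 4.
tokens = "She sat on the bank of the river".split()
target_index = 4

context_input = context_only(tokens, target_index)
word_input = word_only(tokens, target_index)

print(" ".join(context_input))
print(" ".join(word_input))
```

If a classifier trained on only one of these variants approaches its performance on the full input, that would suggest the dataset can be solved largely from that cue alone. This reading is an interpretation of the abstract, not a restatement of the paper's measures.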
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Attention Dropout · Linear Warmup With Linear Decay · Residual Connection · Dense Connections · Layer Normalization
