3D-EX: A Unified Dataset of Definitions and Dictionary Examples
Fatemah Almeman, Hadi Sheikhi, Luis Espinosa-Anke

TL;DR
The paper introduces 3D-EX, a comprehensive dataset combining English definitions and examples to support NLP tasks, with pre-defined splits for evaluation and demonstrated utility in downstream applications.
Contribution
It presents a unified, well-structured dataset of definitions and examples, filling a resource gap and enabling consistent evaluation in NLP research.
Findings
Effective in downstream NLP tasks
Pre-computed train/validation/test splits
Potential to improve lexical semantics modeling
Abstract
Definitions are a fundamental building block in lexicography, linguistics and computational semantics. In NLP, they have been used for retrofitting word embeddings or augmenting contextual representations in language models. However, lexical resources containing definitions exhibit a wide range of properties, which has implications for the behaviour of models trained and evaluated on them. In this paper, we introduce 3D-EX, a dataset that aims to fill this gap by combining well-known English resources into one centralized knowledge repository in the form of <term, definition, example> triples. 3D-EX is a unified evaluation framework with carefully pre-computed train/validation/test splits to prevent memorization. We report experimental results suggesting that this dataset could be effectively leveraged in downstream NLP tasks. Code and data are available at…
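The abstract mentions <term, definition, example> triples and splits designed to prevent memorization, but it does not say how the splits were built. The Python sketch below shows one possible approach, assumed here for illustration only: assign each term to a split by hashing it, so that no term appears in more than one split. The names `Triple`, `assign_split`, `split_by_term` and the toy records are hypothetical and are not taken from the 3D-EX code or data.

```python
import hashlib
from collections import namedtuple

# Hypothetical record type mirroring the <term, definition, example> triple.
Triple = namedtuple("Triple", ["term", "definition", "example"])


def assign_split(term, val_pct=10, test_pct=10):
    """Map a term to a split deterministically by hashing its lowercased form.

    Assumption: splitting by term is one way to keep a model from seeing
    test terms during training; the paper may use a different procedure.
    """
    digest = hashlib.md5(term.lower().encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


def split_by_term(triples):
    """Group triples so that all entries for a given term share one split."""
    splits = {"train": [], "validation": [], "test": []}
    for triple in triples:
        splits[assign_split(triple.term)].append(triple)
    return splits


# Made-up toy data, not drawn from 3D-EX.
toy_triples = [
    Triple("bank", "a financial institution", "She deposited cash at the bank."),
    Triple("bank", "the land alongside a river", "They sat on the river bank."),
    Triple("Bank", "to tilt an aircraft in a turn", "The pilot banked left."),
    Triple("crane", "a large wading bird", "A crane stood in the marsh."),
    Triple("spring", "a coiled elastic device", "The spring snapped back."),
]

splits = split_by_term(toy_triples)
for name, items in splits.items():
    print(name, sorted({t.term.lower() for t in items}))
```

Hashing the term, rather than shuffling individual triples, keeps every sense and example of a word in the same split, at the cost of split sizes that only approximate the requested percentages.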
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
