Exploring Semantic Capacity of Terms
Jie Huang, Zilong Wang, Kevin Chen-Chuan Chang, Wen-mei Hwu, Jinjun Xiong

TL;DR
This paper introduces a model to quantify the semantic capacity of terms based on their co-occurrence in large text corpora, aiding natural language processing tasks.
Contribution
It proposes a novel two-step model for evaluating the semantic capacity of terms, validated through extensive experiments in three fields.
Findings
The model effectively measures the semantic capacity of terms.
Experiments show alignment with human-level evaluations.
The model outperforms the baseline approaches it is compared against.
Abstract
We introduce and study semantic capacity of terms. For example, the semantic capacity of artificial intelligence is higher than that of linear regression since artificial intelligence possesses a broader meaning scope. Understanding semantic capacity of terms will help many downstream tasks in natural language processing. For this purpose, we propose a two-step model to investigate semantic capacity of terms, which takes a large text corpus as input and can evaluate semantic capacity of terms if the text corpus can provide enough co-occurrence information of terms. Extensive experiments in three fields demonstrate the effectiveness and rationality of our model compared with well-designed baselines and human-level evaluations.
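The abstract says the model depends on the corpus providing enough co-occurrence information about terms. As a rough illustration of what such information might look like, the sketch below counts how often pairs of terms appear in the same sentence. This is not the paper's two-step model: the function name `cooccurrence_counts`, the sentence-level window, the substring matching, and the toy corpus are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences, terms):
    """Count sentence-level co-occurrences for each unordered pair of terms.

    Hypothetical helper for illustration only; matching is a naive
    case-insensitive substring test, not a real term extractor.
    """
    counts = Counter()
    for sentence in sentences:
        text = sentence.lower()
        # Terms found in this sentence, sorted so each pair has one canonical order.
        present = sorted(t for t in terms if t in text)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


# Toy corpus (invented for this example, not data from the paper).
corpus = [
    "Linear regression is a basic method in machine learning.",
    "Machine learning is a branch of artificial intelligence.",
    "Artificial intelligence includes machine learning and planning.",
]
terms = ["artificial intelligence", "machine learning", "linear regression"]

counts = cooccurrence_counts(corpus, terms)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

In this toy corpus, linear regression never shares a sentence with artificial intelligence, which hints at why a term needs sufficient co-occurrence coverage before anything can be estimated about it.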
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
Methods: Linear Regression
