Measuring Fine-Grained Domain Relevance of Terms: A Hierarchical Core-Fringe Approach

Jie Huang; Kevin Chen-Chuan Chang; Jinjun Xiong; Wen-mei Hwu

arXiv:2105.13255 · cs.CL · May 28, 2021

TL;DR

This paper introduces a hierarchical core-fringe approach to measure fine-grained domain relevance of terms, leveraging a semantic graph and semi-supervised learning to outperform baselines and human experts.

Contribution

It presents a novel hierarchical core-fringe learning method that accurately assesses term relevance across broad and narrow domains without extensive supervision.

Findings

01 Outperforms strong baseline methods.

02 Surpasses professional human performance.

03 Effective for both large and small domains.

Abstract

We propose to measure fine-grained domain relevance - the degree that a term is relevant to a broad (e.g., computer science) or narrow (e.g., deep learning) domain. Such measurement is crucial for many downstream tasks in natural language processing. To handle long-tail terms, we build a core-anchored semantic graph, which uses core terms with rich description information to bridge the vast remaining fringe terms semantically. To support a fine-grained domain without relying on a matching corpus for supervision, we develop hierarchical core-fringe learning, which learns core and fringe terms jointly in a semi-supervised manner contextualized in the hierarchy of the domain. To reduce expensive human efforts, we employ automatic annotation and hierarchical positive-unlabeled learning. Our approach applies to big or small domains, covers head or tail terms, and requires little human…
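The idea of letting well-described core terms carry information to sparsely described fringe terms over a semantic graph can be illustrated with a toy example. The sketch below is not the paper's model: it swaps in plain label propagation, and the function name `propagate_relevance`, the example terms, the edges, and the 0/1 labels are all hypothetical, chosen only to show how unlabeled fringe terms might inherit relevance scores from labeled core neighbors.

```python
def propagate_relevance(edges, core_labels, n_iters=50):
    """Spread relevance scores from labeled core terms to unlabeled fringe terms.

    edges: iterable of (term_a, term_b) pairs, treated as undirected links.
    core_labels: dict mapping core terms to a known score in [0, 1].
    Illustrative stand-in only (simple label propagation), not the paper's method.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Core terms start at their known label; fringe terms start undecided.
    scores = {term: core_labels.get(term, 0.5) for term in neighbors}

    for _ in range(n_iters):
        updated = {}
        for term, nbrs in neighbors.items():
            if term in core_labels:
                # Core terms stay clamped to their labels.
                updated[term] = core_labels[term]
            else:
                # Fringe terms take the mean score of their neighbors.
                updated[term] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores


# Hypothetical graph: four core terms (labeled) and three fringe terms (unlabeled).
edges = [
    ("deep learning", "neural network"),
    ("neural network", "dropout regularization"),
    ("organic chemistry", "benzene ring"),
    ("benzene ring", "aromatic substitution"),
    ("neural network", "cheminformatics model"),
    ("benzene ring", "cheminformatics model"),
]
core_labels = {
    "deep learning": 1.0,
    "neural network": 1.0,
    "organic chemistry": 0.0,
    "benzene ring": 0.0,
}

scores = propagate_relevance(edges, core_labels)
for term in sorted(scores):
    print(f"{term}: {scores[term]:.2f}")
```

In this toy graph, a fringe term linked only to relevant core terms ends up with a high score, one linked only to irrelevant core terms ends up low, and one linked to both lands in between, which is the graded behavior a fine-grained relevance measure aims for.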

Code & Models

Repositories

jeffhj/domain-relevance (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies