Domain Representative Keywords Selection: A Probabilistic Approach

Pritom Saha Akash; Jie Huang; Kevin Chen-Chuan Chang; Yunyao Li; Lucian Popa; ChengXiang Zhai

arXiv:2203.10365·cs.CL·June 7, 2022

TL;DR

This paper introduces a probabilistic method for selecting representative keywords that distinguish a target domain from a context domain, improving keyword summarization and trending keyword detection in NLP tasks.

Contribution

It presents a novel two-component mixture model and an efficient optimization algorithm for selecting distinctive, representative keywords with proven near-optimal approximation guarantees.

Findings

01 Outperforms baseline methods in keyword summary tasks

02 Effective in trending keywords selection across multiple domains

03 Demonstrates computational efficiency and high accuracy

Abstract

We propose a probabilistic approach to select a subset of target domain representative keywords from a candidate set, contrasting with a context domain. Such a task is crucial for many downstream tasks in natural language processing. To contrast the target domain and the context domain, we adapt the two-component mixture model concept to generate a distribution of candidate keywords. It gives more importance to the distinctive keywords of the target domain than to the keywords it shares with the context domain. To support the representativeness of the selected keywords towards the target domain, we introduce an optimization algorithm for selecting the subset from the generated candidate distribution. We show that the optimization algorithm can be efficiently implemented with a near-optimal approximation guarantee. Finally,…
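The contrast step described in the abstract can be illustrated with a generic two-component mixture fitted by EM. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mixture_em`, the mixing weight `lam`, the add-one smoothing, and the toy counts are all hypothetical. It assumes each keyword occurrence in the target domain comes either from a fixed context distribution or from an unknown target-specific distribution, and it estimates the latter. The subset-selection optimization is not shown.

```python
def mixture_em(target_counts, context_counts, lam=0.5, iters=50):
    """Estimate a target-specific keyword distribution (illustrative only).

    Assumed generative model for each keyword occurrence in the target domain:
        p(w) = lam * p_ctx(w) + (1 - lam) * p_topic(w)
    where p_ctx is fixed from the context domain and p_topic is learned by EM.
    """
    vocab = sorted(set(target_counts) | set(context_counts))
    ctx_total = sum(context_counts.values())
    # Add-one smoothing (an assumption) keeps p_ctx strictly positive.
    p_ctx = {w: (context_counts.get(w, 0) + 1) / (ctx_total + len(vocab))
             for w in vocab}
    # Initialise p_topic with the relative frequencies in the target domain.
    tgt_total = sum(target_counts.values())
    p_topic = {w: target_counts.get(w, 0) / tgt_total for w in vocab}

    for _ in range(iters):
        # E-step: probability that an occurrence of w came from the topic component.
        z = {w: (1 - lam) * p_topic[w]
                / ((1 - lam) * p_topic[w] + lam * p_ctx[w])
             for w in vocab}
        # M-step: re-estimate p_topic from the counts attributed to the topic.
        weighted = {w: target_counts.get(w, 0) * z[w] for w in vocab}
        norm = sum(weighted.values())
        p_topic = {w: weighted[w] / norm for w in vocab}
    return p_topic


# Toy counts (made up for illustration): "model" is common in both domains,
# while "neural" appears only in the target domain.
target = {"neural": 10, "model": 10}
context = {"model": 100, "data": 100}
dist = mixture_em(target, context)
ranked = sorted(dist, key=dist.get, reverse=True)
print(ranked)
```

In this toy setup the keyword shared with the context domain is explained largely by the context component, so its probability under the learned distribution drops relative to the keyword that is distinctive to the target domain.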

Code & Models

Repositories

pritomsaha/keyword-selection
Official
