Fast Top-k Area Topics Extraction with Knowledge Base

Fang Zhang; Xiaochen Wang; Jingfei Han; Jie Tang; Shiyin Wang; Marie-Francine Moens

arXiv:1710.04822·cs.AI·December 5, 2017

TL;DR

This paper introduces FastKATE, a method that extracts the top-k topics best representing a research area such as AI, using topic embeddings learned from a knowledge base (Wikipedia). On three real-world datasets it is reported to be effective and to return results in under 1 second.

Contribution

It formulates top-k area topic extraction as an optimization problem, proves that the problem is NP-hard, and proposes a fast heuristic algorithm with a provable error bound that leverages knowledge base embeddings.

Findings

01 Effective on three real-world datasets

02 Returns results in less than 1 second

03 Outperforms several alternative methods

Abstract

What are the most popular research topics in Artificial Intelligence (AI)? We formulate the problem as extracting top-$k$ topics that can best represent a given area with the help of a knowledge base. We theoretically prove that the problem is NP-hard and propose an optimization model, FastKATE, to address this problem by combining both explicit and latent representations for each topic. We leverage a large-scale knowledge base (Wikipedia) to generate topic embeddings using neural networks and use these representations to help capture the representativeness of topics for given areas. We develop a fast heuristic algorithm to efficiently solve the problem with a provable error bound. We evaluate the proposed model on three real-world datasets. Experimental results demonstrate our model's effectiveness, robustness, real-timeness (returns results in $<1$ s), and its superiority over…
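
To make the idea of picking $k$ topics that "best represent" an area more concrete, here is a minimal sketch of one possible greedy heuristic over topic embeddings. It is not the FastKATE algorithm, which the abstract does not spell out. The coverage objective (each topic is credited with its best cosine similarity to any chosen topic), the function names `cosine` and `greedy_top_k`, and the two-dimensional toy vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_top_k(embeddings, k):
    """Greedily pick k topics that cover the whole topic set well.

    Assumed objective (not taken from the paper): maximize the sum, over
    all topics, of the highest similarity to any chosen topic.
    """
    names = list(embeddings)
    sim = {a: {b: cosine(embeddings[a], embeddings[b]) for b in names}
           for a in names}
    chosen = []
    # best_cover[t] = how well topic t is already represented by `chosen`
    best_cover = {t: 0.0 for t in names}
    for _ in range(min(k, len(names))):
        best_name, best_gain = None, -1.0
        for cand in names:
            if cand in chosen:
                continue
            # Marginal gain: total improvement in coverage if cand is added
            gain = sum(max(sim[cand][t] - best_cover[t], 0.0) for t in names)
            if gain > best_gain:
                best_name, best_gain = cand, gain
        chosen.append(best_name)
        for t in names:
            best_cover[t] = max(best_cover[t], sim[best_name][t])
    return chosen


# Hypothetical toy embeddings: two loose clusters of topics.
toy_embeddings = {
    "machine learning": [1.0, 0.1],
    "deep learning": [0.9, 0.2],
    "neural network": [0.95, 0.15],
    "planning": [0.1, 1.0],
    "scheduling": [0.2, 0.9],
}

picked = greedy_top_k(toy_embeddings, 2)
print(picked)
```

With the toy vectors above, the second pick comes from the cluster not covered by the first, which is the behavior a representativeness objective is meant to encourage. Real topic embeddings would come from a trained model rather than hand-written vectors.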

Taxonomy

Topics: Web Data Mining and Analysis · Advanced Computational Techniques and Applications · Time Series Analysis and Forecasting