Entities as topic labels: Improving topic interpretability and evaluability combining Entity Linking and Labeled LDA

Federico Nanni; Pablo Ruiz Fabo

arXiv:1604.07809 · cs.CL · April 27, 2016

TL;DR

This paper introduces a method combining Entity Linking and Labeled LDA to produce more interpretable and evaluable topics by associating them with ontology-based labels, demonstrated on European Parliament data.

Contribution

It proposes a novel approach that links topics to ontology-derived labels, enhancing interpretability and evaluation of topic models.

Findings

01

Tying each topic directly to one clear-cut label makes topics easier to interpret than standard LDA topics.

02

Using an ontology as background knowledge limits label ambiguity.

03

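The method is illustrated on European Parliament data from the fifth mandate (1999-2004), identifying the most relevant topics addressed by each party.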

Abstract

To create a corpus exploration method whose topics are easier to interpret than those of standard LDA topic models, we propose combining two techniques: Entity Linking and Labeled LDA. Our method identifies a series of descriptive labels in an ontology for each document in a corpus, then generates a specific topic for each label. The direct relation between topics and labels makes interpretation easier, and using an ontology as background knowledge limits label ambiguity. Because our topics are described with a limited number of clear-cut labels, they promote interpretability, which may in turn help quantitative evaluation. We illustrate the potential of the approach by applying it to identify the most relevant topics addressed by each party in the European Parliament's fifth mandate (1999-2004).
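
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a toy dictionary lookup stands in for a real entity linker, and a bare-bones collapsed Gibbs sampler stands in for Labeled LDA. The names `TOY_ONTOLOGY`, `link_entities`, and `labeled_lda`, the hyperparameters, the catch-all `"Unlabeled"` label, and the example documents are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical surface-form -> ontology-label table. A real entity linker
# would disambiguate mentions in context against a full knowledge base.
TOY_ONTOLOGY = {
    "fishing": "Fishery",
    "fleet": "Fishery",
    "euro": "Monetary_policy",
    "inflation": "Monetary_policy",
}

def link_entities(tokens):
    """Step 1 (toy): return the sorted ontology labels mentioned in a document."""
    return sorted({TOY_ONTOLOGY[t] for t in tokens if t in TOY_ONTOLOGY})

def labeled_lda(docs, doc_labels, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Step 2 (sketch): collapsed Gibbs sampling where each token's topic
    is drawn only from the labels attached to its own document."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    n_dk = [defaultdict(int) for _ in docs]       # label counts per document
    n_kw = defaultdict(lambda: defaultdict(int))  # word counts per label
    n_k = defaultdict(int)                        # total tokens per label

    def update(d, k, w, delta):
        n_dk[d][k] += delta
        n_kw[k][w] += delta
        n_k[k] += delta

    # Random initialisation, restricted to each document's own labels.
    z = []
    for d, doc in enumerate(docs):
        z.append([rng.choice(doc_labels[d]) for _ in doc])
        for w, k in zip(doc, z[d]):
            update(d, k, w, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, z[d][i], w, -1)  # remove the token's current assignment
                weights = [
                    (n_dk[d][k] + alpha) * (n_kw[k][w] + beta)
                    / (n_k[k] + vocab_size * beta)
                    for k in doc_labels[d]
                ]
                z[d][i] = rng.choices(doc_labels[d], weights=weights)[0]
                update(d, z[d][i], w, +1)

    # One topic per label: keep only words with a positive count.
    return {k: {w: c for w, c in words.items() if c > 0}
            for k, words in n_kw.items()}

docs = [
    "fishing fleet quota vessel fishing".split(),
    "euro inflation bank rate euro".split(),
    "fishing quota euro subsidy".split(),
]
# Assumption: documents with no linked entity get a catch-all label.
labels = [link_entities(doc) or ["Unlabeled"] for doc in docs]
topics = labeled_lda(docs, labels)

for label, words in sorted(topics.items()):
    top = sorted(words, key=words.get, reverse=True)[:3]
    print(label, top)
```

Because each token can only be assigned to a label of its own document, every resulting topic carries an ontology label by construction, which is the property the abstract relies on for interpretability.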

Taxonomy

Topics: Computational and Text Analysis Methods · Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining

Methods: Latent Dirichlet Allocation