Toward Selectivity Based Keyword Extraction for Croatian News

Slobodan Beliga, Ana Meštrović, Sanda Martinčić-Ipšić

arXiv:1407.4723 · cs.CL · February 15, 2018 · 22 cites

TL;DR

This paper introduces a novel unsupervised network-based keyword extraction method for Croatian news, utilizing a new measure called node selectivity to identify keywords without linguistic knowledge.

Contribution

The paper proposes a new network measure, node selectivity, for keyword extraction that is purely statistical and structural, avoiding linguistic dependencies.

Findings

01. F1 score for keywords: 24.63%
02. F2 score for keywords: 21.19%
03. F1 score for word-tuples: 25.9%
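
The scores above compare extracted candidates against manually annotated keywords. This page does not state how F1 and F2 were computed, so the sketch below assumes the standard F-beta definition (beta = 1 weights precision and recall equally, beta = 2 weights recall more heavily) and exact-match comparison between sets. The function name `f_beta` and the toy keyword sets are hypothetical and are not taken from the paper.

```python
def f_beta(extracted, annotated, beta=1.0):
    """F-beta score of an extracted keyword set against an annotated set.

    Assumption: standard F-beta with exact string matching; the paper's
    own evaluation procedure may differ.
    """
    extracted, annotated = set(extracted), set(annotated)
    true_positives = len(extracted & annotated)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)
    recall = true_positives / len(annotated)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy data: 2 of 4 extracted candidates match the 3 annotated keywords.
extracted = {"vlada", "zakon", "danas", "godina"}
annotated = {"vlada", "zakon", "porez"}
f1 = f_beta(extracted, annotated, beta=1.0)
f2 = f_beta(extracted, annotated, beta=2.0)
```

Under this definition, an F2 value below F1 indicates that recall is lower than precision for the evaluated set.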

Abstract

This preliminary report on network-based keyword extraction for Croatian presents an unsupervised method for extracting keywords from a complex network. We build our approach on a new network measure, node selectivity, motivated by research on graph-based centrality approaches. Node selectivity is defined as the average weight distribution on the links of a single node. We extract nodes (keyword candidates) based on their selectivity values. Furthermore, we expand the extracted nodes to word-tuples ranked by the highest in/out selectivity values. Selectivity-based extraction does not require linguistic knowledge, as it is derived purely from the statistical and structural information encompassed in the source text, which is reflected in the structure of the network. The obtained sets are evaluated against manually annotated keywords: for the set of extracted keyword candidates average F1…
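
To make the selectivity measure concrete, here is a minimal sketch. It reads "average weight distribution on the links of a single node" as node strength (the sum of link weights) divided by node degree (the number of links), computed separately for incoming and outgoing links. That reading, the adjacency-based co-occurrence network, and the function names `build_cooccurrence_network` and `selectivity` are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def build_cooccurrence_network(tokens):
    """Directed weighted network: edge (u, v) counts how often v directly follows u.

    Assumption: links connect adjacent words only.
    """
    weights = defaultdict(int)
    for u, v in zip(tokens, tokens[1:]):
        weights[(u, v)] += 1
    return dict(weights)


def selectivity(weights):
    """Return (in_selectivity, out_selectivity) dicts keyed by node.

    Assumption: selectivity = strength / degree on the respective link direction.
    """
    in_strength, in_degree = defaultdict(int), defaultdict(int)
    out_strength, out_degree = defaultdict(int), defaultdict(int)
    for (u, v), w in weights.items():
        out_strength[u] += w
        out_degree[u] += 1
        in_strength[v] += w
        in_degree[v] += 1
    in_sel = {n: in_strength[n] / in_degree[n] for n in in_degree}
    out_sel = {n: out_strength[n] / out_degree[n] for n in out_degree}
    return in_sel, out_sel


# Toy token sequence; edges are a->b (weight 2), b->a (weight 2), a->c (weight 1).
tokens = "a b a b a c".split()
network = build_cooccurrence_network(tokens)
in_sel, out_sel = selectivity(network)

# Rank keyword candidates by out-selectivity, highest first.
ranked = sorted(out_sel, key=out_sel.get, reverse=True)
```

In this toy network, node `b` has a single outgoing link of weight 2 and so ranks above node `a`, whose outgoing weight of 3 is spread over two links. Nodes with a high average link weight are the keyword candidates under this reading.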

Taxonomy

Topics: Advanced Text Analysis Techniques · Web Data Mining and Analysis · Service-Oriented Architecture and Web Services