Dictionary-Based Concept Mining: An Application for Turkish

Cem Rıfkı Aydın, Ali Erkan, Tunga Güngör, and Hidayet Takçı

arXiv:1401.2663 · cs.CL · January 14, 2014 · 1 citation

Open Access

TL;DR

This paper presents a dictionary-based concept mining method tailored for Turkish, an agglutinative language, demonstrating high success in extracting expressive concepts from diverse document corpora.

Contribution

It introduces a dictionary-based approach to concept mining in Turkish, drawing on dictionaries, which are rarely used for this task, in place of WordNet.

Findings

01 High success rate in concept extraction from Turkish documents

02 Effective use of dictionary relationships such as synonyms and hypernyms

03 Applicable across different corpora

Abstract

In this study, a dictionary-based method is used to extract expressive concepts from documents. Many studies have addressed concept mining in English, but the area is still immature for Turkish, an agglutinative language. We used a dictionary instead of WordNet, a lexical database that groups words into synsets and is widely used for concept extraction. Dictionaries are rarely used in concept mining, but because dictionary entries carry synonyms, hypernyms, hyponyms and other relationships in their meaning texts, the success rate for determining concepts has been high. The concept extraction method is applied to documents collected from different corpora.
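
To make the idea in the abstract concrete, here is a minimal Python sketch of how definition texts might be used to propose concepts. It is an illustration under stated assumptions, not the authors' algorithm: the toy dictionary, the function name `extract_concepts`, and the simple frequency scoring are all hypothetical, and the morphological analysis that Turkish text would need before lookup is omitted.

```python
from collections import Counter

# Hypothetical toy dictionary: headword -> content words of its definition.
# In a real dictionary the definition text often names a hypernym.
TOY_DICTIONARY = {
    "kedi": ["hayvan"],    # cat   -> animal
    "köpek": ["hayvan"],   # dog   -> animal
    "elma": ["meyve"],     # apple -> fruit
}


def extract_concepts(tokens, dictionary, top_n=1):
    """Count definition words of known tokens and return the most frequent ones.

    Assumes tokens are already reduced to dictionary headwords; for Turkish
    this would require a morphological analyzer, which is not shown here.
    """
    scores = Counter()
    for token in tokens:
        # Unknown tokens contribute nothing.
        for definition_word in dictionary.get(token, []):
            scores[definition_word] += 1
    return [word for word, _ in scores.most_common(top_n)]


document_tokens = ["kedi", "köpek", "elma"]
concepts = extract_concepts(document_tokens, TOY_DICTIONARY)
```

With this toy input, two of the three tokens are defined in terms of "hayvan" (animal), so it becomes the top candidate concept. A real system would also need to weight relationship types and handle word-sense ambiguity, which this sketch ignores.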

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Advanced Text Analysis Techniques