Annotating Cognates and Etymological Origin in Turkic Languages

Benjamin S. Mericli; Michael Bloodgood

arXiv:1501.03191 · cs.CL · January 15, 2015 · 2 cites

Open Access

TL;DR

This paper introduces a methodology for annotating cognates and etymological origins in Turkic languages to facilitate automated translation lexicon induction, balancing annotation effort with research utility.

Contribution

The paper presents a novel annotation approach tailored to Turkic languages, addressing the challenge that their diverse etymological relationships pose for computational applications.

Findings

01 Proposed annotation methodology for Turkic languages

02 Balanced effort and utility in annotation process

03 Potential to improve automated translation lexicon induction

Abstract

Turkic languages exhibit extensive and diverse etymological relationships among lexical items. These relationships make the Turkic languages promising for exploring automated translation lexicon induction by leveraging cognate and other etymological information. However, due to the extent and diversity of the types of relationships between words, it is not clear how to annotate such information. In this paper, we present a methodology for annotating cognates and etymological origin in Turkic languages. Our method strives to balance the amount of research effort the annotator expends with the utility of the annotations for supporting research on improving automated translation lexicon induction.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Lexicography and Language Studies