# Experiments in Cuneiform Language Identification

**Authors:** Gustavo Henrique Paetzold, Marcos Zampieri

arXiv: 1904.12087 · 2019-04-30

## TL;DR

This paper describes the PZ team's system for identifying languages and dialects written in Cuneiform script: a meta-classifier over SVM models that scored 0.738 F1 and placed fourth of eight teams in the VarDial 2019 Cuneiform Language Identification (CLI) shared task.

## Contribution

The paper applies a meta-classifier trained on various SVM models to Cuneiform language identification, covering Sumerian and six dialects of Akkadian.
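As a rough illustration of what a meta-classifier over SVM models can look like, the sketch below stacks three linear SVMs with scikit-learn. Everything specific here is an assumption for illustration only: this summary does not say which features, SVM variants, or meta-learner the authors used, so the character n-gram features, the logistic-regression meta-learner, the `build_meta_classifier` helper, and the toy Latin-letter data are all hypothetical stand-ins.

```python
from sklearn.ensemble import StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def build_meta_classifier(cv=2):
    """Stack several SVMs; a meta-learner is trained on their outputs.

    Assumption: each base SVM sees character n-grams of one order.
    The paper's actual feature set is not given in this summary.
    """
    base_models = [
        (
            f"svm_char_{n}",
            make_pipeline(
                TfidfVectorizer(analyzer="char", ngram_range=(n, n)),
                LinearSVC(random_state=0),
            ),
        )
        for n in (1, 2, 3)
    ]
    # The meta-learner is fit on cross-validated decision scores
    # produced by the base SVMs (hypothetical choice of meta-learner).
    return StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=cv,
    )


# Toy placeholder data, not real Cuneiform text or real labels.
toy_texts = [
    "ab ba ab", "ba ba ab", "ab ab ba", "ba ab ba",
    "xy yx xy", "yx yx xy", "xy xy yx", "yx xy yx",
]
toy_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

model = build_meta_classifier(cv=2)
model.fit(toy_texts, toy_labels)
predictions = model.predict(["ba ab ab", "yx xy xy"])
```

Stacking keeps the base models independent and lets the meta-learner weigh their outputs, which is one common way to combine classifiers trained on different views of the same text.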

## Key findings

- Achieved a 0.738 F1 score in discriminating between the seven languages and dialects
- Ranked fourth among eight teams in the shared task
- Showed that the meta-classifier approach is effective for this task
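For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score over several classes can be computed. It assumes macro-averaging (per-class F1 averaged with equal weight); the summary above reports only "F1" and does not state the averaging scheme, so treat that choice, the `macro_f1` helper, and the toy labels as assumptions.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true class and was missed
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy placeholder labels, not results from the paper.
toy_gold = ["a", "a", "b", "b"]
toy_pred = ["a", "b", "b", "b"]
toy_score = macro_f1(toy_gold, toy_pred)
```

Macro-averaging gives each class the same weight regardless of how many examples it has, so a system cannot score well by handling only the largest classes.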

## Abstract

This paper presents methods to discriminate between languages and dialects written in Cuneiform script, one of the first writing systems in the world. We report the results obtained by the PZ team in the Cuneiform Language Identification (CLI) shared task organized within the scope of the VarDial Evaluation Campaign 2019. The task included two languages, Sumerian and Akkadian. The latter is divided into six dialects: Old Babylonian, Middle Babylonian peripheral, Standard Babylonian, Neo Babylonian, Late Babylonian, and Neo Assyrian. We approach the task using a meta-classifier trained on various SVM models and we show the effectiveness of the system for this task. Our submission achieved 0.738 F1 score in discriminating between the seven languages and dialects and it was ranked fourth in the competition among eight teams.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.12087/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1904.12087/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1904.12087/full.md

---
Source: https://tomesphere.com/paper/1904.12087