Multi-Modal Citizen Science: From Disambiguation to Transcription of Classical Literature

Maryam Foradi; Jan Kaßel; Johannes Pein; Gregory R. Crane

arXiv:1909.12622 · cs.CL · September 30, 2019

Open Access

TL;DR

This paper explores multi-modal citizen science in Digital Humanities, focusing on adding audio annotations to classical Persian poetry to enhance engagement, comprehension, and language learning, with scalable quality assessment.

Contribution

The paper introduces audio annotation tasks at three difficulty levels for classical literature, enabling non-Persian speakers to contribute and allowing annotation quality to be assessed without ground-truth data.

Findings

01

Audio annotations enrich classical literature corpora.

02

Users with varying Persian proficiency can contribute effectively.

03

Difficulty levels help estimate annotation accuracy.

Abstract

The engagement of citizens in research projects, including Digital Humanities projects, has risen in prominence in recent years. This type of engagement not only leads to incidental learning among participants but also demonstrates the added value of corpus enrichment through the different types of annotations that users undertake, generating so-called smart texts. Our work focuses on the continuous task of adding new layers of annotation to Classical Literature. We aim to provide more extensive tools for readers of smart texts, enhancing their reading comprehension and at the same time empowering language learning by introducing intellectual tasks, i.e., linking, tagging, and disambiguation. The current study adds a new mode of annotation, audio annotations, to the extensively annotated corpus of poetry by the Persian poet Hafiz. By proposing tasks with three different difficulty levels, we…

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Topic Modeling · Music and Audio Processing