Looks can be Deceptive: Distinguishing Repetition Disfluency from Reduplication

Arif Ahmad; Mothika Gayathri Khyathi; Pushpak Bhattacharyya

arXiv:2407.08147·cs.CL·July 12, 2024

TL;DR

This study distinguishes reduplication from repetition disfluency in speech, introducing a new dataset (IndicRedRep) and evaluating transformer models that classify the two phenomena in Hindi, Telugu, and Marathi.

Contribution

The paper presents the first large-scale computational study of reduplication and repetition in speech, introduces IndicRedRep, a publicly available dataset of Hindi, Telugu, and Marathi text annotated at the word level, and evaluates transformer-based token classifiers on it.

Findings

01

The best models reached macro F1 scores of up to 85.62% in Hindi, 83.95% in Telugu, and 84.82% in Marathi.

02

The dataset enables detailed linguistic analysis of reduplication and repetition.

03

Transformer-based models effectively distinguish between the two phenomena.
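
The findings above are reported as macro F1, which averages per-class F1 so that rare classes count as much as frequent ones. The following is a minimal sketch of that metric for token-level labels. The label names `O`, `REDUP`, and `REP` and the example sequences are assumptions for illustration, not the paper's tag set or data.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over token-level labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it did not belong
            fn[g] += 1  # missed the true label g
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels: O = other, REDUP = reduplication, REP = repetition.
gold = ["O", "REDUP", "REDUP", "O", "REP", "REP"]
pred = ["O", "REDUP", "REDUP", "O", "REP", "O"]
print(round(macro_f1(gold, pred), 4))
```

Because every class contributes equally to the average, a model that ignores the rarer of the two phenomena is penalized even if most tokens are labeled correctly.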

Abstract

Reduplication and repetition, though similar in form, serve distinct linguistic purposes. Reduplication is a deliberate morphological process used to express grammatical, semantic, or pragmatic nuances, while repetition is often unintentional and indicative of disfluency. This paper presents the first large-scale study of reduplication and repetition in speech using computational linguistics. We introduce IndicRedRep, a new publicly available dataset containing Hindi, Telugu, and Marathi text annotated with reduplication and repetition at the word level. We evaluate transformer-based models for multi-class reduplication and repetition token classification, utilizing the Reparandum-Interregnum-Repair structure to distinguish between the two phenomena. Our models achieve macro F1 scores of up to 85.62% in Hindi, 83.95% in Telugu, and 84.82% in Marathi for reduplication-repetition…
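
To illustrate why the two phenomena are "similar in form", here is a minimal sketch of a surface-level matcher that flags runs of identical adjacent tokens. It is not the paper's method. The function name `adjacent_repeats` and the transliterated Hindi sentences are made-up examples, not drawn from IndicRedRep.

```python
def adjacent_repeats(tokens):
    """Return (start, end) index pairs for runs of identical adjacent tokens."""
    spans = []
    i = 0
    while i < len(tokens) - 1:
        j = i
        # Extend the run while the next token matches the first one.
        while j + 1 < len(tokens) and tokens[j + 1].lower() == tokens[i].lower():
            j += 1
        if j > i:
            spans.append((i, j))
        i = j + 1
    return spans


# Illustrative sentences (assumed examples, not dataset entries):
# "dhire dhire" is a reduplication meaning "slowly";
# "main main" is a repeated "I", as in a disfluent restart.
reduplication = "woh dhire dhire chal raha hai".split()
disfluency = "main main ghar ja raha hun".split()

print(adjacent_repeats(reduplication))
print(adjacent_repeats(disfluency))
```

Both sentences produce a flagged span, so string matching alone cannot tell a deliberate reduplication from an unintended repetition. That gap is what motivates a context-aware token classifier.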

Taxonomy

Topics: Face Recognition and Perception