Macedonian Speech Synthesis for Assistive Technology Applications

Bojan Sofronievski; Elena Velovska; Martin Velichkovski; Violeta Argirova; Tea Veljkovikj; Risto Chavdarov; Stefan Janev; Kristijan Lazarev; Toni Bachvarovski; Zoran Ivanovski; Dimitar Tashkovski; Branislav Gerazov

arXiv:2205.09198 · eess.AS · June 6, 2023

Open Access

TL;DR

This paper develops and compares parametric and deep learning speech synthesis models for Macedonian, targeting assistive communication tools deployed on low-resource edge devices, and finds that parametric synthesis is resource-efficient while performing comparably in listening tests.

Contribution

It introduces Macedonian speech synthesis models using both parametric and deep learning techniques, tailored for assistive technology applications in low-resource settings.

Findings

01. Parametric synthesis performs comparably to deep learning models in listening tests.

02. Parametric models require fewer resources and allow full control over speech parameters.

03. The study provides a new Macedonian speech corpus for TTS development.

Abstract

Speech technology is becoming ever more ubiquitous with the advance of speech-enabled devices and services. The use of speech synthesis in Augmentative and Alternative Communication tools has facilitated the inclusion of individuals with speech impediments, allowing them to communicate with their surroundings using speech. Although there are numerous speech synthesis systems for the most widely spoken world languages, there is still a limited offer for smaller languages. We propose and compare three models built using parametric and deep learning techniques for Macedonian, trained on a newly recorded corpus. We target low-resource edge deployment for Augmentative and Alternative Communication and assistive technologies, such as communication boards and screen readers. The listening test results show that parametric speech synthesis is as performant as the more advanced deep learning…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and dialogue systems