Modelling prosodic structure using Artificial Neural Networks

Jean-Philippe Bernardy; Charalambos Themistocleous

arXiv:1706.03952 · cs.CL · June 16, 2017

TL;DR

This paper compares an LSTM and a ConvNet for classifying Cypriot Greek utterances as questions or statements. The ConvNet outperforms the LSTM, reaching 95% classification accuracy despite the substantial variation in tonal contours.

Contribution

The paper introduces a neural-network classification model for the tonal patterns of Cypriot Greek questions and statements, and shows that the ConvNet outperforms the LSTM on this task.

Findings

01. ConvNet achieved 95% classification accuracy.

02. ConvNet outperformed LSTM in tonal classification.

03. Neural networks effectively model prosodic tonal variation.

Abstract

The ability to accurately perceive whether a speaker is asking a question or making a statement is crucial for any successful interaction. However, learning and classifying tonal patterns has been a challenging task for automatic speech recognition and for models of tonal representation, as tonal contours are characterized by significant variation. This paper provides a classification model of Cypriot Greek questions and statements. We evaluate two state-of-the-art network architectures: a Long Short-Term Memory (LSTM) network and a convolutional network (ConvNet). The ConvNet outperforms the LSTM in the classification task and exhibits excellent performance, with 95% classification accuracy.
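
To make the ConvNet idea concrete, here is a minimal sketch of how a single convolutional filter could score a pitch contour for a two-class decision. It is not the authors' architecture: the layer layout (convolution, ReLU, global max pooling, sigmoid output), the function names, the hand-set parameters, and the toy contours are all assumptions chosen for illustration, and a real model would learn its filters from data.

```python
import math

def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the core operation of a ConvNet layer."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(xs):
    """Element-wise rectifier: negative responses are set to zero."""
    return [max(0.0, x) for x in xs]

def sigmoid(x):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def classify(contour, kernel, weight, bias):
    """Return P(class 1) for one contour: conv -> ReLU -> global max pool -> sigmoid."""
    features = relu(conv1d(contour, kernel))
    pooled = max(features)  # strongest filter response anywhere in the contour
    return sigmoid(weight * pooled + bias)

# Hypothetical, hand-set parameters (a trained network would learn these).
KERNEL = [-1.0, 0.0, 1.0]  # responds to a local rise in the contour
WEIGHT, BIAS = 4.0, -2.0

# Toy normalized contours, invented for illustration only.
rising = [0.0, 0.1, 0.2, 0.5, 0.9, 1.0]
falling = [1.0, 0.9, 0.5, 0.2, 0.1, 0.0]

p_rising = classify(rising, KERNEL, WEIGHT, BIAS)
p_falling = classify(falling, KERNEL, WEIGHT, BIAS)
print(f"rising: {p_rising:.2f}, falling: {p_falling:.2f}")
```

Because the pooling step keeps only the strongest response, the filter fires wherever the rise occurs in the contour. That position tolerance is one plausible reason a convolutional model copes with variable tonal contours, though the sketch itself does not test that claim.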

Taxonomy

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory