Effects of Word Embeddings on Neural Network-based Pitch Accent Detection

Sabrina Stehwien; Ngoc Thang Vu; Antje Schweitzer

arXiv:1805.05237 · cs.CL · June 8, 2018

TL;DR

This paper investigates how incorporating word embeddings into a neural network-based pitch accent detection system affects its performance, revealing improvements within the same corpus but challenges in generalizing across different datasets.

Contribution

The paper extends a convolutional neural network-based pitch accent detector with word embeddings and evaluates their effect in within-corpus and cross-corpus experiments on three English datasets.

Findings

01 Word embeddings can improve performance in within-corpus experiments.

02 They can make generalization to unseen corpora more difficult.

03 Their effect varies across the three English datasets examined.

Abstract

Pitch accent detection often makes use of both acoustic and lexical features based on the fact that pitch accents tend to correlate with certain words. In this paper, we extend a pitch accent detector that involves a convolutional neural network to include word embeddings, which are state-of-the-art vector representations of words. We examine the effect these features have on within-corpus and cross-corpus experiments on three English datasets. The results show that while word embeddings can improve the performance in corpus-dependent experiments, they also have the potential to make generalization to unseen data more challenging.
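
As a rough illustration of what combining acoustic and lexical features in such a detector can look like, the minimal sketch below convolves over per-frame acoustic features for one word, max-pools over time, concatenates the result with that word's embedding, and applies a softmax over two classes (accented, unaccented). The layer sizes, the single convolution layer, the random weights, and the names `conv1d_maxpool` and `accent_probabilities` are assumptions made for this example; the abstract does not specify the architecture, so this should not be read as the authors' model.

```python
import numpy as np

def conv1d_maxpool(frames, kernels):
    """Convolve along the time axis, apply ReLU, then max-pool over time.

    frames:  (T, F) acoustic features for one word (T frames, F features)
    kernels: (K, W, F) K filters spanning W frames each
    returns: (K,) one pooled activation per filter
    """
    num_kernels, width, _ = kernels.shape
    steps = frames.shape[0] - width + 1
    out = np.empty((num_kernels, steps))
    for t in range(steps):
        window = frames[t:t + width]  # (W, F) slice of the signal
        out[:, t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0).max(axis=1)

def accent_probabilities(frames, word_vec, kernels, weights, bias):
    """Concatenate pooled acoustic features with a word embedding, then softmax."""
    acoustic = conv1d_maxpool(frames, kernels)
    features = np.concatenate([acoustic, word_vec])  # acoustic + lexical
    logits = weights @ features + bias
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 6))      # 50 frames, 6 acoustic features
word_vec = rng.normal(size=20)         # 20-dimensional word embedding
kernels = rng.normal(size=(8, 5, 6))   # 8 filters, 5 frames wide
weights = rng.normal(size=(2, 8 + 20)) # 2 classes: accented / unaccented
bias = np.zeros(2)

probs = accent_probabilities(frames, word_vec, kernels, weights, bias)
```

Setting `word_vec` to an empty array (and shrinking `weights` accordingly) gives an acoustic-only baseline in the same sketch, which is the kind of comparison the abstract describes.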
