Multi-view Dimensionality Reduction for Dialect Identification of Arabic Broadcast Speech

Sameer Khurana; Ahmed Ali; Steve Renals

arXiv:1609.05650 · cs.CL · September 20, 2016 · 2 cites

Open Access

TL;DR

This paper introduces a multi-view dimensionality reduction method that uses Canonical Correlation Analysis (CCA) to combine phonotactic and acoustic features for improved Arabic dialect identification from broadcast speech.

Contribution

The paper presents a novel feature-space combination approach using CCA for dialect identification, which outperforms single-view methods and offers an alternative to model-based fusion.

Findings

01

CCA-based feature vectors outperform individual phonetic and acoustic features.

02

The combined approach improves dialect identification accuracy.

03

The method provides a viable alternative to model-based fusion techniques.

Abstract

In this work, we present a new Vector Space Model (VSM) of speech utterances for the task of spoken dialect identification (DID). DID systems are generally built using two sets of features extracted from speech utterances: acoustic and phonetic. These features are used to form vector representations of speech utterances that aim to encode information about the spoken dialects. The resulting Phonotactic and Acoustic VSMs are then used for DID. The aim of this paper is to construct a single VSM that encodes information about spoken dialects from both the Phonotactic and Acoustic VSMs. Given the two views of the data, we use a well-known multi-view dimensionality reduction technique, Canonical Correlation Analysis (CCA), to form a single vector representation for each speech utterance that encodes dialect specific discriminative…
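The idea of projecting two views into a shared space with CCA can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `fit_cca` and `project_views`, the ridge term `reg`, the synthetic data standing in for phonotactic and acoustic utterance vectors, and the choice to concatenate the two projections into one vector are all assumptions made for illustration.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-3):
    """Fit a regularized linear CCA and keep the top-k canonical directions.

    X, Y: (n_samples, dim_x) and (n_samples, dim_y) views of the same items.
    reg:  hypothetical ridge term that keeps the covariances invertible.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # (regularized) canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    Wx = Kx @ U[:, :k]
    Wy = Ky @ Vt.T[:, :k]
    return mx, my, Wx, Wy, s[:k]

def project_views(X, Y, mx, my, Wx, Wy):
    """Project both views and concatenate them into one vector per item."""
    return np.hstack([(X - mx) @ Wx, (Y - my) @ Wy])

# Synthetic stand-in data: two noisy views driven by a shared 2-D latent factor.
rng = np.random.default_rng(0)
n_utts = 200
Z = rng.normal(size=(n_utts, 2))
X_view = Z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(n_utts, 10))
Y_view = Z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(n_utts, 8))

mx, my, Wx, Wy, corrs = fit_cca(X_view, Y_view, k=2)
fused = project_views(X_view, Y_view, mx, my, Wx, Wy)
print(fused.shape, corrs)
```

In a setting like the one the abstract describes, `X_view` and `Y_view` would be replaced by the two utterance-level feature matrices, and `fused` would be passed to a downstream classifier. Concatenation is only one way to merge the projected views; summing them is another common option.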

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Phonetics and Phonology Research