Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing

Pierre Beckmann; Mikolaj Kegler; Milos Cernak

arXiv:1910.09909·cs.CL·December 15, 2021

TL;DR

This paper introduces a word-level speech encoder trained for cross-task transfer learning, demonstrating its effectiveness across diverse speech processing tasks and outperforming or matching task-specific methods.

Contribution

The paper presents a novel pre-trained encoder for word-level speech representations that enables effective cross-task transfer learning in speech processing.

Findings

01 Pre-trained encoder improves performance across multiple speech tasks.

02 Simple application of the encoder often outperforms task-specific methods.

03 Representation transferability is validated across different datasets.

Abstract

Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. In recent years, unsupervised and self-supervised techniques for learning speech representations have been developed to advance automatic speech recognition. To date, most of these approaches are task-specific and designed for within-task transfer learning between different datasets or setups of a particular task. In contrast, learning task-independent representations of speech and applying transfer learning across tasks remain less common. Here, we introduce an encoder capturing word-level representations of speech for cross-task transfer learning. We demonstrate the application of the pre-trained encoder in four distinct speech and audio processing tasks: (i) speech enhancement, (ii) language identification, (iii) speech, noise, and music classification, and (iv) speaker identification. In…
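The general pattern the abstract describes, reusing one pre-trained encoder across different downstream tasks, can be sketched as a frozen encoder with a small trainable head on top. The code below is only an illustration under stated assumptions: the "encoder" is a fixed random projection standing in for a pre-trained model, the data are synthetic, and the names `encode`, `train_head`, `ENCODER_WEIGHTS`, and all dimensions are hypothetical. None of it is taken from the paper or its repositories.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper's actual feature and embedding dimensions
# are not stated in this summary.
FEATURE_DIM, EMBED_DIM, NUM_CLASSES = 40, 16, 3

# Stand-in for a pre-trained encoder: a fixed random projection plus tanh.
# This is NOT the paper's encoder, only a placeholder with frozen weights.
ENCODER_WEIGHTS = rng.normal(size=(FEATURE_DIM, EMBED_DIM)) / np.sqrt(FEATURE_DIM)


def encode(features):
    """Map (n, FEATURE_DIM) inputs to (n, EMBED_DIM) embeddings; never updated."""
    return np.tanh(features @ ENCODER_WEIGHTS)


def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def train_head(embeddings, labels, steps=300, lr=0.1):
    """Fit a linear softmax classifier on frozen embeddings by gradient descent."""
    n = len(labels)
    weights = np.zeros((EMBED_DIM, NUM_CLASSES))
    bias = np.zeros(NUM_CLASSES)
    one_hot = np.eye(NUM_CLASSES)[labels]
    losses = []
    for _ in range(steps):
        probs = softmax(embeddings @ weights + bias)
        losses.append(cross_entropy(probs, labels))
        grad_logits = (probs - one_hot) / n
        weights -= lr * embeddings.T @ grad_logits
        bias -= lr * grad_logits.sum(axis=0)
    return weights, bias, losses


# Synthetic stand-in for a downstream task (for example, a 3-way classifier).
class_means = rng.normal(size=(NUM_CLASSES, FEATURE_DIM))
labels = rng.integers(0, NUM_CLASSES, size=300)
features = class_means[labels] + 0.5 * rng.normal(size=(300, FEATURE_DIM))

encoder_before = ENCODER_WEIGHTS.copy()
embeddings = encode(features)
head_weights, head_bias, losses = train_head(embeddings, labels)
```

Only the head parameters are updated here; the encoder weights stay untouched, which is what allows the same embeddings to be reused for another task by training a different head.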
