Tiny-Sepformer: A Tiny Time-Domain Transformer Network for Speech Separation

Jian Luo; Jianzong Wang; Ning Cheng; Edward Xiao; Xulong Zhang; Jing Xiao

arXiv:2206.13689·cs.SD·July 1, 2022

TL;DR

Tiny-Sepformer is a compact Transformer-based speech separation model that significantly reduces parameters and memory usage while maintaining performance comparable to larger models.

Contribution

The paper introduces a novel Tiny-Sepformer architecture with convolution-attention blocks and parameter sharing to create a lightweight speech separation model.

Findings

01  Achieves comparable performance to larger models on WSJ0-2/3Mix datasets.

02  Reduces model size and memory consumption significantly.

03  Maintains separation quality with fewer parameters.

Abstract

Time-domain Transformer neural networks have proven their superiority in speech separation tasks. However, these models usually have a large number of parameters and therefore often exhaust GPU memory. In this paper, we propose Tiny-Sepformer, a tiny version of the Transformer network for speech separation. We present two techniques to reduce model parameters and memory consumption: (1) a Convolution-Attention (CA) block, which splits the vanilla Transformer into two paths, multi-head attention and 1D depthwise separable convolution; and (2) parameter sharing, which shares the layer parameters within the CA block. In our experiments, Tiny-Sepformer greatly reduces the model size and achieves separation performance comparable to the vanilla Sepformer on the WSJ0-2/3Mix datasets.
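The parameter savings behind the two techniques can be illustrated with simple counting. The sketch below is not the authors' implementation: the channel width, kernel size, and layer count are illustrative assumptions, and the helper names are hypothetical. It only compares a standard 1D convolution with a depthwise separable one, and a stack of independent layers with a stack that shares one set of layer parameters.

```python
# Back-of-the-envelope parameter counts (illustrative values, not from the paper).

def conv1d_params(c_in: int, c_out: int, kernel: int, bias: bool = True) -> int:
    """Parameters of a standard 1D convolution."""
    return c_in * c_out * kernel + (c_out if bias else 0)


def depthwise_separable_conv1d_params(
    c_in: int, c_out: int, kernel: int, bias: bool = True
) -> int:
    """Parameters of a depthwise conv (one filter per input channel)
    followed by a pointwise (kernel size 1) conv."""
    depthwise = c_in * kernel + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def stack_params(per_layer: int, num_layers: int, shared: bool) -> int:
    """Parameters of a stack of identical layers.
    With sharing, every layer reuses the same weights."""
    return per_layer if shared else per_layer * num_layers


# Assumed sizes, chosen only for illustration.
CHANNELS, KERNEL, LAYERS = 256, 3, 8

standard = conv1d_params(CHANNELS, CHANNELS, KERNEL)
separable = depthwise_separable_conv1d_params(CHANNELS, CHANNELS, KERNEL)
unshared_stack = stack_params(separable, LAYERS, shared=False)
shared_stack = stack_params(separable, LAYERS, shared=True)

print(f"standard conv:         {standard}")
print(f"depthwise separable:   {separable}")
print(f"{LAYERS} layers, no sharing:  {unshared_stack}")
print(f"{LAYERS} layers, shared:      {shared_stack}")
```

Under these assumed sizes the separable convolution needs roughly a third of the parameters of the standard one, and sharing makes the stack's parameter count independent of its depth. Sharing reduces stored weights only; the number of layer computations per forward pass is unchanged.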

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing