An automatic mixing speech enhancement system for multi-track audio

Xiaojing Liu; Hongwei Ai; Joshua D. Reiss

arXiv:2404.17821·cs.SD·October 22, 2024·1 cites

Open Access

TL;DR

This paper introduces an automatic speech enhancement system for multitrack audio that reduces auditory masking and improves clarity in multi-speaker scenarios, evaluated through perceptual quality metrics and listening tests.

Contribution

It presents a novel iterative harmony search-based method for applying audio effects to minimize masking, outperforming existing auto-mixing systems and rivaling professional mixes.

Findings

01 System effectively reduces auditory masking.

02 Outperforms existing auto-mixing systems in listening tests.

03 Achieves comparable quality to professional sound engineers' mixes.

Abstract

We propose a speech enhancement system for multitrack audio. The system minimizes auditory masking while allowing one to hear multiple simultaneous speakers. It can be used in multiple communication scenarios, e.g., teleconferencing, in-game voice chat, and live streaming. The ITU-R BS.1387 Perceptual Evaluation of Audio Quality (PEAQ) model is used to evaluate the amount of masking in the audio signals. Different audio effects, e.g., level balance, equalization, dynamic range compression, and spatialization, are applied via an iterative harmony search algorithm that aims to minimize the masking. In the subjective listening test, the designed system can compete with mixes by professional sound engineers and outperforms mixes by existing auto-mixing systems.
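To make the optimization loop in the abstract concrete, here is a minimal sketch of a generic harmony search, written under several assumptions that are not from the paper: only per-track gain is optimized (no equalization, compression, or spatialization), the cost `toy_masking_cost` is a made-up stand-in rather than the PEAQ-based masking measure, and the band levels, bounds, and search parameters (`hmcr`, `par`, `bandwidth`) are illustrative values. Seeding the memory with the unprocessed mix via `initial` is also a choice of this sketch.

```python
import random

# Hypothetical per-band levels (dB) for three speech tracks; made-up numbers.
TRACK_BANDS_DB = [
    [60.0, 58.0, 55.0],  # track 0 dominates every band
    [50.0, 52.0, 45.0],
    [45.0, 50.0, 52.0],
]


def toy_masking_cost(gains_db):
    """Stand-in masking proxy (NOT PEAQ): for every track and band, add the
    number of dB by which the loudest other track exceeds that track."""
    n = len(TRACK_BANDS_DB)
    total = 0.0
    for i, bands in enumerate(TRACK_BANDS_DB):
        for b, level in enumerate(bands):
            own = level + gains_db[i]
            loudest_other = max(
                TRACK_BANDS_DB[j][b] + gains_db[j] for j in range(n) if j != i
            )
            total += max(0.0, loudest_other - own)
    return total


def harmony_search(cost, bounds, initial=None, memory_size=10, hmcr=0.9,
                   par=0.3, bandwidth=0.5, iterations=2000, seed=0):
    """Generic harmony search: returns (best_vector, best_cost)."""
    rng = random.Random(seed)
    # Harmony memory: a pool of candidate parameter vectors.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(memory_size)]
    if initial is not None:
        # Assumption of this sketch: keep the unprocessed mix as a candidate.
        memory[0] = [min(hi, max(lo, v)) for v, (lo, hi) in zip(initial, bounds)]
    costs = [cost(h) for h in memory]

    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for this dimension.
                value = rng.choice(memory)[d]
                if rng.random() < par:
                    # Pitch adjustment: small random nudge.
                    value += rng.uniform(-bandwidth, bandwidth)
            else:
                # Random selection from the allowed range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))
        new_cost = cost(new)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(memory_size), key=costs.__getitem__)
        if new_cost < costs[worst]:
            memory[worst], costs[worst] = new, new_cost

    best = min(range(memory_size), key=costs.__getitem__)
    return memory[best], costs[best]


gain_bounds = [(-12.0, 12.0)] * 3  # illustrative gain range in dB
best_gains, best_cost = harmony_search(
    toy_masking_cost, gain_bounds, initial=[0.0, 0.0, 0.0]
)
print("gains (dB):", [round(g, 2) for g in best_gains])
print("cost:", round(best_cost, 2),
      "vs unprocessed:", toy_masking_cost([0.0, 0.0, 0.0]))
```

Because the unprocessed mix starts in the memory and a harmony is only ever replaced by a better one, the returned cost can never exceed the cost of the unprocessed mix. A real system along the lines of the abstract would presumably swap the toy cost for a perceptual masking estimate and extend the parameter vector to cover the other effects.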

Taxonomy

Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Advanced Adaptive Filtering Techniques