# Crosstalk Suppression in a Multi-Channel, Multi-Speaker System Using Acoustic Vector Sensors

**Authors:** Grzegorz Szwoch

PMC · DOI: 10.3390/s25216731 · 2025-11-03

## TL;DR

This paper presents an algorithm that suppresses crosstalk in audio streams captured by acoustic vector sensors, so that recordings of multiple speakers in a reverberant room can be passed to a speech recognition system.

## Contribution

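A crosstalk suppression algorithm for acoustic vector sensor streams that combines direction-of-arrival-based speaker detection, source separation, per-channel speaker activity analysis, and a dynamic gain processor, intended as a front end for multi-speaker speech-to-text transcription.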

## Key findings

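- With the source separation stage, the algorithm increased SI-SDR by 7.54 dB over the unprocessed signal.
- Without the source separation stage, the SI-SDR increase over the unprocessed signal was 19.53 dB.
- Both figures come from an evaluation on a custom set of speech recordings.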
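
The SI-SDR improvements listed above are differences between the SI-SDR of the processed signal and that of the unprocessed signal, each measured against a clean reference. The sketch below uses the standard SI-SDR definition; the function names `si_sdr` and `si_sdr_improvement` are illustrative and are not taken from the paper, and the paper's exact evaluation procedure may differ.

```python
import math


def si_sdr(estimate, reference):
    """Scale-Invariant Signal-to-Distortion Ratio in dB (standard definition).

    Both arguments are equal-length sequences of samples. The reference must
    not be all zeros, and the estimate must not be an exact scaled copy of
    the reference (the ratio would be infinite).
    """
    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    # Optimal scaling of the reference, which makes the metric scale-invariant.
    alpha = dot / ref_energy
    target = [alpha * r for r in reference]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum((e - t) ** 2 for e, t in zip(estimate, target))
    return 10.0 * math.log10(target_energy / distortion_energy)


def si_sdr_improvement(processed, unprocessed, reference):
    """SI-SDR gain of the processed signal over the unprocessed one, in dB."""
    return si_sdr(processed, reference) - si_sdr(unprocessed, reference)


# Synthetic example: a direct-speech stand-in plus an orthogonal "crosstalk" tone.
N = 1000
speech = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]
crosstalk = [math.sin(2 * math.pi * 13 * n / N) for n in range(N)]
unprocessed = [s + 0.5 * c for s, c in zip(speech, crosstalk)]
processed = [s + 0.05 * c for s, c in zip(speech, crosstalk)]  # crosstalk cut by 20 dB
improvement = si_sdr_improvement(processed, unprocessed, speech)
```

Because the metric rescales the reference before comparing, multiplying the estimate by a constant does not change the result, which is why it suits signals that pass through a gain stage.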

## Abstract

Automatic speech recognition in a scenario with multiple speakers in a reverberant space, such as a small courtroom, often requires multiple sensors. This leads to a problem of crosstalk that must be removed before the speech-to-text transcription is performed. This paper presents an algorithm intended for application in multi-speaker scenarios requiring speech-to-text transcription, such as court sessions or conferences. The proposed method uses Acoustic Vector Sensors to acquire audio streams. Speaker detection is performed using statistical analysis of the direction of arrival. This information is then used to perform source separation. Next, speakers’ activity in each channel is analyzed, and signal fragments containing direct speech and crosstalk are identified. Crosstalk is then suppressed using a dynamic gain processor, and the resulting audio streams may be passed to a speech recognition system. The algorithm was evaluated using a custom set of speech recordings. An increase in SI-SDR (Scale-Invariant Signal-to-Distortion Ratio) over the unprocessed signal was achieved: 7.54 dB and 19.53 dB for the algorithm with and without the source separation stage, respectively.
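
Two of the stages named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the formulas, so the sketch assumes a common intensity-based azimuth estimate for an acoustic vector sensor (pressure `p` and two horizontal velocity components `vx`, `vy`, with a sign convention in which the intensity vector points toward the source) and a simple one-pole smoothed gain for the dynamic gain processor. The names `frame_azimuth_deg` and `suppress_crosstalk`, and the parameters `floor_db` and `smooth`, are hypothetical.

```python
import math


def frame_azimuth_deg(p, vx, vy):
    """Estimate the azimuth (0..360 degrees) of the dominant source in one frame.

    Assumption: azimuth is taken from the time-averaged intensity components
    Ix ~ sum(p * vx) and Iy ~ sum(p * vy); the paper's estimator may differ.
    """
    ix = sum(a * b for a, b in zip(p, vx))
    iy = sum(a * b for a, b in zip(p, vy))
    return math.degrees(math.atan2(iy, ix)) % 360.0


def suppress_crosstalk(samples, is_crosstalk, frame_len, floor_db=-30.0, smooth=0.9):
    """Attenuate frames flagged as crosstalk with a smoothed gain.

    samples      : mono channel samples
    is_crosstalk : one boolean per frame (True = crosstalk, False = direct speech)
    frame_len    : samples per frame
    floor_db     : gain applied to crosstalk once the smoothing has settled
    smooth       : one-pole coefficient in [0, 1); higher means slower gain changes
    """
    floor = 10.0 ** (floor_db / 20.0)
    gain = 1.0
    out = []
    for i, x in enumerate(samples):
        frame = min(i // frame_len, len(is_crosstalk) - 1)
        target = floor if is_crosstalk[frame] else 1.0
        # Move the gain gradually toward its target to avoid audible clicks.
        gain = smooth * gain + (1.0 - smooth) * target
        out.append(gain * x)
    return out


# Synthetic check: a source at 60 degrees, then a channel whose second frame is crosstalk.
s = [math.sin(0.1 * n) for n in range(256)]
theta = math.radians(60.0)
azimuth = frame_azimuth_deg(s, [x * math.cos(theta) for x in s], [x * math.sin(theta) for x in s])
cleaned = suppress_crosstalk([1.0] * 200, [False, True], frame_len=100)
```

In a fuller pipeline, the per-frame azimuths would be accumulated into a histogram to find speaker directions, and the `is_crosstalk` flags would come from the speaker activity analysis rather than being supplied by hand.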

## Full-text entities

- **Diseases:** hallucinations (MESH:D006212), Speech (MESH:D013064), injury to (MESH:D014947)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12609598/full.md

---
Source: https://tomesphere.com/paper/PMC12609598