# Speech Separation Using Gain-Adapted Factorial Hidden Markov Models

**Authors:** Martin H. Radfar, Richard M. Dansereau, Willy Wong

arXiv: 1901.07604 · 2019-01-24

## TL;DR

This paper introduces GFHMM, a gain-adapted factorial hidden Markov model for single-channel speech separation that handles unknown gain differences between the two speakers. On 180 mixtures with gain differences from 0 to 15 dB, it significantly outperforms both FHMM and vector quantization (VQ)-based separation.

## Contribution

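The paper proposes GFHMM, which extends FHMM to mixtures of the form $Y(t)=g_xX(t)+g_vV(t)$ with unknown gain factors $g_x$ and $g_v$. The model combines two independent-state HMMs, which capture spectral patterns, with a hidden node that captures the gain difference. Inference uses the Viterbi algorithm together with quadratic optimization.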

## Key findings

- Significantly outperforms FHMM and its memoryless counterpart, VQ-based SCSS, in the reported experiments.
- Evaluated on 180 mixtures with gain differences from 0 to 15 dB.
- Inference adds minimal computational overhead.

## Abstract

We present a new probabilistic graphical model which generalizes factorial hidden Markov models (FHMM) for the problem of single-channel speech separation (SCSS) in which we wish to separate the two speech signals $X(t)$ and $V(t)$ from a single recording of their mixture $Y(t)=X(t)+V(t)$ using the trained models of the speakers' speech signals. Current techniques assume the data used in the training and test phases of the separation model have the same loudness. In this paper, we introduce GFHMM, gain-adapted FHMM, to extend SCSS to the general case in which $Y(t)=g_xX(t)+g_vV(t)$, where $g_x$ and $g_v$ are unknown gain factors. GFHMM consists of two independent-state HMMs and a hidden node which model spectral patterns and gain difference, respectively. A novel inference method is presented using the Viterbi algorithm and quadratic optimization with minimal computational overhead. Experimental results, conducted on 180 mixtures with gain differences from 0 to 15 dB, show that the proposed technique significantly outperforms FHMM and its memoryless counterpart, i.e., vector quantization (VQ)-based SCSS.
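To make the mixing model $Y(t)=g_xX(t)+g_vV(t)$ concrete, the following is a minimal sketch, not the paper's inference method. It assumes the two source signals are already known and recovers the gains by ordinary least squares, which is one simple instance of a quadratic optimization. The function names `mix`, `estimate_gains`, and `gain_difference_db` are hypothetical, and reading the gain difference as $20\log_{10}(g_x/g_v)$ (an amplitude ratio in dB) is an assumption. In GFHMM the sources are not known; they are inferred from the trained speaker models.

```python
import math
import random


def mix(x, v, g_x, g_v):
    """Form the mixture y[t] = g_x * x[t] + g_v * v[t]."""
    return [g_x * a + g_v * b for a, b in zip(x, v)]


def estimate_gains(y, x, v):
    """Least-squares estimate of (g_x, g_v) for known sources x and v.

    Minimizes sum_t (y[t] - g_x*x[t] - g_v*v[t])**2 by solving the
    2x2 normal equations in closed form. Illustrative assumption only:
    the paper's inference works without access to x and v.
    """
    sxx = sum(a * a for a in x)
    svv = sum(b * b for b in v)
    sxv = sum(a * b for a, b in zip(x, v))
    sxy = sum(a * c for a, c in zip(x, y))
    svy = sum(b * c for b, c in zip(v, y))
    det = sxx * svv - sxv * sxv
    if abs(det) < 1e-12:
        raise ValueError("sources are (nearly) collinear; gains not identifiable")
    g_x = (sxy * svv - svy * sxv) / det
    g_v = (svy * sxx - sxy * sxv) / det
    return g_x, g_v


def gain_difference_db(g_x, g_v):
    """Gain difference in dB, assuming an amplitude-ratio definition."""
    return 20.0 * math.log10(abs(g_x) / abs(g_v))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in "speech" signals: white noise, used purely for illustration.
    x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    v = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    true_g_x, true_g_v = 1.0, 10 ** (-15.0 / 20.0)  # 15 dB apart
    y = mix(x, v, true_g_x, true_g_v)
    est_g_x, est_g_v = estimate_gains(y, x, v)
    print(est_g_x, est_g_v, gain_difference_db(est_g_x, est_g_v))
```

Because the fit is linear in the gains, the solution is available in closed form. That is consistent with the abstract's note that the quadratic optimization step adds little computational overhead, although the paper's actual objective may differ from this one.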

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.07604/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1901.07604/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/1901.07604/full.md

---
Source: https://tomesphere.com/paper/1901.07604