# Multilingual Bottleneck Features for Query by Example Spoken Term Detection

**Authors:** Dhananjay Ram, Lesly Miculicich, Hervé Bourlard

arXiv: 1907.00443 · 2019-07-02

## TL;DR

This paper studies monolingual and multilingual bottleneck features for query by example spoken term detection (QbE-STD) and shows that residual network (ResNet) based bottleneck features significantly outperform the corresponding feed forward network features on the challenging QUESST 2014 database.

## Contribution

The paper proposes residual network (ResNet) based bottleneck features for QbE-STD and compares them with monolingual and multilingual bottleneck features extracted from feed forward networks.

## Key findings

- Residual network features outperform feed forward network features.
- Multilingual features enhance detection performance.
- Significant improvements are observed on the QUESST 2014 database.

## Abstract

State of the art solutions to query by example spoken term detection (QbE-STD) usually rely on bottleneck feature representation of the query and audio document to perform dynamic time warping (DTW) based template matching. Here, we present a study on QbE-STD performance using several monolingual as well as multilingual bottleneck features extracted from feed forward networks. Then, we propose to employ residual networks (ResNet) to estimate the bottleneck features and show significant improvements over the corresponding feed forward network based features. The neural networks are trained on GlobalPhone corpus and QbE-STD experiments are performed on a very challenging QUESST 2014 database.
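
The DTW based template matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the cosine distance, the step pattern, and the normalization by query length are all assumptions made for illustration, and the inputs are taken to be frame-level feature matrices (one row per frame) such as bottleneck features.

```python
import numpy as np

def cosine_distance_matrix(query, doc):
    """Pairwise cosine distance between query frames (m x d) and document frames (n x d)."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    x = doc / np.linalg.norm(doc, axis=1, keepdims=True)
    return 1.0 - q @ x.T

def subsequence_dtw_score(query, doc):
    """Return (score, end_frame) for the best match of the query inside the document.

    Assumption: the score is the accumulated cost divided by the query length,
    a simplification of path-length normalization. Lower means a better match.
    """
    dist = cosine_distance_matrix(query, doc)
    m, n = dist.shape
    acc = np.full((m, n), np.inf)
    acc[0, :] = dist[0, :]  # the query may start at any document frame
    for i in range(1, m):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        for j in range(1, n):
            # Allowed steps: vertical, horizontal, diagonal
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i, j] + best_prev
    end = int(np.argmin(acc[m - 1, :]))  # the query may end at any document frame
    return acc[m - 1, end] / m, end

# Hypothetical usage: random vectors stand in for real bottleneck features.
rng = np.random.default_rng(0)
doc_feats = rng.normal(size=(20, 8))
query_feats = doc_feats[5:10].copy()  # the query occurs at frames 5..9
score, end_frame = subsequence_dtw_score(query_feats, doc_feats)
```

In a detection setting, a score like this would be compared against a threshold to decide whether the query occurs in the document; the quality of the frame-level features determines how well the distance matrix separates matching from non-matching regions.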

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.00443/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1907.00443/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1907.00443/full.md

---
Source: https://tomesphere.com/paper/1907.00443