# Query-by-Example Search with Discriminative Neural Acoustic Word Embeddings

**Authors:** Shane Settle, Keith Levin, Herman Kamper, Karen Livescu

arXiv: 1706.03818 · 2017-06-14

## TL;DR

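This paper applies discriminative acoustic word embeddings, produced by a recurrent neural network, to query-by-example speech search. Embedding the query and the database segments and then running nearest-neighbor search yields substantial improvements in search performance and run-time efficiency over previous approaches, including earlier template-based embeddings.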

## Contribution

The authors use a recurrent neural network trained to optimize word discrimination to embed both the query and the database segments as fixed-dimensional vectors, then find matching segments by nearest-neighbor search over those vectors.

## Key findings

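- The RNN-based embeddings substantially improve query-by-example search performance over previous approaches.
- Embedding-based search also substantially improves run-time efficiency over previous approaches.
- Training the recurrent network to optimize word discrimination produces embeddings that outperform the earlier template-based acoustic word embeddings.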
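
For context on the run-time finding, the DTW baseline that the embeddings are compared against can be sketched as follows. This is a generic textbook DTW with cosine frame distances and length normalization, not the exact configuration used in the paper; the function name `dtw_cost` and the normalization by `n + m` are assumptions made for illustration.

```python
import numpy as np

def dtw_cost(x: np.ndarray, y: np.ndarray) -> float:
    """Length-normalized DTW alignment cost between two frame sequences.

    x has shape (n, d) and y has shape (m, d). Frames are assumed to be
    nonzero vectors and are compared with cosine distance (an assumption).
    """
    n, m = len(x), len(y)
    # Pairwise frame-level cosine distances, shape (n, m).
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    yn = y / np.linalg.norm(y, axis=1, keepdims=True)
    dist = 1.0 - xn @ yn.T
    # acc[i, j] holds the cheapest alignment cost of x[:i] against y[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = dist[i - 1, j - 1] + best_prev
    return float(acc[n, m] / (n + m))
```

Each comparison fills an `n × m` table, so its cost grows with the product of the two segment lengths. Comparing two fixed-dimensional embeddings instead costs a single vector distance, which is one plausible reading of where the reported efficiency gain comes from.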

## Abstract

Query-by-example search often uses dynamic time warping (DTW) for comparing queries and proposed matching segments. Recent work has shown that comparing speech segments by representing them as fixed-dimensional vectors (acoustic word embeddings) and measuring their vector distance (e.g., cosine distance) can discriminate between words more accurately than DTW-based approaches. We consider an approach to query-by-example search that embeds both the query and database segments according to a neural model, followed by nearest-neighbor search to find the matching segments. Earlier work on embedding-based query-by-example, using template-based acoustic word embeddings, achieved competitive performance. We find that our embeddings, based on recurrent neural networks trained to optimize word discrimination, achieve substantial improvements in performance and run-time efficiency over the previous approaches.
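
The pipeline described in the abstract (embed the query and the database segments, then run nearest-neighbor search under cosine distance) might be sketched roughly as below. The function `embed_stub` is a hypothetical placeholder that mean-pools frames; it stands in for the trained recurrent model, which is not reproduced here. The exhaustive search and the names `cosine_distances` and `search` are likewise assumptions for illustration.

```python
import numpy as np

def embed_stub(segment: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the trained embedding model.

    Maps a variable-length segment of shape (frames, d) to a fixed
    d-dimensional vector by mean-pooling. The paper uses an RNN instead.
    """
    return segment.mean(axis=0)

def cosine_distances(query_vec: np.ndarray, db_vecs: np.ndarray) -> np.ndarray:
    """Cosine distance from one query vector to every row of db_vecs."""
    q = query_vec / np.linalg.norm(query_vec)
    db = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    return 1.0 - db @ q

def search(query_segment: np.ndarray, db_segments: list, k: int = 3) -> list:
    """Return the k nearest database segments as (index, distance) pairs."""
    # In practice the database embeddings would be computed once and cached.
    db_vecs = np.stack([embed_stub(s) for s in db_segments])
    dists = cosine_distances(embed_stub(query_segment), db_vecs)
    order = np.argsort(dists)[:k]
    return [(int(i), float(dists[i])) for i in order]
```

A call such as `search(query, db_segments, k=5)` returns the five closest database segments, nearest first. Because the database embeddings can be precomputed, each query reduces to one embedding pass plus vector comparisons; an approximate nearest-neighbor index could replace the exhaustive `argsort` for large databases.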

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.03818/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1706.03818/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1706.03818/full.md

---
Source: https://tomesphere.com/paper/1706.03818