# A Comparative Study on Vocabulary Reduction for Phrase Table Smoothing

**Authors:** Yunsu Kim, Andreas Guta, Joern Wuebker, Hermann Ney

arXiv: 1901.01574 · 2019-01-08

## TL;DR

This paper systematically compares word-level vocabularies for phrase table smoothing by vocabulary reduction, showing that the choice of vocabulary does not significantly affect smoothing performance and that reduction is more effective for large-scale phrase tables.

## Contribution

The paper provides empirical evidence that vocabulary reduction smooths phrase translation models effectively, particularly for large-scale phrase tables, and that the result is not significantly affected by which vocabulary is used.

## Key findings

- Vocabulary choice does not significantly affect smoothing performance.
- Vocabulary reduction is more effective for large-scale phrase tables.
- Standard phrase translation models are extremely sparse.
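
To make the idea of vocabulary reduction concrete, here is a minimal sketch and not the authors' implementation. It assumes that phrase translation probabilities are estimated by relative frequency and that reduction means replacing words with labels from a smaller vocabulary. The toy phrase pairs, the `word_to_class` mapping, and the function names are all hypothetical; the summary does not specify the vocabularies or the estimation details used in the paper.

```python
from collections import Counter


def relative_frequency(pairs):
    """Estimate p(target | source) by relative frequency over phrase pairs."""
    joint = Counter(pairs)
    source_totals = Counter(src for src, _ in pairs)
    return {(src, tgt): n / source_totals[src] for (src, tgt), n in joint.items()}


def reduce_phrase(phrase, word_to_class):
    """Map each word to its label in the reduced vocabulary.

    Words without a label keep their surface form (an assumption of this sketch).
    """
    return tuple(word_to_class.get(word, word) for word in phrase)


# Hypothetical extracted phrase pairs (source phrase, target phrase).
pairs = [
    (("the", "house"), ("das", "haus")),
    (("the", "car"), ("das", "auto")),
    (("the", "car"), ("der", "wagen")),
]

# Hypothetical reduced vocabulary: several words share one label.
word_to_class = {
    "house": "NOUN", "car": "NOUN",
    "haus": "NOUN", "auto": "NOUN", "wagen": "NOUN",
}

# Word-level model: counts are spread over many distinct phrase pairs.
word_model = relative_frequency(pairs)

# Reduced model: phrase pairs that differ only in the mapped words pool their counts.
class_pairs = [
    (reduce_phrase(src, word_to_class), reduce_phrase(tgt, word_to_class))
    for src, tgt in pairs
]
class_model = relative_frequency(class_pairs)

for key, prob in sorted(class_model.items()):
    print(key, round(prob, 3))
```

In this toy case, three distinct word-level phrase pairs collapse into two reduced pairs, so each reduced estimate rests on more observations. This pooling is the smoothing effect that the findings above refer to.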

## Abstract

This work systematically analyzes the smoothing effect of vocabulary reduction for phrase translation models. We extensively compare various word-level vocabularies to show that the performance of smoothing is not significantly affected by the choice of vocabulary. This result provides empirical evidence that the standard phrase translation model is extremely sparse. Our experiments also reveal that vocabulary reduction is more effective for smoothing large-scale phrase tables.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.01574/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1901.01574/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/1901.01574/full.md

---
Source: https://tomesphere.com/paper/1901.01574