Sockeye: A Toolkit for Neural Machine Translation
Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post

TL;DR
Sockeye is an open-source, scalable toolkit for neural machine translation supporting multiple architectures, offering competitive performance and extensive features for researchers and practitioners.
Contribution
Introduces Sockeye, a versatile NMT toolkit with support for various architectures, optimizations, and benchmarking, enhancing research and production workflows.
Findings
Sockeye achieves competitive BLEU scores on WMT benchmarks.
Sockeye's Transformer implementation achieves the best overall BLEU score in the comparison.
The toolkit is open-source and facilitates research and deployment.
Abstract
We describe Sockeye (version 1.12), an open-source sequence-to-sequence toolkit for Neural Machine Translation (NMT). Sockeye is a production-ready framework for training and applying models as well as an experimental platform for researchers. Written in Python and built on MXNet, the toolkit offers scalable training and inference for the three most prominent encoder-decoder architectures: attentional recurrent neural networks, self-attentional transformers, and fully convolutional networks. Sockeye also supports a wide range of optimizers, normalization and regularization techniques, and inference improvements from current NMT literature. Users can easily run standard training recipes, explore different model settings, and incorporate new ideas. In this paper, we highlight Sockeye's features and benchmark it against other NMT toolkits on two language arcs from the 2017 Conference on…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
