PyMarian: Fast Neural Machine Translation and Evaluation in Python
Thamme Gowda, Roman Grundkiewicz, Elijah Rippeth, Matt Post, Marcin Junczys-Dowmunt

TL;DR
PyMarian provides a Python interface to the Marian NMT toolkit, enabling fast neural machine translation and evaluation, including state-of-the-art metrics with significant speed improvements.
Contribution
The paper introduces a Python interface to Marian NMT that makes the toolkit easier to use from Python and speeds up the computation of advanced evaluation metrics such as COMET.
Findings
COMET metric computation speedup of up to 7.8×
Seamless integration with Python tools and environments
Availability of PyMarian on PyPI for easy installation
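To make the "up to 7.8×" figure concrete, here is a minimal sketch of how a speedup factor is usually computed: the wall-clock time of the baseline divided by the wall-clock time of the faster implementation. The function name `speedup_factor` and the timings below are hypothetical placeholders chosen for illustration; they are not measurements from the paper and this is not PyMarian's API.

```python
def speedup_factor(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    if baseline_seconds <= 0 or accelerated_seconds <= 0:
        raise ValueError("timings must be positive")
    return baseline_seconds / accelerated_seconds


# Placeholder timings for illustration only (assumed, not taken from the paper).
baseline_seconds = 78.0      # hypothetical: existing metric implementation
accelerated_seconds = 10.0   # hypothetical: same workload on a faster engine

factor = speedup_factor(baseline_seconds, accelerated_seconds)
print(f"speedup: {factor:.1f}x")
```

A factor above 1 means the second implementation is faster. "Up to" signals that the reported value is the best case across the configurations tested, so typical speedups may be lower.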
Abstract
The deep learning language of choice these days is Python; measured by factors such as available libraries and technical support, it is hard to beat. At the same time, software written in lower-level programming languages like C++ retains advantages in speed. We describe a Python interface to Marian NMT, a C++-based training and inference toolkit for sequence-to-sequence models, focusing on machine translation. This interface enables models trained with Marian to be connected to the rich, wide range of tools available in Python. A highlight of the interface is the ability to compute state-of-the-art COMET metrics from Python but using Marian's inference engine, with a speedup factor of up to 7.8× over the existing implementations. We also briefly spotlight a number of other integrations, including Jupyter notebooks, connection with prebuilt models, and a web app interface provided with…
Taxonomy
Topics: Natural Language Processing Techniques
