XDLM: Cross-lingual Diffusion Language Model for Machine Translation

Linyao Chen; Aosong Feng; Boming Yang; Zihui Li

arXiv:2307.13560·cs.CL·August 1, 2023

TL;DR

XDLM introduces a novel cross-lingual diffusion model for machine translation, leveraging a new training objective and outperforming existing diffusion and Transformer models on multiple benchmarks.

Contribution

The paper presents XDLM, a novel cross-lingual diffusion model for machine translation, with a new pretraining objective, TLDM, and a fine-tuning stage that builds the translation system on the pretrained model.

Findings

01. Outperforms diffusion and Transformer baselines on benchmarks

02. Introduces TLDM training objective for cross-lingual mapping

03. Demonstrates effectiveness of diffusion models in machine translation

Abstract

Recently, diffusion models have excelled in image generation tasks and have also been applied to natural language processing (NLP) for controllable text generation. However, the application of diffusion models in a cross-lingual setting remains largely unexplored. Additionally, while pretraining with diffusion models has been studied within a single language, the potential of cross-lingual pretraining remains understudied. To address these gaps, we propose XDLM, a novel cross-lingual diffusion model for machine translation, consisting of pretraining and fine-tuning stages. In the pretraining stage, we propose TLDM, a new training objective for mastering the mapping between different languages; in the fine-tuning stage, we build the translation system on the pretrained model. We evaluate the results on several machine translation benchmarks and outperform both diffusion and Transformer…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Softmax · Dense Connections · Dropout · Byte Pair Encoding · Position-Wise Feed-Forward Layer · Residual Connection