SCALE: Synergized Collaboration of Asymmetric Language Translation Engines

Xin Cheng, Xun Wang, Tao Ge, Si-Qing Chen, Furu Wei, Dongyan Zhao, Rui Yan

arXiv:2309.17061 · cs.CL · October 2, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

SCALE is a collaborative framework that combines compact specialized translation models (STMs) with large language models (LLMs) to improve low-resource translation, outperforming both few-shot GPT-4 and NLLB without fine-tuning the LLM.

Contribution

The paper introduces SCALE, a novel method that synergizes specialized translation models with large language models, enhancing translation quality and flexibility in low-resource settings.

Findings

01

Outperforms both few-shot GPT-4 and the specialized NLLB model in challenging low-resource translation settings.

02

Achieves a consistent 4-point BLEURT improvement in Xhosa-to-English translation without tuning the LLM.

03

Uses an English-centric STM effectively as a pivot for multiple language pairs.

Abstract

In this paper, we introduce SCALE, a collaborative framework that connects compact Specialized Translation Models (STMs) and general-purpose Large Language Models (LLMs) as one unified translation engine. By introducing translation from STM into the triplet in-context demonstrations, SCALE unlocks refinement and pivoting ability of LLM, thus mitigating language bias of LLM and parallel data bias of STM, enhancing LLM speciality without sacrificing generality, and facilitating continual learning without expensive LLM fine-tuning. Our comprehensive experiments show that SCALE significantly outperforms both few-shot LLMs (GPT-4) and specialized models (NLLB) in challenging low-resource settings. Moreover, in Xhosa to English translation, SCALE experiences consistent improvement by a 4 BLEURT score without tuning LLM and surpasses few-shot GPT-4 by 2.5 COMET score and 3.8 BLEURT score when…
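The triplet in-context demonstrations mentioned in the abstract can be pictured with a short sketch. It assumes each demonstration pairs a source sentence with an STM draft and a reference translation, and that the test input is given with its STM draft while the target slot is left open for the LLM. The names `Triplet` and `build_scale_prompt`, the `Draft:` label, and the instruction wording are hypothetical illustrations, not the prompt format used by the authors; no STM or LLM is called here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Triplet:
    """One in-context demonstration: source, STM draft, reference target."""
    source: str
    stm_translation: str
    reference: str


def build_scale_prompt(demos: List[Triplet], source: str, stm_draft: str,
                       src_lang: str = "Xhosa", tgt_lang: str = "English") -> str:
    """Assemble a few-shot prompt; the template is an assumption, not the paper's."""
    header = (f"Translate from {src_lang} to {tgt_lang}. "
              "A draft from a specialized model is given; refine it.")
    blocks = [header]
    for d in demos:
        # Each demonstration shows all three fields, including the reference.
        blocks.append(
            f"{src_lang}: {d.source}\n"
            f"Draft: {d.stm_translation}\n"
            f"{tgt_lang}: {d.reference}"
        )
    # The test input ends with an empty target slot for the LLM to complete.
    blocks.append(
        f"{src_lang}: {source}\n"
        f"Draft: {stm_draft}\n"
        f"{tgt_lang}:"
    )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Placeholder strings stand in for real sentences and real STM output.
    demos = [Triplet("<source 1>", "<STM draft 1>", "<reference 1>")]
    print(build_scale_prompt(demos, "<test source>", "<STM draft for test source>"))
```

Under the same assumption, a pivoting variant would keep the English STM output in the draft slot and set `tgt_lang` to a third language, so the LLM translates with the English draft as an intermediate reference.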

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

hannibal046/scale
PyTorch · Official

Videos

No videos yet.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Linear Layer · Label Smoothing · Absolute Position Encodings · Adam · Residual Connection · Layer Normalization · Softmax