Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models

Daniil Moskovskiy; Daryna Dementieva; Alexander Panchenko

arXiv:2206.02252·cs.CL·June 7, 2022

TL;DR

This paper investigates whether large multilingual models can perform detoxification in a language without direct fine-tuning in that language. The models handle multilingual style transfer but fail at cross-lingual detoxification.

Contribution

It explores cross-lingual detoxification with large models and demonstrates the necessity of language-specific fine-tuning for effective detoxification.

Findings

01  Multilingual models can perform style transfer across languages.

02  Models struggle with cross-lingual detoxification without fine-tuning.

03  Fine-tuning on specific languages remains necessary for detoxification.

Abstract

Detoxification is the task of generating text in a polite style while preserving the meaning and fluency of the original toxic text. Existing detoxification methods are designed to work in a single language. This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting. Unlike previous work, we aim to make large language models able to perform detoxification without direct fine-tuning in the given language. Experiments show that multilingual models are capable of performing multilingual style transfer. However, the models are not able to perform cross-lingual detoxification, and direct fine-tuning on the target language is unavoidable.
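
The difference between the multilingual and cross-lingual settings named in the abstract can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the authors' pipeline: the corpus below is a hypothetical placeholder, the function names are invented for this example, and the sketch covers only which languages are seen during fine-tuning versus which are evaluated.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]          # (toxic source, polite target)
Example = Tuple[str, Pair]      # (language code, pair)

# Hypothetical placeholder corpus; these strings are NOT data from the paper.
CORPUS: Dict[str, List[Pair]] = {
    "en": [("<toxic en 1>", "<polite en 1>"), ("<toxic en 2>", "<polite en 2>")],
    "ru": [("<toxic ru 1>", "<polite ru 1>")],
}


def multilingual_setup(corpus: Dict[str, List[Pair]]) -> Tuple[List[Example], List[str]]:
    """Fine-tune on pairs from every language; evaluate on every language."""
    train = [(lang, pair) for lang, pairs in corpus.items() for pair in pairs]
    return train, sorted(corpus)


def cross_lingual_setup(
    corpus: Dict[str, List[Pair]], target_lang: str
) -> Tuple[List[Example], List[str]]:
    """Fine-tune on every language except target_lang; evaluate on target_lang only."""
    train = [
        (lang, pair)
        for lang, pairs in corpus.items()
        if lang != target_lang          # the target language is never seen in training
        for pair in pairs
    ]
    return train, [target_lang]


if __name__ == "__main__":
    ml_train, ml_eval = multilingual_setup(CORPUS)
    xl_train, xl_eval = cross_lingual_setup(CORPUS, "ru")
    print(len(ml_train), ml_eval)
    print(len(xl_train), xl_eval)
```

In this framing, the abstract's negative result concerns the second function: a model fine-tuned only on the non-target languages is then asked to detoxify text in the held-out language. A real experiment would also hold out separate test data per language, which this sketch leaves out.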

Code & Models

Repositories

skoltech-nlp/multilingual_detox
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis