A Million Tweets Are Worth a Few Points: Tuning Transformers for Customer Service Tasks

Amir Hadifar; Sofie Labat; Véronique Hoste; Chris Develder; Thomas Demeester

arXiv:2104.07944·cs.CL·April 19, 2021

TL;DR

This paper investigates how to effectively adapt multilingual transformer models for customer service tasks on social media, demonstrating that in-domain pretraining improves performance, especially for non-English languages.

Contribution

It provides a comprehensive evaluation of pretraining and finetuning strategies for multilingual customer service NLP tasks using social media data.

Findings

01. In-domain pretraining improves task performance.

02. The gains from in-domain pretraining are larger in non-English settings.

03. Pretraining on social media data boosts downstream task accuracy.

Abstract

In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus containing customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, and applying them to 5 different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in…

Code & Models

Repositories

hadifar/customerservicetasks
Official
