Curricular Transfer Learning for Sentence Encoded Tasks

Jader Martins Camboim de Sá; Matheus Ferraroni Sanches; Rafael Roque de Souza; Júlio Cesar dos Reis; Leandro Aparecido Villas

arXiv:2308.01849·cs.CL·August 4, 2023

Open Access

TL;DR

This paper introduces a curriculum-based transfer learning approach for NLP tasks, improving model adaptation across different data distributions, especially in conversational environments.

Contribution

It proposes a novel sequence of pre-training steps guided by data hacking and grammar analysis to enhance transfer learning in NLP.

Findings

01 Significant improvement on MultiWoZ task

02 Effective adaptation across distribution shifts

03 Outperforms existing pre-training methods

Abstract

Fine-tuning language models on a downstream task is the standard approach for many state-of-the-art methodologies in NLP. However, when the distribution drifts between the source task and the target task, e.g., in conversational environments, these gains tend to diminish. This article proposes a sequence of pre-training steps (a curriculum) guided by "data hacking" and grammar analysis that allows further gradual adaptation between pre-training distributions. In our experiments, our method yields a considerable improvement over other known pre-training approaches on the MultiWoZ task.
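The control flow of a pre-training curriculum, as the abstract describes it at a high level, might be sketched as follows. This is a minimal toy illustration, not the authors' implementation: the stage names, the `run_curriculum` and `toy_train_step` helpers, and the scalar "model" are all hypothetical, chosen only to show that each stage starts from the parameters the previous stage produced.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: the "model" is a single float parameter.
Model = float
Stage = Tuple[str, Sequence[float]]


def toy_train_step(model: Model, data: Sequence[float]) -> Model:
    """Move the parameter halfway toward the mean of the stage's data.

    This stands in for a full pre-training or fine-tuning run.
    """
    target = sum(data) / len(data)
    return model + 0.5 * (target - model)


def run_curriculum(
    model: Model,
    stages: List[Stage],
    train_step: Callable[[Model, Sequence[float]], Model],
) -> Tuple[Model, List[str]]:
    """Train on each stage in order; each stage initializes from the last."""
    order = []
    for name, data in stages:
        model = train_step(model, data)
        order.append(name)
    return model, order


# Assumed ordering: from the most general distribution to the target task.
stages: List[Stage] = [
    ("general_text", [0.0, 0.0]),
    ("dialogue_like_text", [4.0, 6.0]),
    ("target_task", [10.0, 10.0]),
]

final_model, order = run_curriculum(0.0, stages, toy_train_step)
print(order, final_model)
```

In a real setting, `train_step` would wrap a language-model training loop and each stage's data would be a corpus; the only point carried over here is the sequential hand-off of parameters between stages.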

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems