The futility of STILTs for the classification of lexical borrowings in Spanish

Javier de la Rosa

arXiv:2109.08607 · cs.CL · September 20, 2021 · 1 cites

Open Access

TL;DR

This study evaluates supplementary training on intermediate labeled-data tasks (STILTs) for token-level classification of lexical borrowings in Spanish and finds no improvement over direct fine-tuning of multilingual transformer models.

Contribution

The paper provides an empirical assessment of STILTs for lexical borrowing detection in Spanish, showing that they bring no improvement over direct fine-tuning on this task.

Findings

01

STILTs do not improve classification performance over direct fine-tuning.

02

Multilingual models trained on small language subsets outperform multilingual BERT.

03

Multilingual RoBERTa performs better than multilingual BERT for this task.

Abstract

The first edition of the IberLEF 2021 shared task on automatic detection of borrowings (ADoBo) focused on detecting lexical borrowings that appeared in the Spanish press and that have recently been imported into the Spanish language. In this work, we tested supplementary training on intermediate labeled-data tasks (STILTs), drawing on part-of-speech (POS) tagging, named entity recognition (NER), code-switching, and language identification, for the classification of borrowings at the token level using existing pre-trained transformer-based language models. Our extensive experimental results suggest that STILTs do not provide any improvement over direct fine-tuning of multilingual models. However, multilingual models trained on small subsets of languages perform reasonably better than multilingual BERT, though not as well as multilingual RoBERTa, on the given dataset.
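To make "classification of borrowings at the token level" concrete, the sketch below converts span annotations into one tag per token using the common BIO scheme. It is only an illustration: the abstract does not specify the tag inventory or data format, so the `BORROWING` label, the `spans_to_bio` helper, and the example sentence are all assumptions rather than the shared task's actual conventions.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert (start, end_exclusive, label) spans into per-token BIO tags.

    Hypothetical helper for illustration; the label names are assumptions,
    not the tag set used in the shared task.
    """
    tags = ["O"] * len(tokens)  # every token starts outside any borrowing
    for start, end, label in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        tags[start] = f"B-{label}"  # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining tokens of the span
    return tags


# Invented example sentence with two borrowings marked by token offsets.
tokens = ["El", "podcast", "habla", "de", "machine", "learning"]
spans = [(1, 2, "BORROWING"), (4, 6, "BORROWING")]

for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
    print(f"{token}\t{tag}")
```

Under this framing, a token classifier predicts one such tag per token, which is the same output shape as POS tagging or NER and is presumably why those tasks are natural candidates for intermediate training.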

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Second Language Acquisition and Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Residual Connection · Layer Normalization · WordPiece · Dense Connections · Weight Decay