The Dark Side of the Language: Pre-trained Transformers in the DarkNet

Leonardo Ranaldi; Aria Nourbakhsh; Arianna Patrizi; Elena Sofia Ruzzetti; Dario Onorati; Francesca Fallucchi; Fabio Massimo Zanzotto

arXiv:2201.05613 · cs.CL · May 7, 2024

TL;DR

This paper tests pre-trained Transformers on previously unseen sentences from a DarkNet corpus. Syntactic and lexical neural networks without large-scale pre-training perform on par with fine-tuned Transformers, which reach their usual high results only after extreme domain adaptation.

Contribution

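It shows that pre-trained Transformers reach their standard high results on DarkNet data only after extreme domain adaptation, that is, retraining with the masked language model task on the whole novel corpus, and that syntactic and lexical neural networks perform on par with them without large pre-training datasets.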

Findings

01

Syntactic and lexical neural networks, which lack extensive pre-training, perform on par with pre-trained Transformers, even after the Transformers are fine-tuned.

02

Transformers reach their standard high results only after extreme domain adaptation: retraining with the masked language model task on the whole DarkNet corpus.

03

Huge pre-training corpora may give Transformers unexpected help, since they expose the models to many of the possible sentences.

Abstract

Pre-trained Transformers are challenging human performances in many NLP tasks. The massive datasets used for pre-training seem to be the key to their success on existing tasks. In this paper, we explore how a range of pre-trained Natural Language Understanding models perform on definitely unseen sentences provided by classification tasks over a DarkNet corpus. Surprisingly, results show that syntactic and lexical neural networks perform on par with pre-trained Transformers even after fine-tuning. Only after what we call extreme domain adaptation, that is, retraining with the masked language model task on all the novel corpus, pre-trained Transformers reach their standard high results. This suggests that huge pre-training corpora may give Transformers unexpected help since they are exposed to many of the possible sentences.
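The masked language model task mentioned in the abstract can be illustrated with a minimal sketch of how one training example is built from raw text. This is not the authors' code: the 15% selection rate and the 80/10/10 replacement split follow the common BERT-style recipe and are assumptions here, and the function name `mask_tokens`, the whitespace tokenizer, and the two sample sentences are hypothetical placeholders.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol (assumption, BERT-style)


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Build one masked-language-model example from a token list.

    Returns (inputs, labels). labels[i] is the original token at positions
    selected for prediction and None everywhere else.
    """
    if rng is None:
        rng = random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            # Position selected: the model must recover the original token.
            labels.append(tok)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # 80%: replace with the mask symbol
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the token unchanged
        else:
            # Position not selected: input is unchanged, no prediction target.
            inputs.append(tok)
            labels.append(None)
    return inputs, labels


# Hypothetical stand-in for the novel corpus; real data would replace this.
corpus = [
    "this forum post uses rare domain vocabulary",
    "another sentence from the unseen corpus",
]
vocab = sorted({word for sentence in corpus for word in sentence.split()})
rng = random.Random(0)  # fixed seed so the masking is reproducible
examples = [mask_tokens(sentence.split(), vocab, rng=rng) for sentence in corpus]
```

In the setting the abstract describes, examples like these would be generated from all sentences of the novel corpus and used to continue training the pre-trained model before it is fine-tuned on the classification tasks.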
