Efficient transfer learning for NLP with ELECTRA

François Mercier

arXiv:2104.02756 · cs.CL · April 9, 2021

TL;DR

This paper investigates whether ELECTRA can achieve near state-of-the-art NLP performance in low-resource settings with minimal computational cost, confirming its efficiency claims.

Contribution

The study evaluates ELECTRA's effectiveness in low-resource NLP tasks, providing empirical evidence of its efficiency and performance relative to computational budget.

Findings

01

ELECTRA achieves competitive performance with reduced compute.

02

ELECTRA outperforms some models in low-resource scenarios.

03

The results support the claim that ELECTRA is highly efficient relative to its compute budget.

Abstract

Clark et al. [2020] claim that the ELECTRA approach is highly efficient in terms of NLP performance relative to computation budget. This reproducibility study therefore focuses on that claim, summarized by the following question: can we use ELECTRA to achieve close to SOTA performance for NLP in low-resource settings, in terms of compute cost?

Code & Models

Repositories

cccwam/rc2020_electra
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Linear Layer · Attention Is All You Need · Weight Decay · WordPiece · Linear Warmup With Linear Decay · Residual Connection · Layer Normalization · Adam · Dropout