Combining Denoising Autoencoders with Contrastive Learning to fine-tune Transformer Models

Alejo Lopez-Avila; Víctor Suárez-Paniagua

arXiv:2405.14437·cs.CL·May 24, 2024

Open Access 1 Repo

TL;DR

This paper introduces a three-phase fine-tuning method for Transformer models that combines Denoising Autoencoders, Contrastive Learning, and data augmentation to improve classification performance on NLP tasks.

Contribution

It proposes a novel three-phase approach integrating Denoising Autoencoders and Contrastive Learning with data augmentation for better transfer learning in NLP.

Findings

01. Enhanced classification accuracy on multiple datasets.

02. Handling of unbalanced datasets through a new data augmentation approach for Supervised Contrastive Learning.

03. Improved model adaptation by combining Denoising Autoencoders, Contrastive Learning, and fine-tuning.

Abstract

Recently, using large pretrained Transformer models for transfer learning tasks has evolved to the point where they have become one of the flagship trends in the Natural Language Processing (NLP) community, giving rise to various outlooks such as prompt-based, adapters or combinations with unsupervised approaches, among many others. This work proposes a 3 Phase technique to adjust a base model for a classification task. First, we adapt the model's signal to the data distribution by performing further training with a Denoising Autoencoder (DAE). Second, we adjust the representation space of the output to the corresponding classes by clustering through a Contrastive Learning (CL) method. In addition, we introduce a new data augmentation approach for Supervised Contrastive Learning to correct the unbalanced datasets. Third, we apply fine-tuning to delimit the predefined categories. These…
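The second phase described above pulls same-class representations together and pushes different-class representations apart. As a rough illustration only, the sketch below computes a common formulation of the supervised contrastive loss (the one popularized by Khosla et al.) over a small batch of embeddings. It is an assumption that this matches the paper's exact loss; the function names, the temperature value, and the plain-Python implementation are hypothetical and chosen for readability, and the paper's data augmentation step is not reproduced here.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive.

    Assumed formulation (not confirmed by the source): for anchor i with
    positives P(i), the loss is the average over p in P(i) of
    -log( exp(z_i.z_p / t) / sum_{a != i} exp(z_i.z_a / t) ).
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log of the softmax denominator.
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(supervised_contrastive_loss(points, [0, 0, 1, 1]))  # classes match clusters
    print(supervised_contrastive_loss(points, [0, 1, 0, 1]))  # classes cut across clusters
```

In the toy example, the first call labels nearby points as the same class and the second labels distant points as the same class, so the first loss should come out smaller. In practice the embeddings would come from the Transformer encoder and the loss would be computed with a tensor library so that gradients can flow back into the model.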

Code & Models

Repositories

vsuarezpaniagua/3-phase_finetuning (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Image and Signal Denoising Methods

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Label Smoothing · Adam · Absolute Position Encodings · Dropout