Multitask Fine-Tuning and Generative Adversarial Learning for Improved Auxiliary Classification
Christopher Sun, Abishek Satish

TL;DR
This paper introduces a multitask BERT model fine-tuned for three NLP tasks (sentiment classification, paraphrase detection, and semantic textual similarity prediction) and applies generative adversarial learning to train a generator that mimics BERT embeddings for semi-supervised learning.
Contribution
It presents a novel multitask BERT architecture that combines layer sharing, a triplet architecture, custom sentence pair tokenization, loss pairing, and gradient surgery, and introduces AC-GAN-BERT, a GAN-based framework for semi-supervised learning with BERT embeddings.
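The gradient surgery step can be illustrated with a small sketch. It assumes the term refers to PCGrad-style projection, where a task gradient that conflicts with another task's gradient (negative dot product) has the conflicting component removed before the gradients are summed. The paper's exact variant is not described on this page, and the function names `dot` and `pcgrad` are hypothetical. Gradients are plain Python lists so the arithmetic can be checked by hand.

```python
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(task_grads, rng=None):
    """Combine per-task gradients with PCGrad-style projection (illustrative).

    For each task gradient g_i, visit the other task gradients in random
    order. Whenever g_i conflicts with g_j (negative dot product), subtract
    the projection of g_i onto g_j. The projected gradients are then summed.
    """
    rng = rng or random.Random(0)
    projected = []
    for i, grad in enumerate(task_grads):
        g_i = list(grad)
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = task_grads[j]
            overlap = dot(g_i, g_j)
            if overlap < 0:  # conflict: remove the component along g_j
                scale = overlap / dot(g_j, g_j)
                g_i = [a - scale * b for a, b in zip(g_i, g_j)]
        projected.append(g_i)
    # Sum the projected gradients coordinate by coordinate.
    return [sum(vals) for vals in zip(*projected)]


# Two conflicting toy gradients (dot product is -1).
merged = pcgrad([[1.0, 0.0], [-1.0, 1.0]])
```

With these two toy gradients, the merged update no longer has a negative dot product with either task's original gradient. In a real multitask setup the same projection would be applied to the flattened gradients of the shared BERT layers, once per task loss.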
Findings
Achieved 0.516 sentiment classification accuracy, 0.886 paraphrase detection accuracy, and 0.864 semantic textual similarity correlation on test data.
Successfully generated embeddings that correlate with class labels, avoiding mode collapse.
Validated the effectiveness of GAN-BERT for semi-supervised NLP tasks.
Abstract
In this study, we implement a novel BERT architecture for multitask fine-tuning on three downstream tasks: sentiment classification, paraphrase detection, and semantic textual similarity prediction. Our model, Multitask BERT, incorporates layer sharing and a triplet architecture, custom sentence pair tokenization, loss pairing, and gradient surgery. Such optimizations yield a 0.516 sentiment classification accuracy, 0.886 paraphrase detection accuracy, and 0.864 semantic textual similarity correlation on test data. We also apply generative adversarial learning to BERT, constructing a conditional generator model that maps from latent space to fake embeddings in the BERT embedding space. These fake embeddings are concatenated with real BERT embeddings and passed into a discriminator model for auxiliary classification. Using this framework, which we refer to as AC-GAN-BERT, we conduct…
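The auxiliary-classification setup described above can be sketched as a loss computation over a mixed batch of real and generated embeddings. This is a minimal illustration assuming the standard AC-GAN objective, in which the discriminator emits one real-versus-fake logit plus a vector of class logits per embedding. The paper's actual losses and network shapes are not given on this page, and `softplus`, `log_softmax`, and `ac_gan_losses` are hypothetical names.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]


def ac_gan_losses(source_logits, class_logits, is_real, labels):
    """Mean source loss and mean auxiliary class loss for a mixed batch.

    source_logits: one real-vs-fake logit per embedding (positive = real)
    class_logits:  one list of class logits per embedding
    is_real:       True for real BERT embeddings, False for generated ones
    labels:        class index (true label, or the label the generator
                   was conditioned on)
    """
    source_loss = 0.0
    class_loss = 0.0
    for s, c, real, y in zip(source_logits, class_logits, is_real, labels):
        # Binary cross-entropy on the logit:
        # -log(sigmoid(s)) for real, -log(1 - sigmoid(s)) for fake.
        source_loss += softplus(-s) if real else softplus(s)
        # Cross-entropy of the auxiliary classifier head.
        class_loss += -log_softmax(c)[y]
    n = len(labels)
    return source_loss / n, class_loss / n


# One real and one fake embedding, with an undecided discriminator.
src, cls = ac_gan_losses([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]],
                         [True, False], [0, 1])
```

Under the standard AC-GAN formulation, the discriminator minimizes the sum of both losses, while the generator is trained so that its fake embeddings are classified as real and assigned the class it was conditioned on.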
Taxonomy
Topics: Network Security and Intrusion Detection · Anomaly Detection Techniques and Applications
