Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning

Benjamin Minixhofer; Milan Gritta; Ignacio Iacobacci

arXiv:2105.03791·cs.CL·September 13, 2022

TL;DR

This paper proposes replacing the standard neural network classification head in NLI tasks with Gradient Boosted Decision Trees, demonstrating improved performance without extra neural network computation.

Contribution

It introduces FreeGBDT, a novel method for integrating GBDTs as classification heads during fine-tuning of language models for NLI tasks.

Findings

01 FreeGBDT improves NLI performance over traditional MLP heads

02 The method requires no additional neural network computation

03 Consistent gains observed across multiple NLI datasets

Abstract

Transfer learning has become the dominant paradigm for many natural language processing tasks. In addition to models being pretrained on large datasets, they can be further trained on intermediate (supervised) tasks that are similar to the target task. For small Natural Language Inference (NLI) datasets, language modelling is typically followed by pretraining on a large (labelled) NLI dataset before fine-tuning with each NLI subtask. In this work, we explore Gradient Boosted Decision Trees (GBDTs) as an alternative to the commonly used Multi-Layer Perceptron (MLP) classification head. GBDTs have desirable properties: they perform well on dense, numerical features and are effective where the ratio of the number of samples to the number of features is low. We then introduce FreeGBDT, a method of fitting a GBDT head on the features computed during fine-tuning to increase…
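The idea of fitting a GBDT head on features computed during fine-tuning can be illustrated with a toy sketch. This is not the authors' implementation: `toy_encoder`, `fit_stump`, `fit_gbdt` and `predict` are hypothetical names, the "encoder" is a stand-in function rather than a transformer, the task is binary rather than three-way NLI, and the booster is a hand-written squared-error ensemble of depth-1 trees where a real setup would presumably use a dedicated GBDT library. The sketch also assumes that features are cached at every epoch of the fine-tuning loop, which is our reading of "no additional neural network computation".

```python
# Toy sketch (pure stdlib): cache the head inputs that fine-tuning computes
# anyway, then fit a small gradient-boosted ensemble of stumps on them.
# All names here are hypothetical and chosen for illustration only.

def toy_encoder(x, epoch):
    """Stand-in for the transformer; features drift a little each epoch."""
    scale = 1.0 + 0.1 * epoch
    return [x[0] * scale, x[1]]

def fit_stump(X, residuals):
    """Return the (feature, threshold, left_value, right_value) split
    that minimises the squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def fit_gbdt(X, y, rounds=5, lr=0.5):
    """Squared-error gradient boosting: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, lm, rm))
        preds = [p + lr * (lm if row[j] <= thr else rm)
                 for row, p in zip(X, preds)]
    return base, lr, stumps

def predict(model, row):
    """Sum the stump outputs and threshold the score at 0.5."""
    base, lr, stumps = model
    score = base + sum(lr * (lm if row[j] <= thr else rm)
                       for j, thr, lm, rm in stumps)
    return int(score > 0.5)

# Synthetic two-feature data; the label depends only on the first feature.
inputs = [[(i - 9.5) / 10, (i * 7 % 20) / 20] for i in range(20)]
labels = [int(x[0] > 0) for x in inputs]

cache_X, cache_y = [], []
for epoch in range(3):
    for x, label in zip(inputs, labels):
        feats = toy_encoder(x, epoch)  # forward pass that fine-tuning runs anyway
        # ... the usual MLP-head loss and optimiser step would go here ...
        cache_X.append(feats)          # caching costs no extra forward pass
        cache_y.append(label)

# After fine-tuning, fit the tree ensemble on everything that was cached.
model = fit_gbdt(cache_X, cache_y)
accuracy = sum(predict(model, f) == t
               for f, t in zip(cache_X, cache_y)) / len(cache_y)
```

At inference time, under the same assumptions, the fitted ensemble would take the place of the MLP head: the encoder produces features for a new input and `predict(model, features)` returns the class.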

Code & Models

Repositories

huawei-noah/noah-research/tree/master/freegbdt
PyTorch · Official