Analyzing Bagging Methods for Language Models

Pranab Islam; Shaan Khosla; Arthur Lok; Mudit Saxena

arXiv:2207.09099·cs.CL·July 20, 2022

TL;DR

This paper analyzes bagging for language models and finds that bagged ensembles are at best roughly equivalent to single models of similar total size, with variance reduction and minor performance gains in specific scenarios.

Contribution

The paper compares bagged ensembles against single language models of roughly equal final size (300M to 1.5B parameters) on natural language understanding tasks, showing comparable performance and noting specific advantages such as variance reduction.

Findings

01. Bagged ensembles are at best roughly equivalent in performance to single models of similar size.

02. Ensembling reduces variance in specific scenarios.

03. Minor performance improvements are observed in certain scenarios.

Abstract

Modern language models leverage increasingly large numbers of parameters to achieve strong performance on natural language understanding tasks. Ensembling these models in specific configurations for downstream tasks shows even further performance improvements. In this paper, we analyze bagging for language models and compare single language models to bagged ensembles that are roughly equivalent in final model size. We explore an array of model bagging configurations for natural language understanding tasks, with final ensemble sizes ranging from 300M to 1.5B parameters, and determine that our ensembling methods are at best roughly equivalent to single LM baselines. We also note other positive effects of bagging and pruning in specific scenarios in our experiments, such as variance reduction and minor performance improvements.
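For readers unfamiliar with bagging, the following is a minimal, generic sketch of the idea and not the authors' implementation. This listing does not describe their exact setup, so everything here is an assumption for illustration: the toy one-dimensional threshold classifier stands in for fine-tuning a language model, and the names `bootstrap_sample`, `train_threshold_model`, and `bagged_predict` are hypothetical. Each ensemble member is trained on a bootstrap resample of the training data, and predictions are combined by majority vote.

```python
import random
from collections import Counter


def bootstrap_sample(examples, rng):
    """Draw len(examples) items with replacement (one bootstrap replicate)."""
    return [rng.choice(examples) for _ in examples]


def train_threshold_model(sample):
    """Toy stand-in for fine-tuning a model (an assumption, not the paper's method).

    Fits a 1-D threshold halfway between the class means, assuming that
    label 1 tends to have larger feature values than label 0.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # A resample can miss a class entirely; fall back to a constant predictor.
        constant = 1 if pos else 0
        return lambda x: constant
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0


def bagged_predict(models, x):
    """Combine member predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature, label) pairs.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]

rng = random.Random(0)
# An odd number of members avoids tied votes in the binary case.
ensemble = [train_threshold_model(bootstrap_sample(data, rng)) for _ in range(5)]
prediction = bagged_predict(ensemble, 0.75)
```

Because each member sees a different resample, their individual errors are partly decorrelated, which is the usual explanation for the variance reduction the abstract mentions. In the paper's setting, the members would be language models whose combined parameter count roughly matches the single-model baseline.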

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Pruning