Ensemble Learning-Based Approach for Improving Generalization Capability of Machine Reading Comprehension Systems

Razieh Baradaran; Hossein Amirkhani

arXiv:2107.00368·cs.CL·July 16, 2021

TL;DR

This paper explores ensemble learning techniques to enhance the generalization and out-of-distribution accuracy of Machine Reading Comprehension systems without retraining large models, demonstrating improved robustness across datasets.

Contribution

The paper applies ensemble methods to separately trained MRC models and analyzes their effectiveness and robustness in out-of-distribution scenarios, without retraining a large model.

Findings

01. Ensemble methods improve out-of-distribution accuracy.
02. Heterogeneous and hybrid ensembles are most effective.
03. Ensembles increase robustness against data shifts.

Abstract

Machine Reading Comprehension (MRC) is an active field in natural language processing, with many successful models developed in recent years. Despite their high in-distribution accuracy, these models suffer from two issues: high training cost and low out-of-distribution accuracy. Although some approaches have been proposed to tackle the generalization problem, their training costs are prohibitively high. In this paper, we investigate the effect of ensemble learning on the generalization of MRC systems without retraining a large model. After the base models, which have different structures, are trained separately on different datasets, they are ensembled using weighting and stacking approaches in probabilistic and non-probabilistic settings. Three configurations (heterogeneous, homogeneous, and hybrid) are investigated on eight datasets and six state-of-the-art models. We…
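To make the weighting idea in the abstract concrete, the following is a minimal sketch of weighted voting over the answers returned by several base models, in both a probabilistic and a non-probabilistic setting. It is not the authors' implementation: the function names `normalize` and `weighted_ensemble`, the input format (one answer string plus a confidence per model), the answer normalization, and the example weights are all assumptions made for illustration. Stacking, which would train a meta-model on the base-model outputs, is not shown.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so equivalent spans share one key (an assumed normalization)."""
    return " ".join(answer.lower().split())


def weighted_ensemble(predictions, weights, probabilistic=True):
    """Pick one answer from several base models by weighted voting.

    predictions: list of (answer_text, confidence) pairs, one per base model.
    weights: one non-negative weight per base model (hypothetical values,
        e.g. derived from validation accuracy).
    probabilistic: if True, each vote is weight * confidence;
        if False, each vote is the weight alone.
    """
    if len(predictions) != len(weights):
        raise ValueError("expected exactly one weight per base model")
    if not predictions:
        raise ValueError("at least one prediction is required")

    scores = defaultdict(float)
    for (answer, confidence), weight in zip(predictions, weights):
        vote = weight * confidence if probabilistic else weight
        scores[normalize(answer)] += vote

    # Return the normalized answer with the highest accumulated score.
    return max(scores, key=scores.get)


# Hypothetical outputs of three base models for the same question.
predictions = [("Paris", 0.3), ("Lyon", 0.9), ("paris", 0.2)]
weights = [0.4, 0.5, 0.3]

hard_vote = weighted_ensemble(predictions, weights, probabilistic=False)  # "paris"
soft_vote = weighted_ensemble(predictions, weights, probabilistic=True)   # "lyon"
```

In this toy example the two settings disagree: two low-confidence models outvote the third when only the weights count, while the single high-confidence model wins once confidences scale the votes.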
