An Empirical Study on the Bugs Found while Reusing Pre-trained Natural Language Processing Models

Rangeet Pan; Sumon Biswas; Mohna Chakraborty; Breno Dantas Cruz; Hridesh Rajan

arXiv:2212.00105 · cs.SE · December 2, 2022 · 1 citation

Open Access

TL;DR

This study analyzes 984 bugs from 11 popular NLP pre-trained models to understand their types, causes, and impacts, revealing challenges like robustness issues, bias propagation, and resource consumption.

Contribution

It provides a comprehensive taxonomy of bugs in NLP model reuse, highlighting key issues and patterns to guide future bug reduction efforts.

Findings

01. Limited access to model internals affects robustness.

02. Input validation issues propagate bias.

03. High resource use leads to crashes.

Abstract

In NLP, reusing pre-trained models instead of training from scratch has gained popularity; however, NLP models are mostly black boxes, are very large, and often require significant resources. To ease this burden, models trained on large corpora are made available, and developers reuse them for different problems. In contrast, for traditional DL-related problems, developers mostly build their models from scratch. By doing so, they retain control over the choice of algorithms, data processing, model structure, hyperparameter tuning, etc. In NLP, however, the reuse of pre-trained models leaves developers with little to no control over such design decisions. They apply either tuning or transfer learning to pre-trained models to meet their requirements. Also, NLP models and their corresponding datasets are significantly larger than traditional DL models and require heavy computation.…

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques