Stress Test Evaluation of Biomedical Word Embeddings

Vladimir Araujo; Andrés Carvallo; Carlos Aspillaga; Camilo Thorne; Denis Parra

arXiv:2107.11652·cs.CL·July 27, 2021

TL;DR

This paper systematically evaluates the robustness of biomedical word embeddings under stress scenarios such as spelling errors and synonym substitution. It exposes their vulnerabilities and shows that adversarial training improves robustness.

Contribution

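It introduces two stress test scenarios for biomedical named entity recognition (NER), one based on spelling errors and one on synonyms for medical terms, and demonstrates that adversarial training improves the models' robustness and performance.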

Findings

01. Models' performance drops significantly under stress scenarios.

02. Adversarial training improves robustness and can outperform original models.

03. Stress tests reveal specific weaknesses and strengths of biomedical embeddings.

Abstract

The success of pretrained word embeddings has motivated their use in the biomedical domain, with contextualized embeddings yielding remarkable results in several biomedical NLP tasks. However, there is a lack of research on quantifying their behavior under severe "stress" scenarios. In this work, we systematically evaluate three language models with adversarial examples -- automatically constructed tests that allow us to examine how robust the models are. We propose two types of stress scenarios focused on the biomedical named entity recognition (NER) task, one inspired by spelling errors and another based on the use of synonyms for medical terms. Our experiments with three benchmarks show that the performance of the original models decreases considerably, in addition to revealing their weaknesses and strengths. Finally, we show that adversarial training causes the models to improve…
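The two stress scenarios named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the official repository holds that); the function names, the adjacent-character swap used as the spelling error, and the one-entry synonym table are all assumptions made for illustration. The sketch perturbs only entity tokens in a BIO-tagged sentence and keeps the tags aligned with the new tokens.

```python
import random


def swap_adjacent(word, rng):
    """Swap two adjacent inner characters; words under 4 characters are unchanged."""
    if len(word) < 4:
        return word
    # i ranges over inner positions so the first and last characters stay fixed.
    i = rng.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def perturb_spelling(tokens, tags, seed=0):
    """Hypothetical spelling stress test: misspell every token tagged as an entity."""
    rng = random.Random(seed)
    return [swap_adjacent(tok, rng) if tag != "O" else tok
            for tok, tag in zip(tokens, tags)]


def substitute_synonyms(tokens, tags, synonyms):
    """Hypothetical synonym stress test: replace each entity mention found in
    `synonyms` (lowercased mention -> replacement string) and rebuild BIO tags."""
    out_tokens, out_tags = [], []
    i = 0
    while i < len(tokens):
        if tags[i].startswith("B-"):
            # Collect the full mention: one B- tag followed by any I- tags.
            j = i + 1
            while j < len(tokens) and tags[j].startswith("I-"):
                j += 1
            mention = " ".join(tokens[i:j])
            label = tags[i][2:]
            new = synonyms.get(mention.lower(), mention).split()
            out_tokens.extend(new)
            out_tags.extend(["B-" + label] + ["I-" + label] * (len(new) - 1))
            i = j
        else:
            out_tokens.append(tokens[i])
            out_tags.append(tags[i])
            i += 1
    return out_tokens, out_tags


# Toy sentence and synonym table (assumed data, not taken from the benchmarks).
tokens = ["Patient", "had", "a", "heart", "attack", "."]
tags = ["O", "O", "O", "B-Disease", "I-Disease", "O"]
toy_synonyms = {"heart attack": "myocardial infarction"}

misspelled = perturb_spelling(tokens, tags)
syn_tokens, syn_tags = substitute_synonyms(tokens, tags, toy_synonyms)
print(misspelled)
print(syn_tokens, syn_tags)
```

Evaluating a trained NER model on the perturbed sentences instead of the originals gives a stress-test score, and mixing perturbed sentences into the training data is one simple form of the adversarial training the abstract refers to.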

Code & Models

Repositories

ialab-puc/BioNLP-StressTest (official)
