A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models
Xingmeng Zhao, Ali Niazi, Anthony Rios

TL;DR
This study measures gender bias in chemical NER models. It finds performance disparities between female- and male-associated data, frequent misclassification of female-related names as chemicals, and missed contraceptive mentions, which points to the need for bias mitigation.
Contribution
The paper introduces a framework for measuring gender bias in chemical NER models, using synthetic data and a newly annotated Reddit corpus with self-identified gender information.
Findings
Female-related names are often misclassified as chemicals
NER models show performance disparities between female- and male-associated data
Many systems fail to detect contraceptive mentions such as birth control
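
The first two findings can be made concrete with a small sketch. It is not the authors' framework: the tagger, name lists, and template below are hypothetical stand-ins, chosen only to show how a name-level false-positive rate and a per-group F1 score might be computed. The toy lexicon includes "Yasmin" and "Allegra" because both are real drug brand names that are also given names.

```python
# Illustrative sketch only. toy_tagger, FEMALE_NAMES, MALE_NAMES and TEMPLATE
# are hypothetical; a real study would plug in an actual chemical NER model.

def toy_tagger(sentence):
    """Stand-in for a chemical NER model: returns tokens found in a tiny lexicon."""
    lexicon = {"yasmin", "allegra", "aspirin"}
    tokens = [tok.strip(".,!?") for tok in sentence.split()]
    return [tok for tok in tokens if tok.lower() in lexicon]


def name_false_positive_rate(predict, names, template):
    """Share of names tagged as a chemical in a sentence that contains no chemical."""
    if not names:
        return 0.0
    flagged = sum(1 for name in names if name in predict(template.format(name=name)))
    return flagged / len(names)


def f1(gold, pred):
    """Mention-level F1 between gold and predicted chemical mentions."""
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy inputs, not data from the paper.
FEMALE_NAMES = ["Yasmin", "Allegra", "Maria", "Emma"]
MALE_NAMES = ["John", "David", "Michael", "James"]
TEMPLATE = "{name} went for a walk this morning."

female_fpr = name_false_positive_rate(toy_tagger, FEMALE_NAMES, TEMPLATE)
male_fpr = name_false_positive_rate(toy_tagger, MALE_NAMES, TEMPLATE)
fpr_gap = female_fpr - male_fpr

print(f"female FPR={female_fpr:.2f}, male FPR={male_fpr:.2f}, gap={fpr_gap:.2f}")
```

With a real model, `f1` would be computed separately on female- and male-associated text, and the difference between the two scores would serve as the disparity measure. A tagger that misses "birth control" entirely would lower recall on any text that mentions it.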
Abstract
Chemical named entity recognition (NER) models are used in many downstream tasks, from adverse drug reaction identification to pharmacoepidemiology. However, it is unknown whether these models work the same for everyone. Performance disparities can potentially cause harm rather than the intended good. This paper assesses gender-related performance disparities in chemical NER systems. We develop a framework for measuring gender bias in chemical NER models using synthetic data and a newly annotated corpus of over 92,405 words with self-identified gender information from Reddit. Our evaluation of multiple biomedical NER models reveals evident biases. For instance, synthetic data suggests female-related names are frequently misclassified as chemicals, especially for brand name mentions. Additionally, we observe performance disparities between female- and male-associated data in both…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies
