PBa-LLM: Privacy- and Bias-aware NLP using Named-Entity Recognition (NER)
Gonzalo Mancera, Aythami Morales, Julian Fierrez, Ruben Tolosana, Alejandro Penna, Miguel Lopez-Duran, Francisco Jurado, and Alvaro Ortigosa

TL;DR
This paper introduces PBa-LLM, a framework using Named-Entity Recognition to anonymize sensitive data in NLP models, enhancing privacy and reducing bias in high-stakes AI applications like resume scoring.
Contribution
It presents a novel privacy-preserving framework using NER for LLMs and combines bias reduction techniques to create PBa-LLMs, applicable across various NLP tasks.
Findings
Privacy techniques effectively protect candidate confidentiality.
System performance remains stable despite anonymization.
Bias reduction methods successfully mitigate gender bias.
Abstract
The use of Natural Language Processing (NLP) in high-stakes AI-based applications has increased significantly in recent years, especially since the emergence of Large Language Models (LLMs). However, despite their strong performance, LLMs introduce important legal/ethical concerns, particularly regarding privacy, data protection, and transparency. Due to these concerns, this work explores the use of Named-Entity Recognition (NER) to facilitate the privacy-preserving training (or adaptation) of LLMs. We propose a framework that uses NER technologies to anonymize sensitive information in text data, such as personal identities or geographic locations. An evaluation of the proposed privacy-preserving learning framework was conducted to measure its impact on user privacy and system performance in a particular high-stakes and sensitive setup: AI-based resume scoring for recruitment…
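The anonymization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the summary here does not state which NER model or replacement strategy the paper uses. The toy gazetteer `TOY_GAZETTEER`, the helper names `toy_ner` and `anonymize`, and the `[LABEL]` placeholder format are all hypothetical stand-ins chosen so the example runs without external dependencies; in practice a trained NER model would supply the entity spans.

```python
import re
from typing import List, Tuple

# Hypothetical entity span: (start offset, end offset, label).
Span = Tuple[int, int, str]

# Assumption: a tiny lookup table stands in for a real NER model.
TOY_GAZETTEER = {
    "PERSON": ["Maria Lopez", "John Smith"],
    "LOC": ["Madrid", "Boston"],
}


def toy_ner(text: str) -> List[Span]:
    """Return entity spans found by exact gazetteer matching (illustrative only)."""
    spans: List[Span] = []
    for label, names in TOY_GAZETTEER.items():
        for name in names:
            for match in re.finditer(re.escape(name), text):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


def anonymize(text: str, spans: List[Span]) -> str:
    """Replace each entity span with a placeholder such as [PERSON] or [LOC]."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip spans that overlap one already replaced.
            continue
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


resume = "Maria Lopez worked in Madrid before moving to Boston."
print(anonymize(resume, toy_ner(resume)))
# prints: [PERSON] worked in [LOC] before moving to [LOC].
```

Keeping span detection separate from replacement means the gazetteer could be swapped for any NER system that returns character offsets and labels, without changing `anonymize`.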
Taxonomy
Topics: Topic Modeling · Privacy-Preserving Technologies in Data · Data Quality and Management
