A Comparative Study of Light-weight Language Models for PII Masking and their Deployment for Real Conversational Texts

Prabigya Acharya; Liza Shrestha

arXiv:2512.18608·cs.CL·December 23, 2025

Open Access

TL;DR

This study compares lightweight encoder-decoder and decoder-only models for PII masking in conversational texts. It shows that they can perform comparably to large models while offering better efficiency and control, which makes them candidates for real-time applications.

Contribution

The paper demonstrates that lightweight models such as T5-small and Mistral-Instruct-v0.3 can match large models in PII masking performance, and it highlights their advantages in efficiency and controllability for deployment.

Findings

01

Lightweight models achieve PII masking performance comparable to that of large models.

02

Label normalization improves PII masking accuracy across models.

03

Mistral has higher recall but greater latency; T5 offers better control and lower cost.
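
Finding 02 concerns label normalization, which can be pictured as collapsing fine-grained PII tags into a smaller standardized set before training or scoring. The sketch below is illustrative only: the bracketed placeholder format, the `LABEL_MAP` entries, and the function name `normalize_labels` are assumptions made for this example, not the mapping or the 24 categories used in the paper.

```python
import re

# Hypothetical mapping from fine-grained tags to standardized categories.
# These entries are assumptions for illustration, not the paper's label set.
LABEL_MAP = {
    "FIRSTNAME": "NAME",
    "LASTNAME": "NAME",
    "MIDDLENAME": "NAME",
    "PHONENUMBER": "PHONE",
    "TELEPHONENUM": "PHONE",
}

# Matches placeholders such as [FIRSTNAME] or [FIRSTNAME_1] (assumed format).
PLACEHOLDER = re.compile(r"\[([A-Z_]+?)(?:_\d+)?\]")


def normalize_labels(masked_text: str) -> str:
    """Rewrite every placeholder to its standardized category.

    Tags that are not in LABEL_MAP are kept, minus any numeric suffix.
    """
    def repl(match: re.Match) -> str:
        raw = match.group(1)
        return f"[{LABEL_MAP.get(raw, raw)}]"

    return PLACEHOLDER.sub(repl, masked_text)


example = "Hi [FIRSTNAME_1] [LASTNAME_1], call [PHONENUMBER] or mail [EMAIL]."
normalized = normalize_labels(example)
```

With a mapping like this, a prediction of `[FIRSTNAME]` against a reference of `[LASTNAME]` would no longer count as a type error, which is one plausible reason normalization could raise measured accuracy.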

Abstract

Automated masking of Personally Identifiable Information (PII) is critical for privacy-preserving conversational systems. While current frontier large language models demonstrate strong PII masking capabilities, concerns about data handling and computational costs motivate exploration of whether lightweight models can achieve comparable performance. We compare encoder-decoder and decoder-only architectures by fine-tuning T5-small and Mistral-Instruct-v0.3 on English datasets constructed from the AI4Privacy benchmark. We create different dataset variants to study label standardization and PII representation, covering 24 standardized PII categories and higher-granularity settings. Evaluation using entity-level and character-level metrics, type accuracy, and exact match shows that both lightweight models achieve performance comparable to frontier LLMs for PII masking tasks. Label…
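
The abstract names four kinds of evaluation: entity-level metrics, character-level metrics, type accuracy, and exact match. The truncated text here does not give the paper's exact definitions, so the following is a minimal sketch of one common formulation, assuming that each PII entity is a `(start, end, label)` character span. All function names and the sample data are hypothetical.

```python
from typing import List, Tuple

Entity = Tuple[int, int, str]  # (start, end_exclusive, label); assumed representation


def prf(tp: int, n_pred: int, n_gold: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from raw counts, guarding against division by zero."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def entity_level(gold: List[Entity], pred: List[Entity]) -> Tuple[float, float, float]:
    """A predicted entity counts only if its span boundaries match a gold span exactly."""
    gold_spans = {(s, e) for s, e, _ in gold}
    pred_spans = {(s, e) for s, e, _ in pred}
    return prf(len(gold_spans & pred_spans), len(pred_spans), len(gold_spans))


def char_level(gold: List[Entity], pred: List[Entity]) -> Tuple[float, float, float]:
    """Partial credit: score each masked character position individually."""
    gold_chars = {i for s, e, _ in gold for i in range(s, e)}
    pred_chars = {i for s, e, _ in pred for i in range(s, e)}
    return prf(len(gold_chars & pred_chars), len(pred_chars), len(gold_chars))


def type_accuracy(gold: List[Entity], pred: List[Entity]) -> float:
    """Among predictions whose span matches a gold span, the fraction with the right label."""
    gold_labels = {(s, e): label for s, e, label in gold}
    matched = [(label, gold_labels[(s, e)]) for s, e, label in pred if (s, e) in gold_labels]
    if not matched:
        return 0.0
    return sum(p == g for p, g in matched) / len(matched)


def exact_match(gold_masked: str, pred_masked: str) -> bool:
    """True only if the whole masked output equals the reference, ignoring outer whitespace."""
    return gold_masked.strip() == pred_masked.strip()


# Sample utterance: "Hi, I'm Ana Diaz, call 555-0100."
gold = [(8, 16, "NAME"), (23, 31, "PHONE")]
pred = [(8, 16, "NAME"), (23, 29, "PHONE")]  # phone span cut two characters short

entity_scores = entity_level(gold, pred)
char_scores = char_level(gold, pred)
type_acc = type_accuracy(gold, pred)
```

In this sample the truncated phone span fails at the entity level but still earns most of its credit at the character level, which illustrates why reporting both views can be informative.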

Taxonomy

Topics: Topic Modeling · Privacy, Security, and Data Protection · Mobile Crowdsensing and Crowdsourcing