Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models

Anuj Sadani; Deepak Kumar

arXiv:2605.13538 · cs.CL · May 14, 2026

TL;DR

This paper introduces locale-conditioned few-shot prompting for small language models performing on-device PII substitution. The technique reduces demonstration regurgitation; in the reported comparisons, the hybrid approach outperforms faker in multilingual evaluation, while faker yields better downstream NER performance.

Contribution

It proposes a novel locale-conditioned rotating few-shot prompting method that mitigates demonstration copying in small language models for PII redaction tasks.
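As a rough illustration of what locale-conditioned rotating few-shot prompting could look like, the sketch below draws demonstrations from a pool for the input's locale and shifts the selection window on every call, so no fixed set of examples is always present to be copied. The pool contents, the prompt wording, and the names `DEMO_POOLS`, `select_demos`, and `build_prompt` are all assumptions made for this example; the paper's actual prompt format is not given in this summary.

```python
# Hypothetical demonstration pools keyed by locale.
# Each entry is an (original, surrogate) pair; all values are invented.
DEMO_POOLS = {
    "en_US": [
        ("John Smith", "Mark Ellis"),
        ("Mary Jones", "Laura Pike"),
        ("Peter Brown", "Alan Reed"),
        ("Susan Clark", "Helen Ford"),
    ],
    "de_DE": [
        ("Hans Müller", "Karl Vogel"),
        ("Anna Schmidt", "Lena Brandt"),
        ("Jonas Weber", "Felix Kuhn"),
        ("Marie Koch", "Clara Lenz"),
    ],
}


def select_demos(locale, call_index, k=3):
    """Pick k demonstrations for this locale, rotating the window per call."""
    pool = DEMO_POOLS[locale]
    start = (call_index * k) % len(pool)
    return [pool[(start + i) % len(pool)] for i in range(k)]


def build_prompt(entity, entity_type, locale, call_index, k=3):
    """Assemble a few-shot prompt whose demonstrations match the locale."""
    lines = [f"Replace the {entity_type} with a different, realistic {locale} {entity_type}."]
    for original, surrogate in select_demos(locale, call_index, k):
        lines.append(f"Input: {original}\nOutput: {surrogate}")
    lines.append(f"Input: {entity}\nOutput:")
    return "\n".join(lines)
```

Calling `build_prompt("Greta Fischer", "name", "de_DE", call_index=0)` and then again with `call_index=1` produces prompts with different demonstration sets, both restricted to the German pool.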

Findings

01. Locale-conditioned prompting fixes demonstration regurgitation.
02. Hybrid perplexity outperforms faker in multilingual evaluation.
03. Faker yields better downstream NER performance than hybrid prompting.

Abstract

Personally Identifiable Information (PII) redaction usually replaces detected entities with placeholder tokens such as [PERSON], destroying the downstream utility of the redacted text for retrieval and Named Entity Recognition (NER) training. We propose a fully on-device pipeline that substitutes PII with consistent, type-preserving fake values: a 1.5 B mixture-of-experts token classifier (openai/privacy-filter) detects spans, a 1-bit Bonsai-1.7B Small Language Model (SLM) proposes contextual surrogates for names, addresses, and dates, and a rule-based generator (faker) handles patterned fields. We report a prompting finding more important than the quantization choice: with naive fixed three-shot demonstrations, the 1-bit SLM regurgitates demonstration outputs verbatim regardless of input; 1.58-bit Ternary-Bonsai-1.7B reproduces byte-identical failures, ruling out quantization as the…
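The routing described in the abstract (contextual types go to the SLM, patterned fields go to a rule-based generator, and repeated entities receive the same surrogate) might be sketched as follows. This is a minimal sketch under stated assumptions: the type labels in `CONTEXTUAL_TYPES`, the `(start, end, type)` span format, and the `slm_propose` and `rule_generate` callables are hypothetical stand-ins, not the paper's interfaces, and no real detector, model, or faker call is made.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end, entity_type); assumed format

# Assumed labels for the types the abstract routes to the SLM
# (names, addresses, dates); everything else goes to the rule-based path.
CONTEXTUAL_TYPES = {"PERSON", "ADDRESS", "DATE"}


def substitute(
    text: str,
    spans: List[Span],
    slm_propose: Callable[[str, str], str],
    rule_generate: Callable[[str, str], str],
) -> str:
    """Replace each detected span with a surrogate, reusing surrogates
    so the same (type, value) pair always maps to the same fake value."""
    cache: Dict[Tuple[str, str], str] = {}
    pieces: List[str] = []
    cursor = 0
    for start, end, etype in sorted(spans):
        original = text[start:end]
        key = (etype, original)
        if key not in cache:
            proposer = slm_propose if etype in CONTEXTUAL_TYPES else rule_generate
            cache[key] = proposer(original, etype)
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(cache[key])          # consistent surrogate
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

Caching on the `(type, value)` pair is one simple way to get consistency within a document; it assumes spans do not overlap and that identical surface strings refer to the same entity.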
