Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models
Anuj Sadani, Deepak Kumar

TL;DR
This paper introduces a locale-conditioned few-shot prompting technique for small language models that reduces demonstration regurgitation in on-device PII substitution. The hybrid approach does better in multilingual evaluation, while faker gives better downstream NER performance.
Contribution
It proposes a novel locale-conditioned rotating few-shot prompting method that mitigates demonstration copying in small language models for PII redaction tasks.
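As a rough illustration of the idea, the sketch below builds a few-shot prompt whose demonstrations are drawn from a locale-specific pool and rotated per input, then checks whether a model output merely copies a demonstration. This is a minimal sketch under stated assumptions, not the authors' implementation: the pool contents, the hash-based rotation, the prompt wording, and the names `DEMO_POOL`, `pick_demos`, `build_prompt`, and `is_regurgitated` are all hypothetical.

```python
import hashlib

# Hypothetical demonstration pool keyed by locale (not taken from the paper).
# Each entry is (original name, surrogate name).
DEMO_POOL = {
    "en_US": [("John Smith", "Mark Taylor"), ("Emily Clark", "Sarah Lewis"),
              ("David Brown", "Paul Adams"), ("Anna White", "Laura Hall")],
    "de_DE": [("Hans Müller", "Karl Weber"), ("Anna Schmidt", "Lena Fischer"),
              ("Peter Wagner", "Jonas Becker"), ("Julia Hoffmann", "Marie Koch")],
}

def pick_demos(locale, text, k=3):
    """Pick k demonstrations for this locale, rotated by a hash of the input.

    Assumption: "rotating" means the demonstration window shifts between
    inputs so the model does not always see the same fixed examples.
    """
    pool = DEMO_POOL.get(locale, DEMO_POOL["en_US"])  # fall back to en_US
    k = min(k, len(pool))
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    offset = int(digest, 16) % len(pool)
    return [pool[(offset + i) % len(pool)] for i in range(k)]

def build_prompt(locale, entity, text, k=3):
    """Assemble a plain-text few-shot prompt (the wording is illustrative)."""
    lines = [f"Replace the name with a different, realistic {locale} name."]
    for original, surrogate in pick_demos(locale, text, k):
        lines.append(f"Input: {original}\nOutput: {surrogate}")
    lines.append(f"Input: {entity}\nOutput:")
    return "\n\n".join(lines)

def is_regurgitated(output, demos):
    """True if the model output is a verbatim copy of a demonstration output."""
    return output.strip() in {surrogate for _, surrogate in demos}
```

A caller could reject or retry any surrogate for which `is_regurgitated` returns True; whether the paper uses such a check is not stated in this summary.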
Findings
Locale-conditioned rotating few-shot prompting mitigates demonstration regurgitation.
The hybrid pipeline outperforms faker on perplexity in the multilingual evaluation.
Faker yields better downstream NER performance than the hybrid pipeline.
Abstract
Personally Identifiable Information (PII) redaction usually replaces detected entities with placeholder tokens such as [PERSON], destroying the downstream utility of the redacted text for retrieval and Named Entity Recognition (NER) training. We propose a fully on-device pipeline that substitutes PII with consistent, type-preserving fake values: a 1.5B mixture-of-experts token classifier (openai/privacy-filter) detects spans, a 1-bit Bonsai-1.7B Small Language Model (SLM) proposes contextual surrogates for names, addresses, and dates, and a rule-based generator (faker) handles patterned fields. We report a prompting finding more important than the quantization choice: with naive fixed three-shot demonstrations, the 1-bit SLM regurgitates demonstration outputs verbatim regardless of input; 1.58-bit Ternary-Bonsai-1.7B reproduces byte-identical failures, ruling out quantization as the…
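The "consistent, type-preserving" substitution step described in the abstract can be sketched as follows. This is a minimal sketch under assumptions, not the paper's pipeline: spans are taken as already-detected `(start, end, label)` tuples, and `stub_surrogate` is a hypothetical stand-in for both the SLM and the rule-based generator. The cache is what keeps repeated mentions of the same entity mapped to the same fake value.

```python
# Hypothetical surrogate values per entity type (illustrative only).
SURROGATES = {
    "PERSON": ["Maria Lopez", "Chen Wei", "Tom Baker"],
    "CITY": ["Springfield", "Riverton", "Lakewood"],
}

def stub_surrogate(label, original):
    """Stand-in generator: pick a same-type value deterministically."""
    options = SURROGATES.get(label)
    if not options:
        return f"[{label}]"  # fall back to a placeholder for unknown types
    return options[sum(original.encode("utf-8")) % len(options)]

def substitute(text, spans, make_surrogate=stub_surrogate, cache=None):
    """Replace each (start, end, label) span with a cached surrogate.

    Spans are assumed non-overlapping. They are applied right to left so
    earlier character offsets stay valid while the string changes length.
    """
    cache = {} if cache is None else cache
    out = text
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        original = text[start:end]
        key = (label, original)
        if key not in cache:
            cache[key] = make_surrogate(label, original)
        out = out[:start] + cache[key] + out[end:]
    return out

# Usage: both mentions of "Alice" receive the same surrogate.
cache = {}
redacted = substitute(
    "Alice met Bob. Alice left.",
    [(0, 5, "PERSON"), (10, 13, "PERSON"), (15, 20, "PERSON")],
    cache=cache,
)
```

Passing the same `cache` across documents would extend consistency beyond a single text, at the cost of holding the original-to-surrogate mapping in memory.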
