CivicShield: A Cross-Domain Defense-in-Depth Framework for Securing Government-Facing AI Chatbots Against Multi-Turn Adversarial Attacks

KrishnaSaiReddy Patil

arXiv:2603.29062 · cs.CR · April 1, 2026

TL;DR

CivicShield is a comprehensive, layered defense framework designed to protect government AI chatbots from multi-turn adversarial attacks, combining formal verification, anomaly detection, and human oversight.

Contribution

The paper introduces CivicShield, a seven-layer defense-in-depth framework that combines ideas from network security, formal verification, biological immune systems, aviation safety, and zero-trust cryptography to harden government-facing chatbots against multi-turn adversarial attacks.

Findings

01. Layered defenses reduce attack success probability by a factor of 10 to 100.

02. In simulation, the framework achieves a 72.9% detection rate with a 2.9% false-positive rate.

03. The framework maintains high detection rates for crescendo and drift attacks.
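
To see why stacking layers can produce an order-of-magnitude reduction, here is a minimal sketch that multiplies per-layer bypass probabilities. It assumes the layers fail independently, which is a simplification introduced here and not a model taken from the paper; the function names and the example probabilities are hypothetical and chosen only for illustration.

```python
import math


def combined_bypass_probability(layer_bypass_probs):
    """Probability that an attack slips past every layer.

    ASSUMPTION: layers fail independently, so the per-layer bypass
    probabilities multiply. Correlated failures would make the true
    value higher than this estimate.
    """
    for p in layer_bypass_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return math.prod(layer_bypass_probs)


def reduction_factor(baseline_success, layer_bypass_probs):
    """How many times lower the stacked success probability is than a baseline."""
    combined = combined_bypass_probability(layer_bypass_probs)
    if combined == 0.0:
        return math.inf
    return baseline_success / combined


# Hypothetical numbers, not results from the paper: a single guardrail that
# is bypassed 90% of the time, followed by two further layers that are
# bypassed 50% and 20% of the time.
layers = [0.9, 0.5, 0.2]
stacked = combined_bypass_probability(layers)   # about 0.09
factor = reduction_factor(0.9, layers)          # about 10
```

Under these illustrative numbers the stacked success probability is roughly a tenth of the single-guardrail baseline. If an attack that defeats one layer also tends to defeat the others, the independence assumption breaks down and the real reduction is smaller.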

Abstract

LLM-based chatbots in government services face critical security gaps. Multi-turn adversarial attacks achieve over 90% success against current defenses, and single-layer guardrails are bypassed at similar rates. We present CivicShield, a cross-domain defense-in-depth framework for government-facing AI chatbots. Drawing on network security, formal verification, biological immune systems, aviation safety, and zero-trust cryptography, CivicShield introduces seven defense layers: (1) zero-trust foundation with capability-based access control, (2) perimeter input validation, (3) semantic firewall with intent classification, (4) conversation state machine with safety invariants, (5) behavioral anomaly detection, (6) multi-model consensus verification, and (7) graduated human-in-the-loop escalation. We present a formal threat model covering 8 multi-turn attack families, map the framework to…
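
The seven layers listed in the abstract can be pictured as an ordered chain of checks over the conversation so far. The sketch below is one possible arrangement, not the paper's implementation: the sequential short-circuit order, the `Verdict` and `Layer` types, the `evaluate` function, and both thresholds are assumptions made for illustration. Only the layer names come from the abstract, and most layers are left as pass-through placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"  # hand the conversation to a human reviewer


Check = Callable[[List[str]], Verdict]


@dataclass(frozen=True)
class Layer:
    name: str
    check: Check


def allow_all(turns: List[str]) -> Verdict:
    """Placeholder for layers whose logic is not described in the excerpt."""
    return Verdict.ALLOW


MAX_MESSAGE_CHARS = 2000  # hypothetical limit, not from the paper
MAX_TURNS = 20            # hypothetical limit, not from the paper


def perimeter_check(turns: List[str]) -> Verdict:
    """Toy input validation: reject an over-long latest message."""
    return Verdict.BLOCK if len(turns[-1]) > MAX_MESSAGE_CHARS else Verdict.ALLOW


def escalation_check(turns: List[str]) -> Verdict:
    """Toy escalation rule: very long conversations go to a human."""
    return Verdict.ESCALATE if len(turns) > MAX_TURNS else Verdict.ALLOW


# Layer names follow the abstract; the checks are illustrative stand-ins.
LAYERS = [
    Layer("zero-trust foundation", allow_all),
    Layer("perimeter input validation", perimeter_check),
    Layer("semantic firewall", allow_all),
    Layer("conversation state machine", allow_all),
    Layer("behavioral anomaly detection", allow_all),
    Layer("multi-model consensus verification", allow_all),
    Layer("graduated human-in-the-loop escalation", escalation_check),
]


def evaluate(turns: List[str], layers: List[Layer] = LAYERS) -> Tuple[Verdict, Optional[str]]:
    """Run layers in order; return the first non-ALLOW verdict and its layer name."""
    if not turns:
        raise ValueError("conversation must contain at least one turn")
    for layer in layers:
        verdict = layer.check(turns)
        if verdict is not Verdict.ALLOW:
            return verdict, layer.name
    return Verdict.ALLOW, None
```

Because every check receives the full list of turns and not just the latest message, a layer can in principle react to patterns that only emerge across a conversation, which is the property that matters for multi-turn attacks.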
