Sentra-Guard: A Real-Time Multilingual Defense Against Adversarial LLM Prompts

Md. Mehedi Hasan; Sk Tanzir Mehedi; Ziaur Rahman; Rafid Mostafiz; and Md. Abir Hossain

arXiv:2510.22628·cs.CR·May 4, 2026

TL;DR

Sentra-Guard is a real-time, multilingual, modular system that detects and mitigates adversarial prompts targeting large language models, reporting a 99.96% detection rate and a 0.004% attack success rate.

Contribution

The paper introduces a hybrid classifier-retriever architecture with multilingual support and human-in-the-loop feedback for adaptive defense against adversarial prompts.

Findings

01

Achieves a 99.96% detection rate with an F1 score of 1.00

02

Reduces the attack success rate to 0.004%

03

Outperforms existing baselines such as LlamaGuard-2 and OpenAI Moderation

Abstract

This paper presents a real-time modular defense system named Sentra-Guard. The system detects and mitigates jailbreak and prompt injection attacks targeting large language models (LLMs). The framework uses a hybrid architecture with FAISS-indexed SBERT embedding representations that capture the semantic meaning of prompts, combined with fine-tuned transformer classifiers, which are machine learning models specialized for distinguishing between benign and adversarial language inputs. It identifies adversarial prompts in both direct and obfuscated attack vectors. A core innovation is the classifier-retriever fusion module, which dynamically computes context-aware risk scores that estimate how likely a prompt is to be adversarial based on its content and context. The framework ensures multilingual resilience with a language-agnostic preprocessing layer. This component automatically…
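
The classifier-retriever fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion formula, so everything here is an assumption made for illustration: the weighted-sum rule, the weight `alpha`, the decision `threshold`, and the function names are all hypothetical. A brute-force cosine search over toy vectors stands in for the FAISS index over SBERT embeddings, and `classifier_prob` stands in for the output of a fine-tuned transformer classifier.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieval_score(query_vec, attack_bank):
    """Highest similarity to any known adversarial embedding, floored at 0.

    Stand-in for a nearest-neighbour lookup in a vector index.
    """
    if not attack_bank:
        return 0.0
    return max(0.0, max(cosine(query_vec, v) for v in attack_bank))


def fused_risk(classifier_prob, query_vec, attack_bank, alpha=0.6):
    """Hypothetical fusion rule: weighted sum of both signals.

    alpha weights the classifier; (1 - alpha) weights the retriever.
    """
    return alpha * classifier_prob + (1.0 - alpha) * retrieval_score(
        query_vec, attack_bank
    )


def decide(risk, threshold=0.5):
    """Block the prompt when the fused risk reaches the assumed threshold."""
    return "block" if risk >= threshold else "allow"


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings" of known adversarial prompts.
    bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(decide(fused_risk(0.9, [1.0, 0.0, 0.0], bank)))
    print(decide(fused_risk(0.1, [0.0, 0.0, 1.0], bank)))
```

In this sketch a prompt is blocked only when the combined evidence is strong, so a near match in the retrieval bank can raise the risk of a prompt the classifier scored as borderline. A real deployment would replace the toy vectors with sentence embeddings and the linear scan with an approximate nearest-neighbour index.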
