SecureCAI: Injection-Resilient LLM Assistants for Cybersecurity Operations

Mohammed Himayath Ali; Mohammed Aqib Abdullah; Mohammed Mudassir Uddin; Shahnawaz Alam

arXiv:2601.07835·cs.CR·January 13, 2026

Open Access

TL;DR

SecureCAI is a defense framework for large language models used in cybersecurity operations. It reduces prompt injection attack success rates by 94.7% compared to baseline while maintaining 95.1% accuracy on security tasks, using security-aware guardrails and adaptive constitution evolution.

Contribution

The paper introduces a security-aware framework that extends Constitutional AI with adaptive mechanisms to defend against prompt injection attacks in cybersecurity applications.

Findings

01. Reduces attack success rates by 94.7% compared to baseline

02. Maintains 95.1% accuracy on security tasks

03. Achieves constitution adherence scores above 0.92
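To make the first finding concrete, here is a minimal sketch of the arithmetic, assuming the 94.7% figure is a relative reduction in attack success rate (ASR) against the baseline. The function names and the baseline value of 0.60 are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def relative_reduction(baseline_asr: float, defended_asr: float) -> float:
    """Fraction by which the attack success rate drops relative to baseline."""
    if baseline_asr <= 0:
        raise ValueError("baseline_asr must be positive")
    return (baseline_asr - defended_asr) / baseline_asr


def defended_asr_from_reduction(baseline_asr: float, reduction: float) -> float:
    """Attack success rate that remains after a given relative reduction."""
    return baseline_asr * (1.0 - reduction)


if __name__ == "__main__":
    # HYPOTHETICAL baseline: 60% of injection attempts succeed without a defense.
    hypothetical_baseline = 0.60
    remaining = defended_asr_from_reduction(hypothetical_baseline, 0.947)
    print(f"remaining ASR: {remaining:.4f}")
    print(f"check: {relative_reduction(hypothetical_baseline, remaining):.3f}")
```

Under this reading, the absolute attack success rate after the defense depends on the baseline, which the summary above does not state.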

Abstract

Large Language Models have emerged as transformative tools for Security Operations Centers, enabling automated log analysis, phishing triage, and malware explanation; however, deployment in adversarial cybersecurity environments exposes critical vulnerabilities to prompt injection attacks where malicious instructions embedded in security artifacts manipulate model behavior. This paper introduces SecureCAI, a novel defense framework extending Constitutional AI principles with security-aware guardrails, adaptive constitution evolution, and Direct Preference Optimization for unlearning unsafe response patterns, addressing the unique challenges of high-stakes security contexts where traditional safety mechanisms prove insufficient against sophisticated adversarial manipulation. Experimental evaluation demonstrates that SecureCAI reduces attack success rates by 94.7% compared to baseline…
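The abstract names Direct Preference Optimization (DPO) as the mechanism for unlearning unsafe response patterns. As a rough illustration, the sketch below computes the standard DPO loss for a single preference pair. It is not the paper's training code: treating the safe response as "chosen" and the injection-compliant response as "rejected" is an assumption, and the function name, argument names, and `beta` default are hypothetical.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    ASSUMPTION: "chosen" is the safe response and "rejected" is the
    response that follows the injected instruction.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(z)) = log(1 + exp(-z)), computed in a stable form.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


if __name__ == "__main__":
    # A policy identical to the reference gives a loss of log(2).
    print(dpo_loss(-2.0, -2.0, -2.0, -2.0))
    # Favoring the chosen response more than the reference lowers the loss.
    print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```

The loss falls as the policy raises the likelihood of the chosen response and lowers that of the rejected one, each measured relative to the reference model, which is what makes the objective usable for steering a model away from a specific response pattern.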

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Network Security and Intrusion Detection · Information and Cyber Security