Risk Assessment and Security Analysis of Large Language Models
Xiaoyan Zhang, Dongyang Lyu, Xiaoqi Li

TL;DR
This paper presents a comprehensive framework for dynamic risk assessment and hierarchical defense of large language models, addressing security challenges like data leaks, malicious inputs, and bias in critical applications.
Contribution
It introduces a novel risk assessment system combining static and dynamic indicators, and a hybrid defense model using BERT-CRF, adversarial training, and neural watermarking.
Findings
Effective identification of concealed attacks like role escape
Rapid risk evaluation capabilities demonstrated
Enhanced security in financial industry applications
Abstract
Large language models (LLMs) expose systemic security challenges in high-risk applications, including privacy leaks, bias amplification, and malicious abuse, so there is an urgent need for a dynamic risk assessment and collaborative defence framework that covers their entire life cycle. This paper focuses on the security problems of LLMs in critical application scenarios, such as disclosure of user data, deliberate input of harmful instructions, and model bias. To address these problems, we describe the design of a dynamic risk assessment system and a hierarchical defence system in which different levels of protection cooperate. The risk assessment system evaluates static and dynamic indicators simultaneously. It uses entropy weighting to calculate essential data, such as the frequency of…
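The abstract mentions entropy weighting but is cut off before it says which indicators are weighted. As a rough illustration only, the standard entropy weight method can be sketched as follows. The function name `entropy_weights`, the indicator columns, and the sample values are all hypothetical and are not taken from the paper; the sketch assumes non-negative indicator values where larger means riskier.

```python
import math


def entropy_weights(rows):
    """Standard entropy weight method (illustrative sketch, not the paper's code).

    rows: list of samples, each a list of non-negative indicator values.
    Returns one weight per indicator; the weights sum to 1.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two samples to compute entropy")
    m = len(rows[0])

    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total <= 0:
            # An all-zero indicator carries no information.
            divergences.append(0.0)
            continue
        # Shannon entropy of the column's share distribution,
        # normalised to [0, 1] by dividing by ln(n).
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n)
        # Lower entropy means the indicator discriminates more between samples.
        divergences.append(max(0.0, 1.0 - entropy))

    spread = sum(divergences)
    if spread < 1e-12:
        # No indicator discriminates at all; fall back to equal weights.
        return [1.0 / m] * m
    return [d / spread for d in divergences]


# Hypothetical data: three observation windows, two made-up indicators
# (column 0 is constant across windows, column 1 varies).
samples = [
    [1.0, 1.0],
    [1.0, 5.0],
    [1.0, 10.0],
]
weights = entropy_weights(samples)
print(weights)
```

In this scheme an indicator that is identical across all samples receives a weight near zero, while one that varies strongly receives most of the weight. Whether the paper normalises or combines its indicators in the same way cannot be determined from the truncated abstract.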
