Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis
Matteo Esposito, Francesco Palagiano, Valentina Lenarduzzi, Davide Taibi

TL;DR
This study evaluates the effectiveness of large language models, especially Retrieval-Augmented Generation and fine-tuned versions, in risk analysis, showing they can be faster and uncover hidden risks despite slightly lower accuracy than human experts.
Contribution
It provides the first empirical comparison of LLMs with human experts in mission-critical risk analysis, highlighting their strengths and optimal use cases.
Findings
RAG-assisted LLMs have the lowest hallucination rates
LLMs are quicker and more actionable than humans
Human experts outperform LLMs in accuracy
Abstract
Context. Risk analysis assesses potential risks in specific scenarios. Its principles are context-independent; the same methodology can be applied to a health risk and to an information technology security risk. Risk analysis requires vast knowledge of national and international regulations and standards and is time- and effort-intensive. A large language model can summarize information faster than a human and can be fine-tuned to specific tasks. Aim. Our empirical study investigates the effectiveness of Retrieval-Augmented Generation and fine-tuned LLMs in risk analysis. To our knowledge, no prior study has explored their capabilities in risk analysis. Method. We manually curated 193 unique scenarios leading to 1283 representative samples from over 50 mission-critical analyses archived by the industrial context team in the last five years. We compared the…
Taxonomy
Topics: Risk and Safety Analysis · Software Reliability and Analysis Research · Software Engineering Techniques and Practices
Methods: WordPiece · Label Smoothing · Linear Warmup With Linear Decay · Position-Wise Feed-Forward Layer · Linear Layer · Absolute Position Encodings · Cosine Annealing · Multi-Head Attention
