LogPrécis: Unleashing Language Models for Automated Malicious Log Analysis
Matteo Boffa, Rodolfo Vieira Valentim, Luca Vassio, Danilo Giordano, Idilio Drago, Marco Mellia, Zied Ben Houidi

TL;DR
LogPrécis leverages advanced language models to automatically analyze and categorize large volumes of Unix shell attack logs, aiding cybersecurity experts in understanding attack sequences, detecting new threats, and improving defense strategies.
Contribution
This paper introduces LogPrécis, a novel methodology that uses language models to automatically interpret and abstract malicious shell logs, significantly reducing data complexity and enhancing attack analysis capabilities.
Findings
Successfully processed 400,000 attack sessions into 3,000 fingerprints
Enabled better understanding and tracking of attack sequences and mutations
Open-sourced the LogPrécis tool for community use
Abstract
The collection of security-related logs holds the key to understanding attack behaviors and diagnosing vulnerabilities. Still, their analysis remains a daunting challenge. Recently, Language Models (LMs) have demonstrated unmatched potential in understanding natural and programming languages. The question arises whether and how LMs could be also useful for security experts since their logs contain intrinsically confused and obfuscated information. In this paper, we systematically study how to benefit from the state-of-the-art in LM to automatically analyze text-like Unix shell attack logs. We present a thorough design methodology that leads to LogPrécis. It receives as input raw shell sessions and automatically identifies and assigns the attacker tactic to each portion of the session, i.e., unveiling the sequence of the attacker's goals. We demonstrate LogPrécis capability to…
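As a rough illustration of how per-portion tactic labels could reduce many sessions to a small set of fingerprints, the sketch below groups sessions by their sequence of tactics. It is not the LogPrécis implementation: the tactic names, the session ids, the helper names `fingerprint` and `group_by_fingerprint`, and the choice to collapse consecutive repeated labels are all assumptions made for this example, and the labels are written by hand rather than predicted by a language model.

```python
from collections import defaultdict
from itertools import groupby


def fingerprint(tactics):
    """Collapse consecutive repeated tactic labels into a hashable tuple.

    Assumption of this sketch: a fingerprint is the ordered sequence of
    tactics with adjacent duplicates merged.
    """
    return tuple(label for label, _ in groupby(tactics))


def group_by_fingerprint(labeled_sessions):
    """Map each fingerprint to the list of session ids that share it."""
    groups = defaultdict(list)
    for session_id, tactics in labeled_sessions.items():
        groups[fingerprint(tactics)].append(session_id)
    return dict(groups)


# Hypothetical, hand-labeled sessions: one tactic label per session portion.
labeled_sessions = {
    "s1": ["Discovery", "Discovery", "Execution", "Persistence"],
    "s2": ["Discovery", "Execution", "Execution", "Persistence"],
    "s3": ["Discovery", "Impact"],
}

groups = group_by_fingerprint(labeled_sessions)
for fp, session_ids in groups.items():
    print(" -> ".join(fp), session_ids)
```

Here sessions `s1` and `s2` differ in their raw label lists but share one fingerprint, which is the kind of reduction the Findings describe at a much larger scale.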
Taxonomy
Topics: Software System Performance and Reliability · Software Engineering Research · Network Security and Intrusion Detection
