LLM meets ML: Data-efficient Anomaly Detection on Unstable Logs

Fatemeh Hadadi; Qinghua Xu; Domenico Bianculli; Lionel Briand

arXiv:2406.07467 · cs.SE · October 10, 2025 · 3 cites

Open Access

TL;DR

This paper introduces FlexLog, a hybrid anomaly detection method for unstable logs that combines ML models with a Large Language Model, achieving higher accuracy with less data and maintaining fast inference times.

Contribution

FlexLog is a novel hybrid approach integrating ML models with a Large Language Model for efficient anomaly detection on unstable logs, reducing data requirements and improving performance.

Findings

01

FlexLog outperforms all baselines by at least 1.2 pp in F1 score.

02

FlexLog uses significantly less labeled data (a 62.87 pp reduction).

03

FlexLog maintains inference time under one second per log sequence.

Abstract

Most log-based anomaly detectors assume logs are stable, though logs are often unstable due to software or environmental changes. Anomaly detection on unstable logs (ULAD) is therefore a more realistic, yet under-investigated challenge. Current approaches predominantly employ machine learning (ML) models, which often require extensive labeled data for training. To mitigate data insufficiency, we propose FlexLog, a novel hybrid approach for ULAD that combines ML models (decision tree, k-nearest neighbors, and a feedforward neural network) with a Large Language Model (Mistral) through ensemble learning. FlexLog also incorporates a cache and retrieval-augmented generation (RAG) to further enhance efficiency and effectiveness. To evaluate FlexLog, we configured four datasets for ULAD, namely ADFA-U, LOGEVOL-U, SynHDFS-U, and SYNEVOL-U. FlexLog outperforms all baselines by at least 1.2…
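
The combination of several detectors with a cache, as outlined in the abstract, can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration only: the class `CachedEnsemble`, the majority-vote rule, and the three toy detectors (including `fake_llm_detector`, a stand-in for a real LLM call) are hypothetical and are not taken from the paper, which may combine its models differently. The RAG component is omitted.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

LogSequence = Tuple[str, ...]
Detector = Callable[[LogSequence], int]  # 1 = anomalous, 0 = normal


class CachedEnsemble:
    """Hypothetical majority-vote ensemble with a prediction cache."""

    def __init__(self, detectors: List[Detector]) -> None:
        self.detectors = detectors
        self.cache: Dict[LogSequence, int] = {}
        self.cache_hits = 0

    def predict(self, sequence: Sequence[str]) -> int:
        key = tuple(sequence)
        # Reuse an earlier verdict for an identical sequence, which avoids
        # re-running the (potentially slow) detectors.
        if key in self.cache:
            self.cache_hits += 1
            return self.cache[key]
        # Assumed combination rule: simple majority vote over all detectors.
        votes = Counter(detector(key) for detector in self.detectors)
        label = 1 if votes[1] > votes[0] else 0
        self.cache[key] = label
        return label


# Toy stand-ins for the trained ML models and the LLM (all hypothetical).
def keyword_detector(seq: LogSequence) -> int:
    return int(any("error" in line.lower() for line in seq))


def length_detector(seq: LogSequence) -> int:
    return int(len(seq) > 5)


def fake_llm_detector(seq: LogSequence) -> int:
    # Placeholder for a real LLM call; here just another keyword rule.
    return int(any("fail" in line.lower() for line in seq))


ensemble = CachedEnsemble([keyword_detector, length_detector, fake_llm_detector])
anomalous = ensemble.predict(["open file", "ERROR disk failure"])
normal = ensemble.predict(["open file", "close file"])
repeated = ensemble.predict(["open file", "close file"])  # served from cache
print(anomalous, normal, repeated, ensemble.cache_hits)
```

In this sketch the cache is keyed on the exact log sequence, so only verbatim repeats skip the detectors. A real system would likely need a more tolerant key for unstable logs, since small template changes would otherwise cause cache misses.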

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification