Instruction-Tuned LLMs for Parsing and Mining Unstructured Logs on Leadership HPC Systems
Ahmad Maroof Karimi, Jong Youl Choi, Charles Qing Cao, Awais Khan

TL;DR
This paper introduces a domain-adapted, instruction-following LLM framework for parsing and mining unstructured HPC system logs, achieving high accuracy and practical utility in large-scale log analysis.
Contribution
It presents a hybrid fine-tuning methodology for adapting LLMs to HPC log data, enabling efficient, privacy-preserving log parsing with accuracy comparable to that of larger models.
Findings
Achieves parsing accuracy comparable to that of larger models such as LLaMA 70B.
Successfully analyzed over 600 million production logs from the Frontier supercomputer.
Uncovered operational patterns, anomalies, and workload-error correlations.
Abstract
Leadership-class HPC systems generate massive volumes of heterogeneous, largely unstructured system logs. Because these logs originate from diverse software, hardware, and runtime layers, they exhibit inconsistent formats, making structure extraction and pattern discovery extremely challenging. Robust log parsing and mining are therefore critical to transform this raw telemetry into actionable insights that reveal operational patterns, diagnose anomalies, and enable reliable, efficient, and scalable system analysis. Recent advances in large language models (LLMs) offer a promising new direction for automated log understanding in leadership-class HPC environments. To capitalize on this opportunity, we present a domain-adapted, instruction-following, LLM-driven framework that leverages chain-of-thought (CoT) reasoning to parse and structure HPC logs with high fidelity. Our approach…
