Parser-Free Querying of Security Logs

Evan Luo; Julien Piet; David Wagner

arXiv:2605.22027 · cs.CR · May 22, 2026

TL;DR

Sieve is a system that takes a security analyst's natural-language question about raw logs and generates an executable query to answer it, reducing the need for per-source parsers and manual scripting.

Contribution

The paper introduces a method that grounds a large language model with lightweight, automatically extracted log-format context so that it produces executable query code from natural-language security questions. Each query requires only one LLM call, followed by deterministic execution.

Findings

01. Over 3x reduction in error rate on complex queries

02. Largest gains on multi-line correlation tasks

03. Effective bridging of structured querying and raw log analysis

Abstract

Security analysts routinely query system logs to detect threats and investigate incidents, but each log source uses its own semi-structured format: logs are cheap to produce, but expensive to use. The standard approach, building per-source parsers to normalize logs into structured schemas, is powerful but requires continuous engineering effort for each new format. Querying raw logs directly with tools like grep avoids this cost, but requires analysts to know each source's message variants and cannot express the multi-line temporal queries that security investigations demand. We present Sieve, a system that generates executable query code from natural-language security questions by grounding a large language model with lightweight, automatically extracted log-format context, requiring only one LLM call per query followed by deterministic execution. Evaluating 133 security queries across…
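
To make the phrase "multi-line temporal queries" concrete, here is a minimal sketch of the kind of deterministic query code that could be run over raw logs once it has been written. It is not Sieve's output or implementation: the sshd-style log format, the sample lines, the function name brute_force_then_success, and the thresholds are all assumptions chosen for illustration. The question it answers is "which IPs logged in successfully right after several failed attempts?", which a single grep cannot express because it depends on the order and timing of multiple lines.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Hypothetical sshd-style lines; the exact format is assumed for illustration.
SAMPLE_LOG = """\
2026-05-01T10:00:01 sshd[101]: Failed password for root from 203.0.113.7 port 4242
2026-05-01T10:00:05 sshd[101]: Failed password for root from 203.0.113.7 port 4243
2026-05-01T10:00:09 sshd[101]: Failed password for admin from 203.0.113.7 port 4244
2026-05-01T10:00:30 sshd[102]: Accepted password for admin from 203.0.113.7 port 4245
2026-05-01T10:05:00 sshd[103]: Failed password for bob from 198.51.100.9 port 5000
2026-05-01T10:06:00 sshd[104]: Accepted password for bob from 198.51.100.9 port 5001
"""

# One pattern covering the two message variants this query cares about.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) sshd\[\d+\]: (?P<outcome>Failed|Accepted) password "
    r"for (?P<user>\S+) from (?P<ip>\S+) port \d+$"
)


def brute_force_then_success(lines, min_failures=3, window=timedelta(minutes=5)):
    """Return (ip, user, timestamp) for each accepted login that follows at
    least `min_failures` failed logins from the same IP within `window`.
    Assumes the lines are in chronological order."""
    recent_failures = defaultdict(deque)  # ip -> timestamps of recent failures
    hits = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that are not password events
        ts = datetime.fromisoformat(match["ts"])
        failures = recent_failures[match["ip"]]
        # Drop failures that have fallen out of the time window.
        while failures and ts - failures[0] > window:
            failures.popleft()
        if match["outcome"] == "Failed":
            failures.append(ts)
        elif len(failures) >= min_failures:
            hits.append((match["ip"], match["user"], ts))
    return hits


hits = brute_force_then_success(SAMPLE_LOG.splitlines())
for ip, user, ts in hits:
    print(f"{ts.isoformat()} {ip} logged in as {user} after repeated failures")
```

On the sample lines, only 203.0.113.7 is reported: it has three failures within the window before its accepted login, while 198.51.100.9 has a single failure. Writing such a script by hand requires knowing each source's message variants, which is the cost the abstract attributes to querying raw logs directly.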
