Seven simple steps for log analysis in AI systems

Magda Dubois; Ekin Zorer; Maia Hamin; Joe Skinner; Alexandra Souly; Jerome Wynne; Harry Coppock; Lucas Sato; Sayash Kapoor; Sunishchal Dev; Keno Juchems; Kimberly Mai; Timo Flesch; Lennart Luettgau; Charles Teague; Eric Patey; JJ Allaire; Lorenzo Pacchiardi; Jose Hernandez-Orallo; Cozmin Ududec

arXiv:2604.09563·cs.AI·April 23, 2026

Seven simple steps for log analysis in AI systems

Magda Dubois, Ekin Zorer, Maia Hamin, Joe Skinner, Alexandra Souly, Jerome Wynne, Harry Coppock, Lucas Sato, Sayash Kapoor, Sunishchal Dev, Keno Juchems, Kimberly Mai, Timo Flesch, Lennart Luettgau, Charles Teague, Eric Patey, JJ Allaire, Lorenzo Pacchiardi

PDF

TL;DR

This paper proposes a standardized pipeline for log analysis in AI systems, illustrated with code examples and guidance to improve rigor and reproducibility.

Contribution

It introduces a comprehensive, best-practices-based framework for log analysis in AI, including code and guidance to address current inconsistencies.

Findings

01

Provides a detailed pipeline for log analysis in AI systems.

02

Includes concrete code examples in the Inspect Scout library.

03

Highlights common pitfalls and best practices.

Abstract

AI systems produce large volumes of logs as they interact with tools and users. Analysing these logs can help understand model capabilities, propensities, and behaviours, or assess whether an evaluation worked as intended. Researchers have started developing methods for log analysis, but a standardised approach is still missing. Here we suggest a pipeline based on current best practices. We illustrate it with concrete code examples in the Inspect Scout library, provide detailed guidance on each step, and highlight common pitfalls. Our framework provides researchers with a foundation for rigorous and reproducible log analysis.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.