TL;DR
This paper presents a data-driven, deep learning approach to automatically assess the quality of log instructions in software systems, focusing on log level correctness and linguistic richness.
Contribution
It introduces the first automated method for assessing log instruction quality. The method combines an analysis of quality properties (log level correctness and linguistic richness) with deep learning models, and it outperforms baseline approaches.
Findings
Achieves 0.88 accuracy in log level correctness assessment
Achieves an F1 score of 0.99 in linguistic structure assessment
Demonstrates effectiveness on large-scale open-source systems
Abstract
In the current IT world, developers write code while system operators run the code mostly as a black box. The connection between both worlds is typically established with log messages: the developer gives the (unknown) operator hints about where the cause of an issue lies, and, conversely, the operator can report bugs found during operation. To fulfil this purpose, developers write log instructions: structured text commonly composed of a log level (e.g., "info", "error"), static text (e.g., "IP {} cannot be reached"), and dynamic variables (e.g., IP {}). However, in contrast to well-adopted coding practices, there are no widely adopted guidelines on how to write log instructions with good quality properties. For example, a developer may assign a high log level (e.g., "error") to a trivial event, which can confuse the operator and increase maintenance costs. Or the static text can be…
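The three parts of a log instruction named in the abstract (log level, static text, dynamic variables) can be illustrated with a small sketch. This is not the paper's method or data model: the `LogInstruction` class and the `describe` and `render` helpers are hypothetical names introduced here only to show how the parts relate, and the IP address is made up.

```python
from dataclasses import dataclass


@dataclass
class LogInstruction:
    """Hypothetical container for the parts described in the abstract."""
    level: str        # log level, e.g. "info" or "error"
    static_text: str  # template text with "{}" placeholders
    n_variables: int  # number of dynamic variables (placeholders)


def describe(level: str, template: str) -> LogInstruction:
    """Split a log instruction into level, static text, and variable count."""
    return LogInstruction(
        level=level.lower(),
        static_text=template,
        n_variables=template.count("{}"),
    )


def render(instr: LogInstruction, *values: object) -> str:
    """Fill the placeholders in order to produce the runtime log message."""
    if len(values) != instr.n_variables:
        raise ValueError(
            f"expected {instr.n_variables} values, got {len(values)}"
        )
    return f"{instr.level.upper()} " + instr.static_text.format(*values)


# The example from the abstract: one level, one template, one variable.
instr = describe("error", "IP {} cannot be reached")
message = render(instr, "10.0.0.7")
```

Under these assumptions, a quality check on log level would take `instr.level` and `instr.static_text` as input, while a linguistic check would look only at `instr.static_text`.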
