# Scalable Inference of System-level Models from Component Logs

**Authors:** Donghwan Shin, Salma Messaoudi, Domenico Bianculli, Annibale Panichella, Lionel Briand, Raimondas Sasnauskas

arXiv: 1908.02329 · 2020-04-17

## TL;DR

This paper introduces SCALER, a scalable method for inferring system-level behavioral models from component logs, even with incomplete communication information, by combining component models based on high-level architecture dependencies.

## Contribution

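SCALER is a divide-and-conquer approach: it first infers a model of each component from that component's logs, then merges the component models according to the dependencies among components, as reflected in the logs. In the reported evaluation, this improves both scalability and accuracy over a state-of-the-art tool.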

## Key findings

- SCALER can process much larger logs than a state-of-the-art model inference tool.
- The models SCALER yields are more accurate than those produced by that tool.
- The evaluation measured scalability and accuracy on a dataset of logs from an industrial system.

## Abstract

Behavioral software models play a key role in many software engineering tasks; unfortunately, these models either are not available during software development or, if available, quickly become outdated as the implementations evolve. Model inference techniques have been proposed as a viable solution to extract finite-state models from execution logs. However, existing techniques do not scale well when processing very large logs, such as system-level logs obtained by combining component-level logs. Furthermore, in the case of component-based systems, existing techniques assume that the definitions of communication channels between components are known. However, this information is usually not available in the case of systems integrating third-party components with limited documentation. In this paper, we address the scalability problem of inferring the model of a component-based system from the individual component-level logs, when the only available information about the system is the set of high-level architecture dependencies among components and a (possibly incomplete) list of log message templates denoting communication events between components. Our model inference technique, called SCALER, follows a divide-and-conquer approach. The idea is to first infer a model of each system component from the corresponding logs; then, the individual component models are merged together taking into account the dependencies among components, as reflected in the logs. We evaluated SCALER in terms of scalability and accuracy, using a dataset of logs from an industrial system; the results show that SCALER can process much larger logs than a state-of-the-art tool, while yielding more accurate models.
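
To make the two-phase idea in the abstract concrete, here is a minimal Python sketch. It is not SCALER's algorithm, which this summary does not describe in detail. It assumes each component log is already parsed into sequences of event templates, uses a very simple "last event seen" transition graph as a stand-in for a real model inference step, and stitches models with a naive rule: a communication event in the depending component leads to the first events of the component it depends on. The function names (`infer_component_model`, `merge_models`), the data shapes, and the stitching rule are all hypothetical.

```python
from collections import defaultdict


def infer_component_model(traces):
    """Phase 1 (stand-in): build a transition graph for one component.

    A state is simply the last event template seen; real inference
    techniques produce richer finite-state models.
    """
    model = defaultdict(set)
    for trace in traces:
        prev = "START"
        for event in trace:
            model[prev].add(event)
            prev = event
        model[prev].add("END")
    return dict(model)


def merge_models(models, dependencies, comm_events):
    """Phase 2 (naive assumption): stitch component models together.

    dependencies: iterable of (depending_component, required_component).
    comm_events:  maps a component to the event templates known to
                  denote communication (the list may be incomplete).
    """
    system = {}
    # Copy every component model, namespacing states by component.
    for comp, model in models.items():
        for src, dsts in model.items():
            system.setdefault((comp, src), set()).update(
                (comp, dst) for dst in dsts
            )
    # Link each known communication event to the entry events of the
    # component that the sender depends on.
    for sender, receiver in dependencies:
        entry_events = models[receiver].get("START", set())
        for event in comm_events.get(sender, ()):
            if (sender, event) in system:
                system[(sender, event)].update(
                    (receiver, dst) for dst in entry_events
                )
    return system


# Hypothetical logs for two components, A depending on B.
logs = {
    "A": [["init", "send_req", "done"]],
    "B": [["recv_req", "process"]],
}
models = {comp: infer_component_model(t) for comp, t in logs.items()}
system = merge_models(models, [("A", "B")], {"A": {"send_req"}})
```

Because each component model is built only from that component's own log, the per-component step never has to consider the interleavings of a combined system-level log, which is the intuition behind the scalability claim. The merge rule above is deliberately simplistic and ignores how dependencies are actually reflected in the logs.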

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.02329/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/1908.02329/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1908.02329/full.md

---
Source: https://tomesphere.com/paper/1908.02329