An efficient high-quality hierarchical clustering algorithm for automatic inference of software architecture from the source code of a software system
Sarge Rogatch

TL;DR
This paper presents a high-quality hierarchical clustering algorithm that automatically infers software architecture from source code, enabling better comprehension and maintenance of large, complex object-oriented systems.
Contribution
The paper introduces a novel efficient hierarchical clustering algorithm that reconstructs software architecture at the subsystem level from source code, aiding comprehension and maintenance.
Findings
Reconstructs architectural diagrams from source code automatically
Facilitates software maintenance: change requests can be performed substantially faster
Enables understanding of complex software systems at subsystem level
Abstract
This paper presents a high-quality algorithm for hierarchical clustering of large software source code. It effectively breaks down the complexity of tens of millions of lines of source code, so that a software engineer can comprehend a software system at a high level by looking at its architectural diagram, which is reconstructed automatically from the system's source code. The architectural diagram shows a tree of subsystems with OOP classes at its leaves (in other words, a nested software decomposition). The tool reconstructs missing (inconsistent, incomplete, or nonexistent) architectural documentation for a software system from its source code. This facilitates software maintenance: change requests can be performed substantially faster. Simply put, this unique tool lifts the comprehensible grain of object-oriented software systems from OOP class-level…
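To make the idea of a nested software decomposition concrete, here is a minimal Python sketch of a tree of subsystems with OOP classes at its leaves. It illustrates only the output data structure, not the paper's clustering algorithm. The names `Subsystem`, `leaf_classes`, `render`, and the toy class names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Subsystem:
    """An inner node of the decomposition; leaves are plain class names (str)."""
    name: str
    children: List[Union["Subsystem", str]] = field(default_factory=list)


def leaf_classes(node: Union[Subsystem, str]) -> List[str]:
    """Collect every class name under a node, left to right."""
    if isinstance(node, str):
        return [node]
    result: List[str] = []
    for child in node.children:
        result.extend(leaf_classes(child))
    return result


def render(node: Union[Subsystem, str], depth: int = 0) -> List[str]:
    """Return an indented text outline of the tree, one line per node."""
    indent = "  " * depth
    if isinstance(node, str):
        return [f"{indent}{node}"]
    lines = [f"{indent}[{node.name}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical toy system: two subsystems, one of them nested further.
system = Subsystem("system", [
    Subsystem("ui", ["MainWindow", "Dialog"]),
    Subsystem("storage", [Subsystem("cache", ["LruCache"]), "Database"]),
])

if __name__ == "__main__":
    print("\n".join(render(system)))
```

An engineer reading such an outline sees subsystems first and individual classes only when drilling down, which is the sense in which the comprehensible grain is lifted above the class level.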
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Advanced Software Engineering Methodologies
