TL;DR
This paper evaluates two community detection algorithms for identifying software components in Java systems, assessing how semantically cohesive and well separated the extracted components are.
Contribution
It compares how effectively the Leiden and Infomap algorithms extract semantically cohesive and well-separated software components from Java systems.
Findings
Leiden produces better-separated, less cohesive components.
Infomap creates more cohesive, overlapping clusters.
Both algorithms show good overall performance.
Abstract
Component Based Software Engineering (CBSE) seeks to promote the reuse of software by integrating existing software modules into the development process. However, such reusable components are not immediately available, and obtaining them is costly and time-consuming. As an alternative, components can be extracted from pre-existing object-oriented (OO) software. In this work, we evaluate two community detection algorithms for the task of software component identification. Treating 'components' as 'communities', the aim is to evaluate how independent, yet cohesive, the components are when extracted by structurally informed algorithms. We analyze 412 Java systems and evaluate the cohesion of the extracted communities using four document representation techniques. The evaluation aims to find which algorithm extracts the most semantically cohesive, yet separated communities. The results show a good…
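As a rough illustration of what scoring communities for semantic cohesion and separation can look like, the sketch below computes the mean pairwise cosine similarity inside a community and one minus the mean similarity across two communities. It is not the paper's method: the bag-of-words representation stands in for the four unnamed document representation techniques, the class names and texts are invented toy data, and the functions `tf_vector`, `cohesion`, and `separation` are hypothetical helpers.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vector(text):
    """Bag-of-words term counts (an assumed stand-in document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cohesion(community, docs):
    """Mean pairwise similarity among the classes of one community."""
    pairs = list(combinations(community, 2))
    if not pairs:
        return 1.0  # assumption: a single-class community is trivially cohesive
    return sum(cosine(docs[a], docs[b]) for a, b in pairs) / len(pairs)


def separation(first, second, docs):
    """One minus the mean cross-community similarity; higher is better separated."""
    sims = [cosine(docs[a], docs[b]) for a in first for b in second]
    return 1.0 - sum(sims) / len(sims)


# Hypothetical toy data: class name -> identifiers and comments as plain text.
class_text = {
    "OrderService": "order create order cancel payment",
    "OrderRepository": "order save order find",
    "HttpClient": "http request response socket",
    "HttpParser": "http parse request header",
}
docs = {name: tf_vector(text) for name, text in class_text.items()}

# Communities as a detection algorithm might return them (assumed, not computed here).
communities = [["OrderService", "OrderRepository"], ["HttpClient", "HttpParser"]]

for community in communities:
    print(community, round(cohesion(community, docs), 3))
print("separation:", round(separation(communities[0], communities[1], docs), 3))
```

In this toy case the two communities share no vocabulary, so they are fully separated, while each community's cohesion depends on how much its classes' terms overlap. A real pipeline would obtain the communities from a dependency graph and would likely use richer document representations.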
