Embedding Software Intent: Lightweight Java Module Recovery
Yirui He, Yuqi Huai, Xingyu Chen, Joshua Garcia

TL;DR
This paper introduces ClassLAR, a lightweight language model-based method for recovering Java modules from monolithic systems, significantly improving accuracy and speed over existing techniques.
Contribution
The paper proposes ClassLAR, an approach that uses language models to recover JPMS modules from monolithic Java projects, addressing the ineffective module recovery of prior architecture recovery techniques.
Findings
Outperformed state-of-the-art techniques on similarity metrics
Executed 3.99 to 10.50 times faster than existing techniques
Successfully recovered modules in 20 Java projects
Abstract
As an increasing number of software systems reach unprecedented scale, relying solely on code-level abstractions is becoming impractical. While architectural abstractions offer a means to manage these systems, maintaining their consistency with the actual code has been problematic. The Java Platform Module System (JPMS), introduced in Java 9, addresses this limitation by enabling explicit module specification at the language level. JPMS enhances architectural implementation through improved encapsulation and direct specification of ground-truth architectures within Java projects. Although many projects are written in Java, modularizing existing monolithic projects into JPMS modules remains an open challenge because existing architecture recovery techniques recover modules ineffectively. To address this challenge, this paper presents ClassLAR (Class- and Language model-based Architectural…
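The abstract's point about explicit module specification and encapsulation in JPMS can be illustrated with a small sketch. It is not taken from the paper and does not show how ClassLAR works. It only uses the JDK's `java.lang.module.ModuleDescriptor` API to build, in memory, the kind of declaration a `module-info.java` file would hold. The module name `com.example.orders` and its packages are hypothetical, chosen purely for illustration.

```java
import java.lang.module.ModuleDescriptor;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only: all module and package names are hypothetical.
class JpmsDescriptorSketch {

    // Builds the in-memory equivalent of this module-info.java:
    //
    //   module com.example.orders {
    //       requires java.sql;
    //       exports com.example.orders.api;
    //   }
    //
    // plus one package that belongs to the module but is not exported.
    static ModuleDescriptor ordersModule() {
        return ModuleDescriptor.newModule("com.example.orders")
                .requires("java.sql")
                .exports("com.example.orders.api")
                .packages(Set.of("com.example.orders.internal"))
                .build();
    }

    // Packages that other modules are allowed to access.
    static Set<String> exportedPackages(ModuleDescriptor descriptor) {
        return descriptor.exports().stream()
                .map(ModuleDescriptor.Exports::source)
                .collect(Collectors.toSet());
    }

    // Packages that exist in the module but stay encapsulated.
    static Set<String> concealedPackages(ModuleDescriptor descriptor) {
        Set<String> concealed = new HashSet<>(descriptor.packages());
        concealed.removeAll(exportedPackages(descriptor));
        return concealed;
    }

    public static void main(String[] args) {
        ModuleDescriptor descriptor = ordersModule();
        System.out.println("module:    " + descriptor.name());
        System.out.println("exported:  " + exportedPackages(descriptor));
        System.out.println("concealed: " + concealedPackages(descriptor));
    }
}
```

The split between exported and concealed packages is the encapsulation boundary that a module declaration makes explicit. A module recovery technique has to decide which classes and packages belong on each side of such a boundary.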
Taxonomy
Topics: Software Engineering Research · Advanced Software Engineering Methodologies · Software System Performance and Reliability
