Code Vectorization and Sequence of Accesses Strategies for Monolith Microservices Identification
Vasco Faria, António Rito Silva

TL;DR
This paper enriches the sequence-of-accesses approach to monolith-to-microservices migration by vectorizing the monolith's functionalities with Code2Vec, and evaluates the resulting decompositions across multiple codebases.
Contribution
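It introduces two strategies for representing a functionality with code vectors: one aggregates the vectors of the methods in the functionality's call graph, and the other extends the sequence-of-accesses approach with vectorization of the accessed entities.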
Findings
Vectorization improves decomposition quality metrics.
Proposed strategies outperform baseline sequence approach.
Evaluation across diverse codebases confirms effectiveness.
Abstract
Migrating a monolith application into a microservices architecture can benefit from automation methods, which speed up the migration and improve the decomposition results. One of the current approaches that guide software architects on the migration is to group monolith domain entities into microservices, using the sequences of accesses of the monolith functionalities to the domain entities. In this paper, we enrich the sequence of accesses solution by applying code vectorization to the monolith, using the Code2Vec neural network model. We apply Code2Vec to vectorize the monolith functionalities. We propose two strategies to represent a functionality, one by aggregating its call graph methods vectors, and the other by extending the sequence of accesses approach with vectorization of the accessed entities. To evaluate these strategies, we compare the proposed strategies…
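The first strategy in the abstract, representing a functionality by aggregating the vectors of the methods in its call graph, can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the method names, the two-dimensional toy vectors (real Code2Vec embeddings are much larger and come from the trained model), the use of a plain mean as the aggregation, and cosine similarity as the comparison between functionalities. The abstract does not specify these choices.

```python
from math import sqrt


def reachable_methods(call_graph, entry):
    """Return the set of methods reachable from `entry`, including `entry`."""
    seen, stack = set(), [entry]
    while stack:
        method = stack.pop()
        if method in seen:
            continue
        seen.add(method)
        stack.extend(call_graph.get(method, []))
    return seen


def functionality_vector(call_graph, method_vectors, entry):
    """Aggregate the vectors of all reachable methods.

    Assumption: the aggregation is an unweighted mean; the paper may
    weight methods differently.
    """
    methods = [m for m in reachable_methods(call_graph, entry) if m in method_vectors]
    if not methods:
        raise ValueError(f"no vectors available for functionality {entry!r}")
    dim = len(method_vectors[methods[0]])
    return [sum(method_vectors[m][i] for m in methods) / len(methods) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical call graph: functionality entry point -> called methods.
call_graph = {
    "createOrder": ["validateCart", "saveOrder"],
    "cancelOrder": ["loadOrder", "saveOrder"],
}

# Toy stand-ins for per-method Code2Vec vectors (not real model output).
method_vectors = {
    "createOrder": [1.0, 0.0],
    "validateCart": [0.0, 1.0],
    "saveOrder": [1.0, 1.0],
    "cancelOrder": [1.0, 0.0],
    "loadOrder": [1.0, 1.0],
}

create_vec = functionality_vector(call_graph, method_vectors, "createOrder")
cancel_vec = functionality_vector(call_graph, method_vectors, "cancelOrder")
similarity = cosine_similarity(create_vec, cancel_vec)
```

In a sketch like this, pairwise similarities between functionality vectors could feed a clustering step that groups related functionalities or entities into candidate microservices; how the paper performs that step is not stated in the abstract.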
Taxonomy
Topics: Software System Performance and Reliability · Software Engineering Research · Cloud Computing and Resource Management
