Towards the Automated Extraction and Refactoring of NoSQL Schemas from Application Code
Carlos J. Fernandez-Candel, Anthony Cleve, Jesus J. Garcia-Molina

TL;DR
This paper introduces a static analysis method to automatically extract and refactor NoSQL schemas from application code, enabling better schema understanding and optimization.
Contribution
It proposes a platform-independent model-driven approach for extracting and refactoring NoSQL schemas from application code, including detection of join-like patterns and duplication strategies.
Findings
Accurately extracts schemas from NoSQL applications
Identifies refactoring opportunities such as join elimination
Validates the approach with an experiment on a MongoDB application
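To make the idea of a join-like pattern concrete, here is a minimal sketch, not the authors' implementation. It assumes a hypothetical Python application that uses pymongo-style calls of the form `db.<collection>.find_one(...)`, and it flags a query whose filter reads a field from the result of an earlier query on another collection. The names `SAMPLE`, `find_join_candidates`, and `_query_collection` are illustrative. Unlike the control-flow-based analysis the paper describes, this sketch only scans top-level statements in order, and it assumes Python 3.9 or later for the shape of `ast.Subscript.slice`.

```python
import ast

# Hypothetical application code: the second query uses a field of the first result.
SAMPLE = '''
order = db.orders.find_one({"_id": order_id})
customer = db.customers.find_one({"_id": order["customer_id"]})
'''


def _query_collection(call):
    """Return the collection name if `call` looks like db.<coll>.find/find_one(...)."""
    if (isinstance(call, ast.Call)
            and isinstance(call.func, ast.Attribute)
            and call.func.attr in {"find", "find_one"}
            and isinstance(call.func.value, ast.Attribute)):
        return call.func.value.attr
    return None


def find_join_candidates(source):
    """Return (source_collection, field, target_collection) triples."""
    origin = {}  # variable name -> collection it was read from
    joins = []
    for stmt in ast.parse(source).body:
        # Only handle simple assignments such as `x = db.coll.find_one(...)`.
        if not (isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            continue
        collection = _query_collection(stmt.value)
        if collection is None:
            continue
        # Look inside the query arguments for `var["field"]`, where `var`
        # holds the result of an earlier query.
        for arg in stmt.value.args:
            for node in ast.walk(arg):
                if (isinstance(node, ast.Subscript)
                        and isinstance(node.value, ast.Name)
                        and node.value.id in origin
                        and isinstance(node.slice, ast.Constant)):
                    joins.append((origin[node.value.id], node.slice.value, collection))
        origin[stmt.targets[0].id] = collection
    return joins


print(find_join_candidates(SAMPLE))
```

Each reported triple is a candidate for join elimination, for example by duplicating the referenced data into the referencing document. Whether that trade-off is worthwhile depends on update frequency, which this sketch does not consider.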
Abstract
In this paper, we present a static code analysis strategy to extract logical schemas from NoSQL applications. Our solution is based on a model-driven reverse engineering process composed of a chain of platform-independent model transformations. The extracted schema conforms to the U-Schema unified metamodel, which can represent both NoSQL and relational schemas. To support this process, we define a metamodel capable of representing the core elements of object-oriented languages. Application code is first injected into a code model, from which a control flow model is derived. This, in turn, enables the generation of a model representing both data access operations and the structure of stored data. From these models, the U-Schema logical schema is inferred. Additionally, the extracted information can be used to identify refactoring opportunities. We illustrate this capability through the…
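As a rough illustration of static schema extraction, the sketch below walks the syntax tree of a hypothetical Python application and infers field names and value types per collection from literal documents passed to pymongo-style `insert_one` calls. It is a toy stand-in for the idea only: it skips the code model, control flow model, and U-Schema metamodel described in the abstract, and the names `SAMPLE_APP`, `infer_schema`, and `type_name` are assumptions made for this example.

```python
import ast
from collections import defaultdict

# Hypothetical application code to analyze (never executed, only parsed).
SAMPLE_APP = '''
db.users.insert_one({"name": "Ada", "age": 36, "tags": ["admin"]})
db.users.insert_one({"name": "Bob", "email": "bob@example.org"})
db.orders.insert_one({"user_id": 1, "total": 9.5})
'''


def type_name(node):
    """Map a literal AST node to a coarse type label."""
    if isinstance(node, ast.Constant):
        return type(node.value).__name__
    if isinstance(node, ast.List):
        return "list"
    if isinstance(node, ast.Dict):
        return "dict"
    return "unknown"  # non-literal expressions are not resolved here


def infer_schema(source):
    """Return {collection: {field: [type labels]}} from insert_one literals."""
    schema = defaultdict(lambda: defaultdict(set))
    for node in ast.walk(ast.parse(source)):
        # Match calls of the form <something>.<collection>.insert_one({...}).
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr != "insert_one" or not node.args:
            continue
        target, doc = node.func.value, node.args[0]
        if not (isinstance(target, ast.Attribute) and isinstance(doc, ast.Dict)):
            continue
        for key, value in zip(doc.keys, doc.values):
            # Keys that are not string literals (e.g. **spread) are skipped.
            if isinstance(key, ast.Constant) and isinstance(key.value, str):
                schema[target.attr][key.value].add(type_name(value))
    return {coll: {field: sorted(types) for field, types in fields.items()}
            for coll, fields in schema.items()}


print(infer_schema(SAMPLE_APP))
```

Fields that appear in only some documents of a collection, such as `age` and `email` above, correspond to the structural variation that a unified schema model needs to represent.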
Taxonomy
Topics: Cloud Computing and Resource Management · Service-Oriented Architecture and Web Services · Advanced Database Systems and Queries
