PolyFrame: A Retargetable Query-based Approach to Scaling DataFrames (Extended Version)
Phanwadee Sinthong, Michael J. Carey

TL;DR
PolyFrame extends the AFrame data analysis library to support multiple query-based database systems, enabling scalable data analysis across diverse data management platforms with a flexible, retargetable query formation approach.
Contribution
PolyFrame introduces a design that lets AFrame's query formation be retargeted to different database systems, so the same DataFrame-style operations can run on backends other than AsterixDB.
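To make the idea of retargetable query formation concrete, here is a minimal sketch, not PolyFrame's actual implementation or API. It assumes each backend is described by a small table of query templates and that DataFrame operations compose queries by nesting the previous query as a subquery. The names `LazyFrame`, `SQLPP_RULES`, `SQL_RULES`, and the template keys are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical rewrite rules: each backend maps an abstract dataframe
# operation to a query template written in that system's language.
SQLPP_RULES = {  # SQL++-style templates (e.g. for AsterixDB)
    "scan": "SELECT VALUE t FROM {source} t",
    "filter": "SELECT VALUE t FROM ({sub}) t WHERE t.{column} {op} {value}",
    "limit": "SELECT VALUE t FROM ({sub}) t LIMIT {n}",
}
SQL_RULES = {  # plain SQL templates (e.g. for a relational system)
    "scan": "SELECT * FROM {source}",
    "filter": "SELECT * FROM ({sub}) t WHERE t.{column} {op} {value}",
    "limit": "SELECT * FROM ({sub}) t LIMIT {n}",
}


@dataclass(frozen=True)
class LazyFrame:
    """Illustrative lazy frame: holds a query string, never the data."""

    query: str
    rules: dict

    @classmethod
    def from_source(cls, source, rules):
        # Start from a full scan of the named dataset or table.
        return cls(rules["scan"].format(source=source), rules)

    def filter(self, column, op, value):
        # Wrap the current query as a subquery and add a predicate.
        # repr() is used for brevity; real code would need proper escaping.
        q = self.rules["filter"].format(
            sub=self.query, column=column, op=op, value=repr(value)
        )
        return LazyFrame(q, self.rules)

    def head(self, n=5):
        # Only here would a real system send the query for execution;
        # this sketch just returns the composed query.
        q = self.rules["limit"].format(sub=self.query, n=n)
        return LazyFrame(q, self.rules)
```

Under these assumptions, `LazyFrame.from_source("users", SQL_RULES).filter("age", ">", 30).head(2).query` yields a nested SQL string, and swapping in `SQLPP_RULES` retargets the same chain of operations to SQL++ without changing the user-facing code.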
Findings
Supports multiple database systems beyond AsterixDB
Enables scalable data analysis with flexible query formation
Improves data analyst productivity in large-scale data environments
Abstract
In the last few years, the field of data science has been growing rapidly as various businesses have adopted statistical and machine learning techniques to empower their decision making and applications. Scaling data analysis, possibly including the application of custom machine learning models, to large volumes of data requires the utilization of distributed frameworks. This can lead to serious technical challenges for data analysts and reduce their productivity. AFrame, a Python data analytics library, is implemented as a layer on top of Apache AsterixDB, addressing these issues by incorporating the data scientists' development environment and transparently scaling out the evaluation of analytical operations through a Big Data management system. While AFrame is able to leverage data management facilities (e.g., indexes and query optimization) and allows users to interact with a very…
Taxonomy
Topics: Advanced Database Systems and Queries · Scientific Computing and Data Management · Data Management and Algorithms
