Towards a Holistic Integration of Spreadsheets with Databases: A Scalable Storage Engine for Presentational Data Management
Mangesh Bendre, Vipul Venkataraman, Xinyan Zhou, Kevin Chang, Aditya Parameswaran

TL;DR
This paper introduces DataSpread, a storage engine that integrates spreadsheets with databases, enhancing scalability and interactivity by representing spreadsheet data efficiently and supporting fast positional access.
Contribution
It develops a flexible storage mechanism for spreadsheet data within databases, addressing the NP-hardness of optimal representation and providing efficient positional access methods.
Findings
Up to 20% reduction in storage requirements.
Up to 50% faster formula evaluation.
Constant-time access and modification performance.
Abstract
Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread, to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets, and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make a first step towards this vision: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine…
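The question of how to "support and maintain access by position" can be illustrated with a minimal Python sketch. This is an assumption made for illustration, not DataSpread's actual design: the class `PositionalSheet` and its methods are hypothetical names. The idea shown is to store each row under a stable ID and keep a separate position-to-ID mapping, so that inserting a row in the middle of the sheet does not rewrite the rows below it.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PositionalSheet:
    """Hypothetical sketch: rows live under stable IDs, positions live in a separate list."""
    rows: dict = field(default_factory=dict)    # stable row ID -> list of cell values
    order: list = field(default_factory=list)   # list index = row position, value = row ID
    _ids: count = field(default_factory=count)  # generator of fresh row IDs (0, 1, 2, ...)

    def insert_row(self, position, values):
        # Only the position mapping shifts; stored rows keep their IDs.
        row_id = next(self._ids)
        self.rows[row_id] = list(values)
        self.order.insert(position, row_id)
        return row_id

    def delete_row(self, position):
        row_id = self.order.pop(position)
        del self.rows[row_id]

    def get_row(self, position):
        # Translate position -> stable ID, then fetch the stored row.
        return self.rows[self.order[position]]


sheet = PositionalSheet()
sheet.insert_row(0, ["a", 1])
sheet.insert_row(1, ["c", 3])
sheet.insert_row(1, ["b", 2])  # lands between the two existing rows
```

In this sketch the mapping is a plain Python list, so an insert still shifts list entries in linear time. A counted tree or similar order-statistic structure would be the usual replacement if faster positional updates are needed; whether that matches the paper's approach is not stated in the text above.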
