Eudoxia: a FaaS scheduling simulator for the composable lakehouse
Tapan Srivastava, Jacopo Tagliabue, Ciro Greco

TL;DR
Eudoxia is a deterministic simulator designed to evaluate and optimize scheduling algorithms for FaaS-based data lakehouse systems, facilitating experimentation and development in this emerging area.
Contribution
The paper introduces Eudoxia, a flexible and customizable simulator for scheduling data workloads in FaaS-enabled composable data lakehouses, addressing challenges in workload optimization.
Findings
Eudoxia can simulate a wide range of data workloads.
Users can plug in and test their own implementations of scheduling algorithms.
It provides a cost-effective platform for infrastructure evaluation.
Abstract
Due to the variety of its target use cases and the large API surface area to cover, a data lakehouse (DLH) is a natural candidate for a composable data system. Bauplan is a composable DLH built on "spare data parts" and a unified Function-as-a-Service (FaaS) runtime for SQL queries and Python pipelines. While FaaS simplifies both building and using the system, it introduces novel challenges in the scheduling and optimization of data workloads. In this work, starting from the programming model of the composable DLH, we characterize the underlying scheduling problem and motivate simulation as an effective tool for iterating on the DLH. We then introduce and release to the community Eudoxia, a deterministic simulator for scheduling data workloads as cloud functions. We show that Eudoxia can simulate a wide range of workloads and enables highly customizable user implementations of scheduling…
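To make the idea of a deterministic scheduling simulator concrete, here is a minimal sketch in Python. It is not Eudoxia's API: every name in it (`Job`, `make_workload`, `simulate`, `fifo`, `smallest_first`) is hypothetical, and the tick-based loop, single CPU pool, and seeded workload generator are simplifying assumptions chosen only to illustrate how a pluggable scheduler can be evaluated reproducibly.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Job:
    """Hypothetical unit of work (e.g. one function invocation)."""
    job_id: int
    arrival: int   # tick at which the job enters the queue
    duration: int  # ticks of work once started
    cpus: int      # CPUs held while running


# A scheduler sees the waiting queue and the free CPUs, and returns jobs to start.
Scheduler = Callable[[List[Job], int], List[Job]]


def make_workload(seed: int, n_jobs: int) -> List[Job]:
    """Generate a synthetic workload; the same seed always yields the same jobs."""
    rng = random.Random(seed)
    jobs, tick = [], 0
    for i in range(n_jobs):
        tick += rng.randint(0, 3)
        jobs.append(Job(i, tick, rng.randint(1, 5), rng.randint(1, 4)))
    return jobs


def fifo(queue: List[Job], free_cpus: int) -> List[Job]:
    """Start jobs in arrival order; stop at the first one that does not fit."""
    chosen = []
    for job in queue:
        if job.cpus > free_cpus:
            break
        chosen.append(job)
        free_cpus -= job.cpus
    return chosen


def smallest_first(queue: List[Job], free_cpus: int) -> List[Job]:
    """Greedily start the jobs with the smallest CPU demand that still fit."""
    chosen = []
    for job in sorted(queue, key=lambda j: (j.cpus, j.job_id)):
        if job.cpus <= free_cpus:
            chosen.append(job)
            free_cpus -= job.cpus
    return chosen


def simulate(jobs: List[Job], scheduler: Scheduler, total_cpus: int = 4) -> Dict[int, int]:
    """Run the workload tick by tick; return job_id -> finish tick."""
    if any(job.cpus > total_cpus for job in jobs):
        raise ValueError("a job requests more CPUs than the pool has")
    pending = sorted(jobs, key=lambda j: (j.arrival, j.job_id))
    queue: List[Job] = []
    running = []  # list of (finish_tick, job)
    finished: Dict[int, int] = {}
    free, tick = total_cpus, 0
    while pending or queue or running:
        # 1. Release resources held by jobs that have completed.
        still_running = []
        for finish, job in running:
            if finish <= tick:
                free += job.cpus
                finished[job.job_id] = finish
            else:
                still_running.append((finish, job))
        running = still_running
        # 2. Admit newly arrived jobs to the queue.
        while pending and pending[0].arrival <= tick:
            queue.append(pending.pop(0))
        # 3. Let the user-supplied scheduler decide what to start.
        for job in scheduler(list(queue), free):
            queue.remove(job)
            free -= job.cpus
            running.append((tick + job.duration, job))
        tick += 1
    return finished
```

Because the workload comes from a seeded generator and the loop has no other source of randomness, calling `simulate(make_workload(7, 20), fifo)` twice returns identical results, so two policies such as `fifo` and `smallest_first` can be compared on exactly the same input.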
