GenIE - Simulator-Driven Iterative Data Exploration for Scientific Discovery
Ashwin Gerard Colaco, Martin Boissier, Sriram Rao, Shubharoop Ghosh, Sharad Mehrotra, Tilmann Rabl

TL;DR
GenIE is a database system extension that integrates multiple physics-based simulators to make scientific data exploration dynamic, interactive, and iterative, avoiding the latency of the traditional linear workflow of running a simulation, exporting the results, and loading them into a database.
Contribution
This work introduces GenIE, a simulation-aware database extension for PostgreSQL that orchestrates and reuses simulation data to facilitate interactive scientific analysis.
Findings
Enables real-time, iterative analysis with multiple simulators.
Reduces redundant simulations by reusing existing data.
Transforms slow analyses into interactive explorations.
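The reuse idea in the findings above can be illustrated with a minimal sketch. It is not GenIE's implementation: the class `SimulationStore`, its methods, and the toy simulator `toy_wildfire` are hypothetical names invented here, and an in-memory dictionary stands in for the database. The sketch only shows the general pattern of invoking a simulator when a query needs results that are not yet stored, and reusing stored results otherwise.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical sketch: none of these names come from GenIE itself.
ParamKey = Tuple[Tuple[str, Any], ...]


class SimulationStore:
    """Keeps simulator outputs keyed by (simulator name, parameters)."""

    def __init__(self) -> None:
        self._simulators: Dict[str, Callable[..., Any]] = {}
        self._results: Dict[Tuple[str, ParamKey], Any] = {}
        self.invocations = 0  # number of actual simulator runs

    def register(self, name: str, simulator: Callable[..., Any]) -> None:
        """Make a simulator callable by name."""
        self._simulators[name] = simulator

    def query(self, name: str, **params: Any) -> Any:
        """Return stored output if present; otherwise run the simulator."""
        key = (name, tuple(sorted(params.items())))
        if key not in self._results:
            self.invocations += 1
            self._results[key] = self._simulators[name](**params)
        return self._results[key]


def toy_wildfire(wind_speed: float, hours: int) -> float:
    """Stand-in for a physics-based simulator (arbitrary units)."""
    return wind_speed * hours * 0.5


store = SimulationStore()
store.register("wildfire", toy_wildfire)

first = store.query("wildfire", wind_speed=10.0, hours=4)   # runs the simulator
second = store.query("wildfire", wind_speed=10.0, hours=4)  # reuses stored output
third = store.query("wildfire", wind_speed=12.0, hours=4)   # new parameters, new run
```

Here the second query triggers no simulator run because its parameters match a stored result. A real system would also need to decide when results for nearby parameters can stand in for one another, which this sketch does not attempt.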
Abstract
Physics-based simulators play a critical role in scientific discovery and risk assessment, enabling what-if analyses for events like wildfires and hurricanes. Today, databases treat these simulators as external pre-processing steps. Analysts must manually run a simulation, export the results, and load them into a database before analysis can begin. This linear workflow is inefficient, incurs high latency, and hinders interactive exploration, especially when the analysis itself dictates the need for new or refined simulation data. We envision a new database paradigm, entitled GenIE, that seamlessly integrates multiple simulators into databases to enable dynamic orchestration of simulation workflows. By making the database "simulation-aware," GenIE can dynamically invoke simulators with appropriate parameters based on the user's query and analytical needs. This tight integration allows…
Taxonomy
Topics: Scientific Computing and Data Management · Data Visualization and Analytics · Distributed and Parallel Computing Systems
