Towards Observation Lakehouses: Living, Interactive Archives of Software Behavior
Marcus Kessel

TL;DR
This paper introduces observation lakehouses, a scalable infrastructure for storing, analyzing, and evolving runtime behavior data of software, enabling behavior-aware evaluation and training of code-generating models.
Contribution
It presents a novel architecture combining continual Stimulus-Response Cubes with lakehouse technology for scalable, interactive, and persistent behavioral data analysis.
Findings
Ingests 8.6 million observations efficiently on a laptop.
Reconstructs behavior views and clusters in under 100 ms.
Demonstrates practical behavior mining without distributed systems.
Abstract
Code-generating LLMs are trained largely on static artifacts (source, comments, specifications) and rarely on materializations of run-time behavior. As a result, they readily internalize buggy or mislabeled code. Since non-trivial semantic properties are undecidable in general, the only practical way to obtain ground-truth functionality is by dynamic observation of executions. In prior work, we addressed representation with Sequence Sheets, Stimulus-Response Matrices (SRMs), and Stimulus-Response Cubes (SRCs) to capture and compare behavior across tests, implementations, and contexts. These structures make observation data analyzable offline and reusable, but they do not by themselves provide persistence, evolution, or interactive analytics at scale. In this paper, therefore, we introduce observation lakehouses that operationalize continual SRCs: a tall, append-only observations table…
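To make the idea of a tall, append-only observations table more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the column names (`implementation`, `test`, `response`), the helper functions, and the "latest observation wins" rule are all assumptions chosen for illustration, and an in-memory list stands in for the lakehouse storage. The sketch shows how a matrix-like view (tests by implementations) and behavioral clusters could be rebuilt on demand from appended rows.

```python
from collections import defaultdict
from typing import NamedTuple


class Observation(NamedTuple):
    # Hypothetical columns; the paper's actual schema is not given in this summary.
    implementation: str
    test: str
    response: str


# Append-only store: rows are only ever added, never updated in place.
observations: list[Observation] = []


def ingest(batch: list[Observation]) -> None:
    """Append a batch of observations to the tall table."""
    observations.extend(batch)


def srm_view(obs: list[Observation]) -> dict[str, dict[str, str]]:
    """Pivot the tall table into {test: {implementation: response}}.

    Assumption: if a (test, implementation) pair was observed several
    times, the most recently appended row wins.
    """
    view: dict[str, dict[str, str]] = defaultdict(dict)
    for o in obs:
        view[o.test][o.implementation] = o.response
    return dict(view)


def behavior_clusters(obs: list[Observation]) -> list[list[str]]:
    """Group implementations whose responses agree on every test."""
    view = srm_view(obs)
    tests = sorted(view)
    impls = sorted({o.implementation for o in obs})
    clusters: dict[tuple, list[str]] = defaultdict(list)
    for impl in impls:
        # The behavioral signature is the tuple of responses across all tests.
        signature = tuple(view[t].get(impl) for t in tests)
        clusters[signature].append(impl)
    return sorted(clusters.values())


ingest([
    Observation("impl_a", "t1", "3"),
    Observation("impl_a", "t2", "0"),
    Observation("impl_b", "t1", "3"),
    Observation("impl_b", "t2", "0"),
    Observation("impl_c", "t1", "3"),
    Observation("impl_c", "t2", "-1"),
])

clusters = behavior_clusters(observations)
```

Because views and clusters are derived from the appended rows rather than stored, new tests or implementations can be added later without rewriting earlier observations.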
Taxonomy
Topics: Scientific Computing and Data Management · Environmental Monitoring and Data Management · Data Analysis with R
