Asynchronous Snapshots of Actor Systems for Latency-Sensitive Applications
Dominik Aumayr, Stefan Marr, Elisa Gonzalez Boix, Hanspeter Mössenböck

TL;DR
This paper introduces a novel asynchronous snapshotting method for actor systems that minimizes latency impact by capturing partial states without stopping the application, supporting arbitrary object graphs, and enabling better deployment and debugging.
Contribution
It presents the first system for asynchronous, non-blocking snapshotting of actor applications, improving latency and operational flexibility.
Findings
Snapshotting increases slow requests by 0.007% at a 100 ms latency threshold.
The approach supports arbitrary object graphs and partial heap captures.
Snapshotting overhead depends on the application's utilization pattern.
Abstract
The actor model is popular for many types of server applications. Efficient snapshotting of applications is crucial in the deployment of pre-initialized applications or moving running applications to different machines, e.g., for debugging purposes. A key issue is that snapshotting blocks all other operations. In modern latency-sensitive applications, stopping the application to persist its state needs to be avoided, because users may not tolerate the increased request latency. In order to minimize the impact of snapshotting on request latency, our approach persists the application's state asynchronously by capturing partial heaps, completing snapshots step by step. Additionally, our solution is transparent and supports arbitrary object graphs. We prototyped our snapshotting approach on top of the Truffle/Graal platform and evaluated it with the Savina benchmarks and the Acme Air…
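The idea of completing a snapshot step by step from partial heaps can be illustrated with a toy model. The sketch below is not the paper's implementation and makes several assumptions: the names `Actor`, `System`, `request_snapshot`, `step`, and `is_complete` are hypothetical, each actor's "heap" is a plain dictionary, and a global version counter tells actors that a snapshot is pending. Each actor copies its own state the next time it is scheduled, between two messages, so no global pause is needed. Pending messages in mailboxes are ignored here, and an idle actor would never contribute its part; a real system has to handle both.

```python
import copy
from collections import deque


class Actor:
    """Toy actor: private state plus a mailbox of pending messages."""

    def __init__(self, name, state):
        self.name = name
        self.state = state
        self.mailbox = deque()
        self.snapshot_version = 0  # last snapshot this actor contributed to


class System:
    """Hypothetical actor system with a version-based snapshot trigger."""

    def __init__(self, actors):
        self.actors = {a.name: a for a in actors}
        self.snapshot_version = 0
        self.snapshots = {}  # version -> {actor name: captured state}

    def request_snapshot(self):
        """Start a snapshot without pausing any actor."""
        self.snapshot_version += 1
        self.snapshots[self.snapshot_version] = {}
        return self.snapshot_version

    def step(self, name):
        """Let one actor process one message (one scheduler turn)."""
        actor = self.actors[name]
        if not actor.mailbox:
            return False
        # Between two messages the actor's own state is consistent, so it
        # records its part of the heap before handling the next message.
        if actor.snapshot_version < self.snapshot_version:
            partial = self.snapshots[self.snapshot_version]
            partial[actor.name] = copy.deepcopy(actor.state)
            actor.snapshot_version = self.snapshot_version
        # Assumption: every message is an integer added to a counter.
        actor.state["count"] += actor.mailbox.popleft()
        return True

    def is_complete(self, version):
        """A snapshot is complete once every actor has contributed."""
        return len(self.snapshots[version]) == len(self.actors)


a, b = Actor("a", {"count": 0}), Actor("b", {"count": 0})
system = System([a, b])
a.mailbox.extend([1, 2])
b.mailbox.append(10)

system.step("a")                # a handles its first message
v = system.request_snapshot()   # nothing is stopped here
system.step("a")                # a captures its state, then continues
print(system.is_complete(v))    # b has not contributed yet
system.step("b")                # b captures its state, then continues
print(system.is_complete(v))
```

In this toy, the cost of a snapshot is spread over the actors' normal scheduling turns instead of being paid in one blocking pause, which is the property the abstract ties to request latency.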
