Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
Tianyuan Wu, Chaokun Chang, Lunxi Cao, Wei Gao, Wei Wang

TL;DR
Crab is a semantics-aware runtime that improves checkpoint/restore efficiency and correctness for agent sandboxes by bridging the semantic gap between agent frameworks and OS effects using eBPF.
Contribution
Crab introduces a host-side runtime that classifies OS effects and optimizes checkpointing without modifying agents or backends, enhancing fault tolerance and efficiency.
Findings
Recovery correctness improved from 8% to 100%.
Checkpoint traffic reduced by up to 87%.
Execution time overhead remains within 1.9%.
Abstract
Autonomous agents act through sandboxed containers and microVMs whose state spans filesystems, processes, and runtime artifacts. Checkpoint and restore (C/R) of this state is needed for fault tolerance, spot execution, RL rollout branching, and safe rollback. Yet existing approaches fall into two extremes: application-level recovery preserves chat history but misses OS-side effects, while full per-turn checkpointing is correct but too expensive under dense co-location. The root cause is an agent-OS semantic gap: agent frameworks see tool calls but not their OS effects; the OS sees state changes but lacks turn-level context to judge recovery relevance. This gap hides massive sparsity: over 75% of agent turns produce no recovery-relevant state, so most checkpoints are unnecessary. Crab (Checkpoint-and-Restore for Agent SandBoxes) is a transparent host-side runtime that bridges this gap…
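The sparsity argument in the abstract can be illustrated with a minimal sketch: if each turn's observed OS effects are classified as recovery-relevant or not, a checkpoint is only needed for turns with at least one relevant effect. The code below is an illustration under stated assumptions, not Crab's implementation. The `Effect` categories, the `RECOVERY_RELEVANT` policy, and the `Turn` type are all hypothetical names introduced here; the abstract does not specify how Crab classifies effects.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum, auto


class Effect(Enum):
    """Hypothetical OS-effect categories observed during one agent turn."""
    FILE_READ = auto()       # read-only access; leaves no state behind
    TEMP_WRITE = auto()      # scratch output assumed not to matter after the turn
    FILE_WRITE = auto()      # persistent filesystem change
    PROCESS_SPAWN = auto()   # process that outlives the turn


# Assumed policy: only these effects change state that a restore must reproduce.
RECOVERY_RELEVANT = {Effect.FILE_WRITE, Effect.PROCESS_SPAWN}


@dataclass
class Turn:
    turn_id: int
    effects: list[Effect] = field(default_factory=list)


def needs_checkpoint(turn: Turn) -> bool:
    """A turn needs a checkpoint only if it produced a recovery-relevant effect."""
    return any(effect in RECOVERY_RELEVANT for effect in turn.effects)


def plan_checkpoints(turns: list[Turn]) -> list[int]:
    """Return the ids of turns after which a checkpoint would be taken."""
    return [turn.turn_id for turn in turns if needs_checkpoint(turn)]


if __name__ == "__main__":
    trace = [
        Turn(0, [Effect.FILE_READ]),
        Turn(1, [Effect.FILE_WRITE]),
        Turn(2, []),
        Turn(3, [Effect.TEMP_WRITE]),
    ]
    planned = plan_checkpoints(trace)
    skipped = len(trace) - len(planned)
    print(f"checkpoint after turns {planned}; skipped {skipped} of {len(trace)}")
```

In this toy trace, three of four turns are skipped. Per-turn checkpointing would instead snapshot all four, which is the cost the abstract describes as too expensive under dense co-location.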
