Execution replay and debugging

Michiel Ronsse; Koen De Bosschere; Jacques Chassin de Kergommeaux

arXiv:cs/0011006 · cs.SE · May 23, 2007 · AADEBUG · 37 cites

Open Access

TL;DR

This paper surveys execution replay techniques essential for debugging non-deterministic parallel and distributed programs, highlighting methods to record and replay execution flows efficiently.

Contribution

It provides a comprehensive overview of existing execution replay techniques and tools, emphasizing their approaches to balancing recording detail and performance.

Findings

01

Various execution replay methods are compared and categorized.

02

Trade-offs between recording detail and performance are analyzed.

03

The survey identifies gaps and future directions in execution replay technology.

Abstract

As most parallel and distributed programs are internally non-deterministic -- consecutive runs with the same input might result in a different program flow -- vanilla cyclic debugging techniques as such are useless. In order to use cyclic debugging tools, we need a tool that records information about an execution so that it can be replayed for debugging. Because recording information interferes with the execution, we must limit the amount of information and keep the processing of the information fast. This paper contains a survey of existing execution replay techniques and tools.
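The record-then-replay idea described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper or from any surveyed tool; it is a minimal illustration under simplifying assumptions. Two cooperative "threads" (Python generators) race on a shared counter, and the hypothetical `run` function either records the scheduler's choices or replays a previously recorded list of choices. All names (`worker`, `run`, `schedule`, `trace`) are invented for this example.

```python
import random


def worker(name, shared, steps):
    """A cooperative 'thread' that increments a shared counter non-atomically."""
    for _ in range(steps):
        value = shared["counter"]        # read the shared value
        yield                            # another worker may be scheduled here
        shared["counter"] = value + 1    # write back (may overwrite other updates)
        shared["log"].append(name)
        yield


def run(schedule=None, seed=None):
    """Run two workers under a toy scheduler.

    Record mode (schedule is None): pick the next worker at random and
    record each choice. Replay mode: follow the recorded choices exactly.
    Returns (final counter, order of completed writes, scheduling trace).
    """
    rng = random.Random(seed)
    shared = {"counter": 0, "log": []}
    workers = {"A": worker("A", shared, 3), "B": worker("B", shared, 3)}
    replay = iter(schedule) if schedule is not None else None
    trace = []
    while workers:
        if replay is not None:
            name = next(replay)                 # replay: forced choice
        else:
            name = rng.choice(sorted(workers))  # record: non-deterministic choice
        trace.append(name)
        try:
            next(workers[name])                 # run the worker to its next yield
        except StopIteration:
            del workers[name]                   # worker finished
    return shared["counter"], shared["log"], trace


if __name__ == "__main__":
    counter, log, trace = run()                 # recorded run: result varies
    replayed = run(schedule=trace)              # replayed run: same result
    print(counter, replayed[0], replayed[2] == trace)
```

Only the order of scheduling decisions is recorded here, not the values read or written, which reflects the abstract's point that the recorded information should be kept small. In this toy setting the order alone is enough to reproduce the run, because everything else in the program is deterministic.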

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Distributed systems and fault tolerance