# Scaling Distributed Transaction Processing and Recovery based on Dependency Logging

**Authors:** Chang Yao, Meihui Zhang, Qian Lin, Beng Chin Ooi, Jiatao Xu

arXiv: 1703.02722 · 2017-03-09

## TL;DR

This paper treats distributed transaction management and recovery as a single problem: one dependency graph drives both concurrency control (DistDGCC) and logging (Dependency Logging) in multi-node in-memory systems, which the authors report improves runtime performance and speeds up recovery.

## Contribution

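The paper proposes Distributed Dependency Graph based Concurrency Control (DistDGCC), a protocol for transactions that span multiple nodes, and Dependency Logging, a logging protocol that reuses the same dependency graphs so that transaction processing and recovery share one data structure.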
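
To make the idea of dependency-graph-based concurrency control concrete, the following is a minimal single-node sketch, not the paper's DistDGCC protocol. It assumes each transaction declares its read and write sets up front; the function names `build_dependency_graph` and `execution_waves` and the tuple layout are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_dependency_graph(batch):
    """Build conflict dependencies for a batch of transactions.

    batch: list of (txn_id, read_set, write_set) in arrival order
           (hypothetical layout, not taken from the paper).
    Returns a dict mapping txn_id -> set of earlier txn_ids it must wait for.
    """
    last_writer = {}                         # key -> most recent writer
    readers_since_write = defaultdict(list)  # key -> readers since that write
    deps = {}
    for txn_id, reads, writes in batch:
        waits_for = set()
        for key in reads:
            if key in last_writer:
                waits_for.add(last_writer[key])        # read-after-write
        for key in writes:
            if key in last_writer:
                waits_for.add(last_writer[key])        # write-after-write
            waits_for.update(readers_since_write[key])  # write-after-read
        deps[txn_id] = waits_for
        for key in reads:
            if key not in writes:
                readers_since_write[key].append(txn_id)
        for key in writes:
            last_writer[key] = txn_id
            readers_since_write[key] = []
    return deps


def execution_waves(deps):
    """Group transactions into waves; a wave only depends on earlier waves."""
    level = {}
    # deps preserves arrival order, so predecessors are leveled first.
    for txn_id, waits_for in deps.items():
        level[txn_id] = 1 + max((level[p] for p in waits_for), default=-1)
    waves = defaultdict(list)
    for txn_id, lvl in level.items():
        waves[lvl].append(txn_id)
    return [waves[lvl] for lvl in sorted(waves)]


batch = [
    ("T1", {"x"}, {"x"}),
    ("T2", {"x"}, {"y"}),
    ("T3", {"z"}, {"z"}),
    ("T4", {"y"}, {"x"}),
]
deps = build_dependency_graph(batch)
waves = execution_waves(deps)
```

Transactions in the same wave have no conflicts with each other, so they could be handed to different worker threads without locking. Batching, as described in the abstract, is what makes it possible to build such a graph before execution starts.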

## Key findings

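- DistDGCC processes transactions in batches, which lowers the average cost of each distributed transaction and reduces thread blocking.
- Dependency logging reuses DistDGCC's dependency graph to cut logging overhead and uses the logged dependency information to parallelize recovery; the authors report fast recovery with marginal runtime overhead.
- In the authors' experiments against state-of-the-art techniques, DistDGCC is reported to be efficient and scalable, and overall system performance improves.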

## Abstract

The DGCC protocol has been shown to achieve good performance on multi-core in-memory systems. However, distributed transactions complicate dependency resolution, and an effective transaction partitioning strategy is therefore essential to reduce expensive multi-node distributed transactions. During failure recovery, the log must be examined from the last checkpoint onwards, and the affected transactions are re-executed based on the way they were partitioned and executed. Existing approaches treat transaction management and recovery as two separate problems, even though recovery depends on the sequence in which transactions are executed.

In this paper, we propose to treat the transaction management and recovery problems as one. We first propose an efficient Distributed Dependency Graph based Concurrency Control (DistDGCC) protocol for handling transactions spanning multiple nodes, and then propose a novel and efficient logging protocol called Dependency Logging that also makes use of dependency graphs for efficient logging and recovery. DistDGCC optimizes the average cost of each distributed transaction by processing transactions in batches. Moreover, it reduces the effects of thread blocking caused by distributed transactions and consequently improves runtime performance. Further, dependency logging exploits the same data structure that is used by DistDGCC to reduce the logging overhead, and uses the logical dependency information to improve recovery parallelism. Extensive experiments are conducted to evaluate the performance of our proposed technique against state-of-the-art techniques. Experimental results show that DistDGCC is efficient and scalable, and that dependency logging supports fast recovery with marginal runtime overhead. Hence, the overall system performance is significantly improved.
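
The recovery idea in the abstract, replaying logged transactions in an order driven by their dependencies rather than by log position, can be sketched as follows. This is an illustrative toy under stated assumptions, not the paper's Dependency Logging format: the record layout `(txn_id, predecessor_ids, key, update_fn)` and the function `recover` are hypothetical, every predecessor is assumed to appear in the log, and records without a dependency between them are assumed to touch different keys.

```python
from concurrent.futures import ThreadPoolExecutor


def recover(checkpoint, log, workers=4):
    """Replay a dependency-annotated log on top of a checkpoint.

    checkpoint: dict of key -> value as of the last checkpoint.
    log: iterable of (txn_id, predecessor_ids, key, update_fn) records
         (hypothetical layout); update_fn maps the old value to the new one.
    """
    state = dict(checkpoint)
    done = set()
    pending = list(log)

    def replay(record):
        _, _, key, update_fn = record
        state[key] = update_fn(state.get(key))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            # A record is ready once all of its predecessors are replayed.
            ready = [r for r in pending if set(r[1]) <= done]
            if not ready:
                raise ValueError("cyclic or dangling dependency in log")
            # Ready records are mutually independent, so replay them together.
            list(pool.map(replay, ready))
            done.update(r[0] for r in ready)
            pending = [r for r in pending if r[0] not in done]
    return state


checkpoint = {"x": 1, "y": 10}
# Records are deliberately out of order: dependencies, not position, decide.
log = [
    ("T2", ("T1",), "x", lambda v: v + 3),
    ("T3", (), "y", lambda v: v - 4),
    ("T1", (), "x", lambda v: v * 2),
]
recovered = recover(checkpoint, log)
```

Here `T1` and `T3` are replayed in the same round because neither waits on anything, while `T2` waits for `T1`. A sequential replay in log order would instead apply `T2` before `T1` and produce a different value for `x`.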

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.02722/full.md

## Figures

55 figures with captions in the complete paper: https://tomesphere.com/paper/1703.02722/full.md

## References

51 references — full list in the complete paper: https://tomesphere.com/paper/1703.02722/full.md

---
Source: https://tomesphere.com/paper/1703.02722