Supporting and Controlling Complex Concurrency in Fault-Tolerant Distributed Systems
Jie Xu, Brian Randell, Alexander Romanovsky, Robert J. Stroud, and Avelino F. Zorzo

TL;DR
This paper analyzes and enhances models for managing complex concurrency in fault-tolerant distributed systems. It addresses both cooperation and competition among activities and proposes solutions for consistent access to shared resources.
Contribution
It introduces and evaluates the CA actions model for supporting both cooperative and competitive concurrency in fault-tolerant distributed systems.
Findings
Identifies key problems in how concurrent activities access shared objects
Proposes solutions for consistent access in joint and independent actions
Clarifies issues in the CA actions model for fault tolerance
Abstract
Distributed computing often gives rise to complex concurrent and interacting activities. In some cases several concurrent activities may be working together, i.e. cooperating, to solve a given problem; in other cases, the activities may be independent but need to share common system resources for which they must compete. Many difficulties and limitations occur in the widely advocated objects and (trans)actions model when it is supposed to support cooperating activities. We previously introduced the concept of coordinated atomic (CA) actions [Xu et al. 1995]; this paper analyzes and examines the derived objects and CA actions model for constructing fault-tolerant distributed systems and providing unified support for both cooperative and competitive concurrency. Our investigation reveals and clarifies several significant problems that have not previously been studied extensively,…
Taxonomy
Topics: Distributed systems and fault tolerance · Cloud Computing and Resource Management · Distributed and Parallel Computing Systems
