Coo: Rethink Data Anomalies In Databases
Haixiang Li, Xiaoyan Li, Yuxing Chen, Yuean Zhu, Xiaoyong Du, Wei Lu, Chang Liu, Anqun Pan

TL;DR
This paper introduces Coo, a framework for systematically defining and analyzing data anomalies in databases. It gives data anomalies a mathematical formalization to resolve the ambiguity of the existing ANSI definitions.
Contribution
Coo provides a complete, formalized approach to defining data anomalies, classifies the infinitely many possible anomalies, and introduces new isolation levels along with a quantitative analysis of concurrency control algorithms.
Findings
Existing data anomalies are only a subset of all possible anomalies.
Coo is theoretically complete in formalizing data anomalies.
The paper quantitatively analyzes the concurrency and rollback rates of mainstream concurrency control algorithms.
Abstract
Transaction processing technology has three important components: data anomalies, isolation levels, and concurrency control algorithms. Concurrency control algorithms are used to eliminate some or all data anomalies at different isolation levels to ensure data consistency. Isolation levels in the current ANSI standard are defined by disallowing certain kinds of data anomalies. Yet the definitions of data anomalies in the ANSI standard are controversial. On one hand, the definitions lack a mathematical formalization and cause ambiguous interpretations. On the other hand, the definitions are made in a case-by-case manner, so that even a senior DBA cannot have infallible knowledge of data anomalies, for lack of a full understanding of their nature. While revised definitions in existing literature propose various mathematical formalizations to correct the former…
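To illustrate the abstract's point that ANSI isolation levels are defined by the anomalies they disallow, here is a minimal Python sketch of the well-known ANSI SQL-92 table of three phenomena (dirty read, non-repeatable read, phantom). The names `ANSI_LEVELS`, `ANSI_DISALLOWED`, and `weakest_level_preventing` are hypothetical helpers introduced only for this illustration. The sketch does not reproduce Coo's own anomaly classification or its new isolation levels.

```python
# Sketch only: the four ANSI SQL-92 isolation levels, each described by the
# named phenomena it disallows. All identifiers here are illustrative.
ANSI_LEVELS = [
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SERIALIZABLE",
]

ANSI_DISALLOWED = {
    "READ UNCOMMITTED": set(),
    "READ COMMITTED": {"dirty read"},
    "REPEATABLE READ": {"dirty read", "non-repeatable read"},
    "SERIALIZABLE": {"dirty read", "non-repeatable read", "phantom"},
}


def weakest_level_preventing(anomalies):
    """Return the weakest level whose table entry disallows every given
    anomaly, or None if the three-phenomena table does not cover them."""
    wanted = set(anomalies)
    for level in ANSI_LEVELS:  # ordered from weakest to strongest
        if wanted <= ANSI_DISALLOWED[level]:
            return level
    return None


print(weakest_level_preventing({"non-repeatable read"}))
# "write skew" is not one of the three phenomena named in the table,
# so a lookup against the table alone finds no matching level.
print(weakest_level_preventing({"write skew"}))
```

The second lookup shows the kind of gap a case-by-case definition leaves: an anomaly that the table does not name cannot be placed at any level using the table alone.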
Taxonomy
Topics: Distributed systems and fault tolerance · Advanced Data Storage Technologies · Software System Performance and Reliability
