Corrigendum to "Counting Database Repairs that Satisfy Conjunctive Queries with Self-Joins"
Jef Wijsen

TL;DR
This paper corrects a false lemma in a previous work on counting database repairs satisfying conjunctive queries with self-joins, providing a new proof for the main theorem.
Contribution
It offers a corrected proof for a key theorem in database repair counting, addressing an error in prior literature.
Findings
The original lemma was false, affecting the proof of the main theorem.
A new, valid proof for Theorem 3 is provided.
The main theorem on counting repairs for conjunctive queries with self-joins therefore remains valid.
Abstract
The helping Lemma 7 in [Maslowski and Wijsen, ICDT, 2014] is false. The lemma is used in (and only in) the proof of Theorem 3 of that same paper. In this corrigendum, we provide a new proof for the latter theorem.
University of Mons, Belgium
1 The Flaw
The helping Lemma 7 in [MW14] is false. A counterexample is given next.
Example 1.
For and , , we have , . From [MW14, Lemma 8], it follows that is -hard. From [MW13, Theorem 4], it follows that is in . Consequently, assuming , there exists no polynomial-time many-one reduction from to . Lemma 7 in [MW14] is thus false. ∎
The first part in the proof of Lemma 7 in [MW14] is correct; it shows a polynomial-time many-one reduction from to . However, the second part in that proof is flawed when it claims “We can compute in polynomial time the (unique) database with schema such that .” The flaw is that the database does not generally exist, as shown next. Let and , , as in Example 1. Then, , . A legal input to is , , . However, there exists no database such that . Indeed, for every database with schema , if , then .
2 The Solution
The following treatment is relative to a database schema . Let be non-negative integers such that every relation name in has at most primary-key positions, and at most non-primary-key positions. We define a new function which encodes Boolean conjunctive queries into unirelational Boolean conjunctive queries. For , we use a fresh relation name with primary-key positions, and non-primary-key positions. For every atom in , the query will contain some atom , where is a sequence of padding zeros, and is a sequence of padding fresh variables, all distinct and not occurring elsewhere. This encoding is different from [MW14, Definition 3] where a sequence of padding zeros was used instead of .
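The padding idea behind the new encoding can be sketched in Python. This is only an illustration under assumptions that are not spelled out in the surviving text: atoms are written as `(relation, key_terms, nonkey_terms)` triples, the original relation name is stored as a constant in the first key position of the single relation `N`, the bounds are passed as hypothetical parameters `k` and `l`, and fresh variables are named `_u0`, `_u1`, and so on. The names `Var`, `encode_atom`, and `encode_query` are invented for this sketch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    """A query variable; any other hashable value is treated as a constant."""
    name: str


def encode_atom(rel, key, nonkey, k, l, fresh):
    """Encode the atom rel(key | nonkey) over the single relation name "N".

    Assumed layout: rel becomes a constant in the first key position, key
    positions are padded with zeros up to length k, and non-key positions
    are padded with fresh variables up to length l.
    """
    key_pad = (0,) * (k - len(key))
    nonkey_pad = tuple(Var(f"_u{next(fresh)}") for _ in range(l - len(nonkey)))
    return ("N", (rel, *key, *key_pad), (*nonkey, *nonkey_pad))


def encode_query(query, k, l):
    """Encode every atom, drawing fresh variables from one shared counter."""
    fresh = count()
    return [encode_atom(rel, key, nonkey, k, l, fresh) for rel, key, nonkey in query]


x, y = Var("x"), Var("y")
query = [("R", (x,), (y,)), ("S", (y, "a"), ())]
encoded = encode_query(query, k=2, l=1)
```

Padding the non-key positions with variables that occur only once, rather than with zeros, is the point where this encoding departs from [MW14, Definition 3].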
Example 2.
We illustrate the difference between the old encoding of [MW14, Definition 3] and the newly proposed encoding . For , , we have
[TABLE]
We recall from [MW14, p. 156] that the complex part of a Boolean conjunctive query contains every atom such that some non-primary-key position in contains either a variable with two or more occurrences in or a constant. Note that belongs to the complex part of , while is not in the complex part of . ∎
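The definition of the complex part recalled above can be made concrete with a small Python sketch. The representation (a `Var` class for variables, an `Atom` with separate key and non-key term tuples) and the two one-atom queries are assumptions chosen for illustration; they are not the queries of Example 2.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A query variable; any other value is treated as a constant."""
    name: str


@dataclass(frozen=True)
class Atom:
    rel: str
    key: tuple      # terms at primary-key positions
    nonkey: tuple   # terms at non-primary-key positions


def complex_part(query):
    """Return the atoms with a constant or a repeated variable at a non-key position."""
    # Count how often each variable occurs in the whole query.
    occurrences = Counter(
        t for atom in query for t in atom.key + atom.nonkey if isinstance(t, Var)
    )

    def is_complex(atom):
        return any(
            not isinstance(t, Var) or occurrences[t] >= 2 for t in atom.nonkey
        )

    return [atom for atom in query if is_complex(atom)]


x, y, u = Var("x"), Var("y"), Var("u")
padded_with_zero = [Atom("N", ("R", x), (y, 0))]    # non-key padding is a constant
padded_with_fresh = [Atom("N", ("R", x), (y, u))]   # non-key padding is a fresh variable
```

In this sketch the zero-padded atom lands in the complex part because of the constant, while the atom padded with a fresh variable does not, which mirrors the contrast drawn in the example.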
Definition 1.
We define skBCQ as the class of Boolean conjunctive queries in which all relation names are simple-key. We say that a query is minimal if both
- contains no two distinct atoms , such that and ; and
- there exists no substitution over such that .
We define cxBCQ as the class of unirelational Boolean conjunctive queries whose relation name has signature (for some ) such that for every , the first position of is a constant.
Definition 2.
The intersection graph of a Boolean conjunctive query is an undirected graph whose vertices are the atoms of . There is an undirected edge between any two atoms that have a variable in common.
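As a minimal sketch of Definition 2, the following Python code builds the intersection graph as an adjacency map and tests whether two atoms are connected in it. The atom representation `(relation, terms)`, the `Var` class, and the function names are assumptions made for this illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Var:
    """A query variable; any other value is treated as a constant."""
    name: str


def variables(atom):
    """Variables of an atom given as (relation_name, tuple_of_terms)."""
    _, terms = atom
    return {t for t in terms if isinstance(t, Var)}


def intersection_graph(query):
    """Map each atom index to the indices of atoms sharing a variable with it."""
    adj = {i: set() for i in range(len(query))}
    for i, j in combinations(range(len(query)), 2):
        if variables(query[i]) & variables(query[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected(adj, i, j):
    """True if atoms i and j lie in the same connected component."""
    seen, stack = {i}, [i]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return j in seen


x, y, z, w = Var("x"), Var("y"), Var("z"), Var("w")
query = [("R", (x, y)), ("S", (y, z)), ("T", ("a", w))]
adj = intersection_graph(query)
```

Atoms are identified by their index in the query so that two syntactically identical atoms would still count as separate vertices.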
Lemma 1.
Assume . For every minimal query in skBCQ, if is -hard, then so is .
Proof.
Let be a minimal query in skBCQ such that is -hard. Note that does not need to be unirelational or self-join-free. The query , which is unirelational, is a legal input to the function IsEasy of [MW14, p. 163]. (For uniformity of notation, we will assume that the unirelational query uses relation name .) Since is -hard, the function IsEasy will return on input . This function will repeat, as long as possible, the following step: pick some atom and some variable , with some relation name (treated as a constant) and some constant, and replace all occurrences of with an arbitrary constant. Let be the query that results from these steps. Clearly, for every atom in , either is a constant or is variable-free. Since IsEasy returns on input , it follows that does not satisfy the premise of [MW14, Lemma 5]. Therefore, it must be the case that contains two distinct atoms and that are connected in the intersection graph of such that
- and are relation names (serving as constants), not necessarily distinct;
- and are distinct variables; and
- neither nor is exclusively composed of variables occurring only once in the query. That is, and belong to the complex part of .
For every relation name that appears in , we assume fresh relation names with the same signature as . Using these relation names, we can construct a self-join-free Boolean conjunctive query such that and for every atom in , the query contains some atom . For example, if , , , then we can let , , . It can now be shown that the function IsSafe in [MW14, p. 158] will return on input , and thus is -hard. Indeed, whenever IsEasy picked and some variable , the function IsSafe can execute SE3 on the corresponding -atom of . This eventually leads to a query whose complex part contains two atoms and , , that are connected in the intersection graph, at which point IsSafe will return . In this reasoning, one needs that non-primary-key positions are padded with fresh variables occurring only once, as can be seen from Example 2.
In the remainder of this proof, we show the existence of a polynomial-time many-one reduction from to . We incidentally note that the remaining reasoning, which generalizes the proof of [MW14, Lemma 2], does not require that relation names are simple-key.
Let be a mapping from facts to facts such that for every atom , for every -fact , . Notice that maps -facts to -facts. Here, every couple denotes a constant such that if and only if both and . Moreover, if is a constant, then . Since no two distinct atoms of agree on both their relation name and primary key, it will be the case that for all facts and , if and only if , where denotes “is key-equal-to.”
We extend the function in the natural way to databases that use only relation names from : . Clearly, can be computed in polynomial time in the size of . Let be a set of facts with relation names in . It can be easily seen that and . Let be an arbitrary repair of . It suffices to show that
[TABLE]
For the implication , assume that . We can assume a valuation over such that . Let be the valuation such that for every variable , . By our construction of and , it will be the case that , thus .
For the implication , assume that . We can assume a valuation over such that . Notice that if is a constant in , then it must be the case that . We define as the substitution that maps every variable in to the first coordinate of ; and maps every to the second coordinate of . It is convenient to think of and as references to the Left and the Right coordinates, respectively. Thus, by definition, .
By inspecting the right-hand coordinates of couples in , it can be easily seen that implies . Since the query is minimal, it follows that , i.e., is an automorphism. Since the inverse of an automorphism is an automorphism, is an automorphism as well. Note that will be the identity on constants that appear in . We now define (i.e., is the composed function after the inverse of ), and show that , which implies the desired result that . To this end, let be an arbitrary atom of . It suffices to show , which can be proved as follows. From , it follows . Thus, since is an automorphism,
[TABLE]
Since ,
[TABLE]
Since, for every symbol , and , we obtain
[TABLE]
That is, by our definition of ,
[TABLE]
From this, it is correct to conclude that . This concludes the proof. ∎
Lemma 2.
For every Boolean conjunctive query , there exists a polynomial-time many-one reduction from to .
Proof.
Let be a Boolean conjunctive query. Let be a relation name that occurs in . Let be the set of -atoms of . Then, will contain, for every , some atom , where is a (possibly empty) sequence of distinct fresh variables not occurring elsewhere. For every -fact , we define . Note here that depends on the signatures of and , but not on the -atoms of . The mapping is defined similarly for all relation names that appear in . It can be easily seen that for all facts and whose relation names appear in , if and only if .
If is an instance of , we can assume without loss of generality that every relation name in also appears in . We extend the function to such instances of : . Obviously, can be computed in polynomial time in the size of . It is also obvious that and . It suffices to show that for every repair of ,
[TABLE]
For the implication , assume . We can assume a valuation over such that . Let be the valuation that extends from to such that for every variable that appears in but not in . By the construction of , it will be the case that . Indeed, if contains , then will contain , hence will contain and .
For the implication , assume . We can assume a valuation over such that . It is straightforward to see that . ∎
We now give the new proof for Theorem 3 in [MW14].
Theorem 1 ([MW14, Theorem 3]).
The set exhibits an effective --dichotomy.
New proof.
Let . It can be decided whether can be satisfied by a consistent database. If cannot be satisfied by a consistent database, then for every database , the number of repairs of satisfying is 0. An example is , . Assume next that can be satisfied by a consistent database. Then, we can compute a minimal query such that for every database, the number of repairs satisfying is equal to the number of repairs satisfying . That is, the problems and are identical.
Then, belongs to cxBCQ. By [MW14, Lemma 8], the set exhibits an effective --hard dichotomy. If the problem is in , then is in by Lemma 2; and if is -hard, then is -hard by Lemma 1. Consequently, is in or -hard, and it is decidable which of the two cases applies. ∎
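To make the counting problem itself concrete, here is a brute-force Python sketch that counts the repairs of a small database satisfying a Boolean conjunctive query. It assumes the usual primary-key notion of repair (pick exactly one fact from each block of key-equal facts), represents facts and atoms as `(relation, key, nonkey)` triples, and runs in exponential time. It only illustrates the problem statement and is unrelated to the reductions in the proofs; all names and data are invented for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Var:
    """A query variable; any other value is treated as a constant."""
    name: str


def repairs(db):
    """Yield every repair: one fact chosen from each block of key-equal facts."""
    blocks = {}
    for rel, key, nonkey in db:
        blocks.setdefault((rel, key), []).append((rel, key, nonkey))
    for choice in product(*blocks.values()):
        yield set(choice)


def satisfies(facts, query, valuation=None):
    """Backtracking search for a valuation mapping every atom of query to a fact."""
    valuation = valuation or {}
    if not query:
        return True
    (rel, key, nonkey), rest = query[0], query[1:]
    for f_rel, f_key, f_nonkey in facts:
        if f_rel != rel or len(f_key) != len(key) or len(f_nonkey) != len(nonkey):
            continue
        extended = dict(valuation)
        ok = True
        for term, value in zip(key + nonkey, f_key + f_nonkey):
            if isinstance(term, Var):
                # Bind the variable, or check agreement with an earlier binding.
                if extended.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok and satisfies(facts, rest, extended):
            return True
    return False


def count_satisfying_repairs(db, query):
    return sum(1 for r in repairs(db) if satisfies(r, query))


y = Var("y")
db = [("R", ("a",), ("1",)), ("R", ("a",), ("2",)), ("R", ("b",), ("1",))]
self_join = [("R", ("a",), (y,)), ("R", ("b",), (y,))]
never_consistent = [("R", ("a",), ("1",)), ("R", ("a",), ("2",))]
```

The second query asks for two key-equal facts with different non-key values, so no consistent database satisfies it and its count is zero on every input, matching the first case of the proof.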
References
- [MW13] Dany Maslowski and Jef Wijsen. A dichotomy in the complexity of counting database repairs. J. Comput. Syst. Sci., 79(6):958–983, 2013.
- [MW14] Dany Maslowski and Jef Wijsen. Counting database repairs that satisfy conjunctive queries with self-joins. In Nicole Schweikardt, Vassilis Christophides, and Vincent Leroy, editors, Proc. 17th International Conference on Database Theory (ICDT), Athens, Greece, March 24-28, 2014, pages 155–164. OpenProceedings.org, 2014.
