Complexity Thresholds in Inclusion Logic
Miika Hannula, Lauri Hella

TL;DR
This paper explores the computational complexity of inclusion logic, identifying specific fragments that correspond to complexity classes like NL and P, and establishing completeness results for certain problems.
Contribution
It establishes the complexity thresholds of inclusion logic fragments and connects them to well-known complexity classes in ordered models.
Findings
Quantifier-free inclusion logic formulas are complete for NL and P.
A fragment of inclusion logic captures NL in ordered models.
Complexity thresholds are characterized for various syntactical fragments.
Abstract
Logics with team semantics provide alternative means for logical characterization of complexity classes. Both dependence and independence logic are known to capture non-deterministic polynomial time, and the frontiers of tractability in these logics are relatively well understood. Inclusion logic is similar to these team-based logical formalisms with the exception that it corresponds to deterministic polynomial time in ordered models. In this article we examine connections between syntactical fragments of inclusion logic and different complexity classes in terms of two computational problems: maximal subteam membership and the model checking problem for a fixed inclusion logic formula. We show that very simple quantifier-free formulae with one or two inclusion atoms generate instances of these problems that are complete for (non-deterministic) logarithmic space and polynomial time.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Complexity Thresholds in Inclusion Logic
Miika Hannula
University of Helsinki, Finland
Lauri Hella
Tampere University, Finland
Abstract
Logics with team semantics provide alternative means for logical characterization of complexity classes. Both dependence and independence logic are known to capture non-deterministic polynomial time, and the frontiers of tractability in these logics are relatively well understood. Inclusion logic is similar to these team-based logical formalisms with the exception that it corresponds to deterministic polynomial time in ordered models. In this article we examine connections between syntactical fragments of inclusion logic and different complexity classes in terms of two computational problems: maximal subteam membership and the model checking problem for a fixed inclusion logic formula. We show that very simple quantifier-free formulae with one or two inclusion atoms generate instances of these problems that are complete for (non-deterministic) logarithmic space and polynomial time. Furthermore, we present a fragment of inclusion logic that captures non-deterministic logarithmic space in ordered models.
1 Introduction
In this article we study the computational complexity of inclusion logic. Inclusion logic was introduced by Galliani [9] as a variant of dependence logic, developed by Väänänen in 2007 [25]. Dependence logic is a logical formalism that extends first-order logic with novel atomic formulae expressing that a variable depends on variables . One motivation behind dependence logic is to find a unifying logical framework for analyzing dependency notions from different contexts. Since its introduction, versions of dependence logic have been formulated and investigated in a variety of logical environments, including propositional logic [15, 28, 30], modal logic [7, 26], probabilistic logics [5], and two-variable logics [21]. Recent research has also pursued connections and applications of dependence logic to fields such as database theory [13, 14], Bayesian networks [4], and social choice theory [23]. A common notion underlying all these endeavours is that of team semantics. Team semantics, introduced by Hodges in [16], is a semantical framework where formulae are evaluated over multitudes instead of singletons of objects as in classical logics. Depending on the application domain these multitudes may then refer to assignment sets, probability distributions, or database tables, each having their characteristic versions of team semantics [25, 5, 14].
After the introduction of dependence logic Grädel and Väänänen observed that team semantics can be also used to create logics for independence [11]. This was followed by [9] in which Galliani investigated logical languages built upon multiple different dependency notions. Inspired by the inclusion dependencies of database theory, one of the logics introduced was inclusion logic that extends first-order logic with inclusion atoms. Given two sequences of variables and having same length, an inclusion atom expresses that the set of values of is included in the set of values of . Inclusion logic was shown to be equi-expressive to positive greatest-fixed point logic in [10]. In contrast to dependence logic which is equivalent to existential second-order logic [25], and thus to non-deterministic polynomial time (), this finding established inclusion logic as the first team-based based logic for polynomial time (). Our focus in this article is to pursue this connection further by investigating the complexity of quantifier-free inclusion logic in terms of two computational problems: maximal subteam membership and model checking problems. In particular, we identify complexity thresholds for these problems in terms of first-order definability, (non-deterministic) logarithmic space, and polynomial time.
The maximal subteam membership problem for a formula asks whether a given assignment is in the maximal subteam of a given team that satisfies . This problem is closely related to the notion of a repair of an inconsistent database [2]. A repair of a database instance w.r.t. some set of constraints is an instance obtained by deleting and/or adding tuples from/to such that satisfies , and the difference between and is minimal according to some measure. If only deletion of tuples is allowed, is called a subset repair. It was observed in [3] that if consists of inclusion dependencies, then for every there exists a unique subset repair of ; this was later generalized to arbitrary LAV tgds (local-as-view tuple generating dependencies) in [24].
The research on database repair has been mainly focused on two problems: consistent query answering and repair checking. In the former, given a query and a database instance the problem is to compute the set of tuples that belong to for every repair of . The latter is the decision problem: is a repair of for two given database instances and . The complexity of these problems for various classes of dependencies and different types of repairs has been extensively studied in the literature; see e.g. [1, 3, 22, 24]. In this setting, the maximal subteam membership problem can be seen as a variant of the repair checking problem: regarding a team as a (unirelational) database instance and a formula of inclusion logic as a constraint, an assignment is a positive instance of just in case it is in the unique subset repair of . Note however, that in , the task is essentially to compute the maximal subteam from a given database instance , instead of just checking that a given is the unique subset repair of . Note further, that using a single formula as a constraint is actually more general than using a (finite) set of inclusion dependencies. Indeed, as we can take the conjunction of all inclusions in . Furthermore, using disjunctions and quantifiers, we can form constraints not expressible in the usual formalism with a set of dependencies.
The complexity of model checking in team semantics has been studied in [6, 20] for dependence and independence logics. For these logics increase in complexity arises particularly from disjunctions. For example, model checking for a disjunction of three (two, resp.) dependence atoms is complete for (, resp.), while a single dependence atom is first-order definable [20]. The results of this paper, in contrast, demonstrate that the complexity of inclusion logic formulae is particularly sensitive to conjunctions. We show that is complete for non-deterministic logarithmic space if is of the form or ; for any other conjunction of (non-trivial) unary inclusion atoms is complete for polynomial time. This result gives a complete characterization of the maximal subteam membership problem for conjunctions of unary inclusion atoms. Based on it we also prove complexity results for model checking of quantifier-free inclusion logic formulae. For instance, for any non-trivial quantifier-free in which do not occur, model checking of is -hard, while that of is -complete.
We also present a safety game for the maximal subteam membership problem. Using this game we examine instances of the maximal subteam membership problem in which the inclusion atoms refer to a key, that is, all inclusion atoms are of the form where is a variable which uniquely determines all the remaining variables. We give example formulae for which the thresholds between and drop down to and under these assumptions.
We conclude the paper by presenting a fragment of inclusion logic that captures . Analogous fragments have previously been established at least for dependence logic. By relating to the Horn fragment of existential second-order logic, Ebbing et al. define a fragment of dependence logic that corresponds to [8]. The fragment presented in this paper is constructed by restricting occurrences of inclusion atoms and universal quantifiers, and the correspondence with is shown by using the well-known characterization of in terms of transitive closure logic [18, 19].
2 Preliminaries
We generally use for variables and for elements of models. If and are two tuples, we write for the concatenation of and .
Throughout the paper we assume that the reader has a basic familiarity of computational complexity. We use the notation , , and for the classes consisting of all problems computable in logarithmic space, non-deterministic logarithmic space, polynomial time and non-deterministic polynomial time, respectively.
2.1 Team Semantics
As is customary for logics in the team semantics setting, we assume that all formulae are in negation normal form (NNF). Thus, we give the syntax of first-order logic () as follows:
[TABLE]
where and are terms and is a relation symbol of the underlying vocabulary. For a first-order formula , we denote by the set of free variables of , defined in the usual way. The team semantics of is given in terms of the notion of a team. Let be a model with domain . An assignment of is a function from a finite set of variables into . We write for the assignment that maps all variables according to , except that it maps to . For an assignment , we may use a shorthand if the underlying ordering of the domain is clear from the context. A team of with domain is a set of assignments from into . For , the restriction of a team is defined as . If is a team, , and , then denotes the team . For a set , is the team . Also, if is an assignment, then by we refer to Tarski semantics.
Definition 1**.**
For a model , a team and a formula in , the satisfaction relation is defined as follows:
- •
if , when is a literal,
- •
if ,
- •
if for some such that ,
- •
if for some ,
- •
if .
If , then we say that and satisfy . If does not contain any symbols from the underlying vocabulary, in which case satisfaction of a formula does not depend on the model , we say that satisfies , written , if for all models with a suitable domain (i.e., a domain that includes all the elements appearing in ). If is a sentence, that is, a formula without any free variables, then we say that satisfies , and write , if , where is the team that consists of the empty assignment .
We say that two sentences and are equivalent, written , if for all models . For two logics and , we write if every -sentence is equivalent to some -sentence; the relations “” and “” for and are defined in terms of “” in the standard way.
Satisfaction of a first-order formula reduces to Tarski semantics in the following way.
Proposition 2** (Flatness [25]).**
For all models , teams , and formulae ,
[TABLE]
A straightforward consequence is that first-order logic is downwards closed.
Corollary 3** (Downward Closure).**
For all models , teams , and formulae ,
[TABLE]
2.2 Inclusion Logic
Inclusion logic () is defined as the extension of by inclusion atoms.
Inclusion atom. Let and be two tuples of variables of the same length. Then is an inclusion atom with the satisfaction relation:
[TABLE]
Inclusion logic is local, meaning that satisfaction of a formula depends only on its free variables. Furthermore, the expressive power of inclusion logic is restricted by its union closure property which states that satisfaction of a formula is preserved under taking arbitrary unions of teams.
Proposition 4** (Locality [9]).**
Let be a model, a team, a formula, and a set of variables such that . Then
[TABLE]
Proposition 5** (Union Closure [9]).**
Let be a model, a set of teams, and a formula. Then
[TABLE]
Note that union closure implies the empty team property, that is, for all inclusion logic formulae .
The starting point for our investigations is the result by Galliani and Hella [10] characterizing the expressivity of inclusion logic in terms of positive greatest fixed point logic. The latter logic is obtained from greatest fixed-point logic (the dual of least fixed point logic) by restricting to formulae in which fixed point operators occur only positively, that is, within a scope of an even number of negations. In finite models this positive fragment captures the full fixed point logic (with both least and greatest fixed points), and hence it follows from the famous result of Immerman [17] and Vardi [27] that inclusion logic captures polynomial time in finite ordered models.
Theorem 6** ([10]).**
Every inclusion logic sentence is equivalent to some positive greatest fixed point logic sentence, and vice versa.
Theorem 7** ([10]).**
A class of finite ordered models is in iff it can be defined in .
2.3 Transitive Closure Logic
In Section 6 we will explore connections between inclusion logic and transitive closure logic, and hence we next give a short introduction to the latter. A -ary relation is said to be transitive if and imply for -tuples . The transitive closure of a -ary relation , written , is defined as the intersection of all -ary relations that are transitive. The transitive closure of can be alternatively defined as for defined recursively as follows: and for ; here denotes the composition of two relations and . Note that iff there is an -path of length from to .
An assignment , a model , and a formula , where and are -ary, give rise to a -ary relation defined as follows:
[TABLE]
We can now define transitive closure logic. Given a term , a model , and an assignment , we write for the interpretation of under , defined in the usual way.
Definition 8** (Transitive Closure Logic).**
Transitive closure logic () is obtained by extending first-order logic with transitive closure formulae where and are -tuples of terms, and is a formula where and are -tuples of variables. The semantics of the transitive closure formula is defined as follows:
[TABLE]
Thus, is true if and only if there is a -path from to . It is well known that transitive closure logic captures non-deterministic logarithmic space in finite ordered models. In particular, this can be achieved by using only one application of the operator. We use below the notation for the least element of the linear order, and for the tuple . Similarly, denotes the tuple , where is the greatest element.
Theorem 9** ([18, 19]).**
A class of finite ordered models is in iff it can be defined in . Furthermore, every -sentence is equivalent in finite ordered models to a sentence of the form
[TABLE]
where is first-order.
3 Maximal Subteam Membership
In this section we define the maximal subteam membership problem and discuss some of its basic properties. We also define a safety game for quantifier-free inclusion logic formulae. This game will be used later to facilitate some proofs regarding the complexity of the maximal subteam membership problem.
3.1 Introduction
For a model , a team , and an inclusion logic formula , we define as the unique subteam such that , and if . Due to the union closure property always exists and it can be alternatively defined as the union of all subteams such that . If does not contain any symbols from the underlying vocabulary, then we may write instead of . The maximal subteam membership problem is now given as follows.
Definition 10**.**
Let . Then is the problem of determining whether for a given model , a team and an assignment .
Grädel proved that for any -formula , there is a formula of positive greatest fixed point logic such that for any model and assignment , if and only if is in the maximal team of satisfying (see Theorem 24 in [12]). An easy adaptation of the proof shows that is also definable in positive greatest fixed point logic. Thus, it follows that every maximal subteam membership problem is polynomial time computable.
Lemma 11**.**
For every formula , is in .
In this section we will restrict our attention to maximal subteam problems for quantifier free formulae. Before proceeding to our findings we need to present some auxiliary concepts and results. The following lemmata will be useful below.
Lemma 12**.**
Let , and let be a team of a model . Then .
Proof.
For “”, note that by definition there are such that , and , and hence and . For “”, note that satisifes so it must be contained by . ∎
As an easy corollary we obtain the following lemma.
Lemma 13**.**
Let , and assume that and both belong to a complexity class . Then is in .
The maximal subteam problem for a single inclusion atom can be naturally represented using directed graphs. In this representation each assignment forms a vertex, and an assignment has an outgoing edge to another assignment if . Over finite teams an assignment then belongs to the maximal subteam for if and only if it is connected to a cycle.111We are grateful to Phokion Kolaitis, who pointed out this fact to the second author in a private discussion 2016.
Lemma 14**.**
Let be a model, a finite team, and two tuples of the same length from , an assignment of , and a first-order formula. Let be a directed graph where iff and . Then
- (a)
s\in\nu(\mathfrak{A},X,\overline{x}\subseteq\overline{y}\wedge\alpha)\iff G\text{ contains a path from s to a cycle }, 2. (b)
Proof.
Assume for the first statement that . Then there is a subteam such that and . Thus for each there exists such that . Moreover, , whence . In particular there is a non-ending path in starting from . Since is finite, this path necessarily ends in a cycle. Conversely, assume contains a path from to a cycle. Then where consists of all assignments in the path and cycle. Hence, .
For second statement note that, by the argument above, if and only if contains a path from to a cycle. But clearly an -path from to an -cycle is an -path from an -cycle to . ∎
3.2 Safety Game
In this section we present a version of a safety game for the maximal subteam problem of inclusion logic. Our presentation is also related to the safety games for inclusion logic examined in [12]. We present a safety game for a quadruple , written , where is an assignment of a team , and is a quantifier-free formula. The main result of the section shows that the maximal subteam problem over and can be characterized in terms of this game.
We assume that the reader is familiar with basic terminology on trees. We associate each quantifier-free with a labeled rooted tree such that the root of the tree is labeled by and each node labeled by or has two children labeled by and . Notice that two different nodes may have the same label. The safety game for can now be interpreted as a pebble game where assignments of a team form a stack of pebbles of which one at a time is placed on a node of . Legal moves of the game then consist of moving the pebble up and down through the tree, removing the pebble from a leaf, and placing a new pebble on a leaf. The starting position is to have placed on the root, and the winning condition for Player I is to arrive at a position where the game terminates. If no such position is ever reached, Player II wins.
Definition 15** (Safety Game).**
Let be quantifier-free, and let be an assignment in a team of a model . The safety game has two players I and II, and the game moves consist of positions and where and is a node of . The game starts with the position , where is the root, and given a position , the game proceeds as follows:
- (i)
If is labeled by a conjunction, then Player I selects a position where is a child of . 2. (ii)
If is labeled by a disjunction, then Player II selects a position where is a child of . 3. (iii)
If is labeled by a literal , then the game ends if . Otherwise, Player I selects a position such that is a descendant of . 4. (iv)
If is labeled by , then the game ends if there is no such that . Otherwise, Player I either
- •
selects a position such that is a descendant of , or
- •
selects the position .
Given a position , the game proceeds as follows:
- (v)
Player II selects a position such that .
Player I wins if the game ends after a finite number of moves by the players. Otherwise, Player II wins.
A strategy for a Player is a mapping on positions such that
- •
, for a non-leaf ,
- •
, for a leaf labeled by a literal,
- •
, for a leaf labeled by .
- •
, for a leaf labeled by .
Player has a winning strategy for if there is a strategy such that wins every game that she plays according to . That is, wins any game where she selects the position on her moves on .
Note that if does not contain any symbols from the underlying vocabulary, the outcome of is independent of , and thus we write instead.
Next we show that the safety game above gives rise to a characterization of the maximal subteam problem.
Theorem 16**.**
Let be quantifier-free, and let be an assignment in a team of a model . Then iff Player II has a winning strategy in .
Proof.
For the “only-if” direction, we define top-down recursively for each node a team such that
- •
for the root ,
- •
, for a child of a node where is labeled by .
It follows that for with label ; for conjunction-labeled with children ; and for disjunction-labeled with children . The strategy of Player II is now the following. If is labeled by disjunction, then is mapped to some where is a child of such that , and if is labeled by , then is mapped to some such that and . We leave it to the reader to check that this strategy is well-defined and winning.
For the “if” direction, assume Player II has a winning strategy . For a node of , we let be the set of all assignments for which there exists a game where Player II plays according to her winning strategy and position is played at some point of the game. Consider any assignment from for a node labeled by . This means there is a game where position is played, and thus also a game where , and furthermore is played. Consequently, an assignment exists such that . By analogous reasoning we obtain that for all other types of nodes with label , too. Furthermore, for conjunction-labeled with children , and for disjunction-labeled with children . In particular, and , and hence . ∎
Given that is finite, it makes sense to consider bounded length restrictions of the safety game. We let denote the version of in which, starting position excluded, positions of the form , i.e., pairs whose left element is an assignment and right element a node, are played at most times. Player I wins if the game terminates before such assignment-node pairs have been played times. Otherwise, if exactly plays of such nodes appear, Player II wins. The next lemma will be useful later.
Lemma 17**.**
Let be quantifier-free and such that has nodes, and let be an assignment of a team that is of size . Then Player II has a winning strategy for iff she has a winning strategy for .
Proof.
By the end of , positions of the form , the root position included, have occurred many times, i.e., some position has occurred twice. Every time such a repetition is encountered, we may assume that we continue the game from the first occurrence of . Since the strategy of Player II is safe for assignment-node moves, we conclude that never terminates. Hence, Player II wins. ∎
4 Complexity of Maximal Subteam Membership
Next we examine the computational complexity of maximal subteam membership. First in Section 4.1 we investigate this problem over arbitrary teams, and then in Section 4.1 we restrict attention to inputs in which the inclusion atoms refer to a key. In Section LABEL:sect:CQA we discuss the implications of our results to consistent query answering.
4.1 Arbitrary Teams
We give a complete characterization of the maximal subteam problem for arbitrary conjunctions of unary inclusion atoms. A unary inclusion atom is an atom of the form where and are single variables. The characterization is given in terms of inclusion graphs.
Definition 18**.**
Let be a set of unary inclusion atoms over variables in . Then the inclusion graph of is defined as such that iff and appears in .
We will now prove the following theorem.
Theorem 19**.**
Let be a finite set of unary inclusion atoms, and let be the conjunction of all atoms in . Then is
- (a)
trivially true if has no edges, 2. (b)
-complete if has an edge and no other edges except possibly for its inverse , 3. (c)
-complete otherwise.
The first statement above follows from the observation that is true for all inputs if is a conjunction of trivial inclusion atoms . The second statement is shown by relating to graph reachability. Given a directed graph and two vertices and , the problem REACH is to determine whether contains a path from to . This problem is a well-known complete problem for , and it will also be applied later in Section LABEL:sect:key where the complexity of over teams with keys and is examined.
Lemma 20**.**
* and are -complete. *
Proof.
Hardness. We give a logarithmic space many-one reduction from REACH. Let be a graph, and let . W.l.o.g. we can assume has no cycles. Define as the extension of with an extra edge . Then is reachable from in if and only if belongs to a cycle in . We reduce from to a team where maps to (see Fig. 4.1). By Lemma 14, is reachable from if and only if , where is either or .
Membership. By Lemma 14 and reduce to reachability variants that are clearly in . ∎
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Foto N. Afrati and Phokion G. Kolaitis. Repair checking in inconsistent databases: algorithms and complexity. In Database Theory - ICDT 2009, 12th International Conference, St. Petersburg, Russia, March 23-25, 2009, Proceedings , pages 31–41, 2009. URL: http://doi.acm.org/10.1145/1514894.1514899 , doi:10.1145/1514894.1514899 . · doi ↗
- 2[2] Marcelo Arenas, Leopoldo E. Bertossi, and Jan Chomicki. Consistent query answers in inconsistent databases. In Proceedings of the Eighteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 31 - June 2, 1999, Philadelphia, Pennsylvania, USA , pages 68–79, 1999. URL: http://doi.acm.org/10.1145/303976.303983 , doi:10.1145/303976.303983 . · doi ↗
- 3[3] Jan Chomicki and Jerzy Marcinkowski. Minimal-change integrity maintenance using tuple deletions. Inf. Comput. , 197(1-2):90–121, 2005. URL: https://doi.org/10.1016/j.ic.2004.04.007 , doi:10.1016/j.ic.2004.04.007 . · doi ↗
- 4[4] Jukka Corander, Antti Hyttinen, Juha Kontinen, Johan Pensar, and Jouko Väänänen. A logical approach to context-specific independence. In Logic, Language, Information, and Computation - 23rd International Workshop, Wo LLIC 2016, Puebla, Mexico, August 16-19th, 2016. Proceedings , pages 165–182, 2016. URL: https://doi.org/10.1007/978-3-662-52921-8_11 , doi:10.1007/978-3-662-52921-8_11 . · doi ↗
- 5[5] Arnaud Durand, Miika Hannula, Juha Kontinen, Arne Meier, and Jonni Virtema. Approximation and dependence via multiteam semantics. Annals of Mathematics and Artificial Intelligence , Jan 2018. URL: https://doi.org/10.1007/s 10472-017-9568-4 , doi:10.1007/s 10472-017-9568-4 . · doi ↗
- 6[6] Arnaud Durand, Juha Kontinen, Nicolas de Rugy-Altherre, and Jouko Väänänen. Tractability frontier of data complexity in team semantics. In Proceedings Sixth International Symposium on Games, Automata, Logics and Formal Verification, Gand ALF 2015, Genoa, Italy, 21-22nd September 2015. , pages 73–85, 2015. URL: https://doi.org/10.4204/EPTCS.193.6 , doi:10.4204/EPTCS.193.6 . · doi ↗
- 7[7] Johannes Ebbing, Lauri Hella, Arne Meier, Julian-Steffen Müller, Jonni Virtema, and Heribert Vollmer. Extended modal dependence logic. In Logic, Language, Information, and Computation - 20th International Workshop, Wo LLIC 2013, Darmstadt, Germany, August 20-23, 2013. Proceedings , pages 126–137, 2013. URL: http://dx.doi.org/10.1007/978-3-642-39992-3_13 , doi:10.1007/978-3-642-39992-3_13 . · doi ↗
- 8[8] Johannes Ebbing, Juha Kontinen, Julian-Steffen Müller, and Heribert Vollmer. A fragment of dependence logic capturing polynomial time. Logical Methods in Computer Science , 10(3), 2014. URL: http://dx.doi.org/10.2168/LMCS-10(3:3)2014 , doi:10.2168/LMCS-10(3:3)2014 . · doi ↗
