Complexity Thresholds in Inclusion Logic

Miika Hannula; Lauri Hella

arXiv:1903.10706·cs.LO·March 27, 2019

TL;DR

This paper explores the computational complexity of inclusion logic, identifying specific fragments that correspond to complexity classes like NL and P, and establishing completeness results for certain problems.

Contribution

It establishes the complexity thresholds of inclusion logic fragments and connects them to well-known complexity classes in ordered models.

Findings

01

Very simple quantifier-free formulae with one or two inclusion atoms yield maximal subteam membership and model checking problems that are complete for NL or for P.

02

A fragment of inclusion logic captures NL in ordered models.

03

Complexity thresholds are characterized for various syntactical fragments.

Equations

$$\phi ::= t=t^{\prime}\mid\neg t=t^{\prime}\mid R\overline{t}\mid\neg R\overline{t}\mid\phi\wedge\phi\mid\phi\vee\phi\mid\exists x\phi\mid\forall x\phi,$$

$$\mathfrak{A}\models_{X}\phi\text{ iff }\mathfrak{A}\models_{s}\phi\text{ for all }s\in X.$$

$$\text{If }\mathfrak{A}\models_{X}\phi\text{ and }Y\subseteq X,\text{ then }\mathfrak{A}\models_{Y}\phi.$$

$$\mathfrak{A}\models_{X}\overline{x}\subseteq\overline{y}\text{ if }\forall s\in X\,\exists s^{\prime}\in X:s(\overline{x})=s^{\prime}(\overline{y}).$$

$$\mathfrak{A}\models_{X}\phi\iff\mathfrak{A}\models_{X\upharpoonright V}\phi.$$

$$\forall X\in\mathcal{X}:\mathfrak{A}\models_{X}\phi\implies\mathfrak{A}\models_{\bigcup\mathcal{X}}\phi.$$

$$R_{\psi,\mathfrak{A},s}=\{\overline{a}\overline{b}\in A^{2k}\mid\mathfrak{A}\models\psi(\overline{a},\overline{b},s(\overline{z}))\}.$$

$$\mathfrak{A}\models_{s}[\mathrm{TC}_{\overline{x},\overline{y}}\psi(\overline{x},\overline{y},\overline{z})](\overline{t}_{0},\overline{t}_{1})\text{ iff }(\overline{t}_{0}^{\mathfrak{A},s},\overline{t}_{1}^{\mathfrak{A},s})\in\mathrm{TC}(R_{\psi,\mathfrak{A},s}).$$

$$[\mathrm{TC}_{\overline{x},\overline{y}}\alpha(\overline{x},\overline{y})](\overline{\min},\overline{\max})$$

$$\phi ::=\alpha\mid\overline{x}\subseteq\overline{y}\mid\phi\vee\phi\mid\phi\wedge\alpha\mid\exists x\phi,$$

$$\phi^{\prime}:=\exists\overline{x}\,\overline{y}\,\overline{t}_{x}\overline{t}_{y}(\psi_{1}\wedge\psi_{2}\wedge\psi_{3}\wedge\psi_{4})$$

Full text

Complexity Thresholds in Inclusion Logic

Miika Hannula

University of Helsinki, Finland

Lauri Hella

Tampere University, Finland

Abstract

Logics with team semantics provide alternative means for logical characterization of complexity classes. Both dependence and independence logic are known to capture non-deterministic polynomial time, and the frontiers of tractability in these logics are relatively well understood. Inclusion logic is similar to these team-based logical formalisms with the exception that it corresponds to deterministic polynomial time in ordered models. In this article we examine connections between syntactical fragments of inclusion logic and different complexity classes in terms of two computational problems: maximal subteam membership and the model checking problem for a fixed inclusion logic formula. We show that very simple quantifier-free formulae with one or two inclusion atoms generate instances of these problems that are complete for (non-deterministic) logarithmic space and polynomial time. Furthermore, we present a fragment of inclusion logic that captures non-deterministic logarithmic space in ordered models.

1 Introduction

In this article we study the computational complexity of inclusion logic. Inclusion logic was introduced by Galliani [9] as a variant of dependence logic, developed by Väänänen in 2007 [25]. Dependence logic is a logical formalism that extends first-order logic with novel atomic formulae $\mathrm{dep}\!\left(x_{1},\ldots,x_{n}\right)$ expressing that a variable $x_{n}$ depends on variables $x_{1},\ldots,x_{n-1}$ . One motivation behind dependence logic is to find a unifying logical framework for analyzing dependency notions from different contexts. Since its introduction, versions of dependence logic have been formulated and investigated in a variety of logical environments, including propositional logic [15, 28, 30], modal logic [7, 26], probabilistic logics [5], and two-variable logics [21]. Recent research has also pursued connections and applications of dependence logic to fields such as database theory [13, 14], Bayesian networks [4], and social choice theory [23]. A common notion underlying all these endeavours is that of team semantics. Team semantics, introduced by Hodges in [16], is a semantical framework where formulae are evaluated over multitudes instead of singletons of objects as in classical logics. Depending on the application domain these multitudes may then refer to assignment sets, probability distributions, or database tables, each having their characteristic versions of team semantics [25, 5, 14].

After the introduction of dependence logic, Grädel and Väänänen observed that team semantics can also be used to create logics for independence [11]. This was followed by [9], in which Galliani investigated logical languages built upon multiple different dependency notions. One of the logics introduced was inclusion logic, which is inspired by the inclusion dependencies of database theory and extends first-order logic with inclusion atoms. Given two sequences of variables $\overline{x}$ and $\overline{y}$ of the same length, an inclusion atom $\overline{x}\subseteq\overline{y}$ expresses that the set of values of $\overline{x}$ is included in the set of values of $\overline{y}$. Inclusion logic was shown to be equi-expressive with positive greatest fixed point logic in [10]. In contrast to dependence logic, which is equivalent to existential second-order logic [25], and thus to non-deterministic polynomial time ($\mathbf{NP}$), this finding established inclusion logic as the first team-based logic for polynomial time ($\mathbf{P}$). Our focus in this article is to pursue this connection further by investigating the complexity of quantifier-free inclusion logic in terms of two computational problems: maximal subteam membership and model checking problems. In particular, we identify complexity thresholds for these problems in terms of first-order definability, (non-deterministic) logarithmic space, and polynomial time.

The maximal subteam membership problem $\mathsf{MSM}(\phi)$ for a formula $\phi$ asks whether a given assignment is in the maximal subteam of a given team that satisfies $\phi$ . This problem is closely related to the notion of a repair of an inconsistent database [2]. A repair of a database instance $I$ w.r.t. some set $\Sigma$ of constraints is an instance $J$ obtained by deleting and/or adding tuples from/to $I$ such that $J$ satisfies $\Sigma$ , and the difference between $I$ and $J$ is minimal according to some measure. If only deletion of tuples is allowed, $J$ is called a subset repair. It was observed in [3] that if $\Sigma$ consists of inclusion dependencies, then for every $I$ there exists a unique subset repair $J$ of $I$ ; this was later generalized to arbitrary LAV tgds (local-as-view tuple generating dependencies) in [24].

The research on database repair has been mainly focused on two problems: consistent query answering and repair checking. In the former, given a query $Q$ and a database instance $I$ the problem is to compute the set of tuples that belong to $Q(J)$ for every repair $J$ of $I$ . The latter is the decision problem: is $J$ a repair of $I$ for two given database instances $I$ and $J$ . The complexity of these problems for various classes of dependencies and different types of repairs has been extensively studied in the literature; see e.g. [1, 3, 22, 24]. In this setting, the maximal subteam membership problem can be seen as a variant of the repair checking problem: regarding a team as a (unirelational) database instance $I$ and a formula $\phi$ of inclusion logic as a constraint, an assignment is a positive instance of $\mathsf{MSM}(\phi)$ just in case it is in the unique subset repair of $I$ . Note however, that in $\mathsf{MSM}(\phi)$ , the task is essentially to compute the maximal subteam from a given database instance $I$ , instead of just checking that a given $J$ is the unique subset repair of $I$ . Note further, that using a single formula $\phi$ as a constraint is actually more general than using a (finite) set $\Sigma$ of inclusion dependencies. Indeed, as $\phi$ we can take the conjunction of all inclusions in $\Sigma$ . Furthermore, using disjunctions and quantifiers, we can form constraints not expressible in the usual formalism with a set of dependencies.

The complexity of model checking in team semantics has been studied in [6, 20] for dependence and independence logics. For these logics increase in complexity arises particularly from disjunctions. For example, model checking for a disjunction of three (two, resp.) dependence atoms is complete for $\mathbf{NP}$ ( $\mathbf{NL}$ , resp.), while a single dependence atom is first-order definable [20]. The results of this paper, in contrast, demonstrate that the complexity of inclusion logic formulae is particularly sensitive to conjunctions. We show that $\mathsf{MSM}(\phi)$ is complete for non-deterministic logarithmic space if $\phi$ is of the form $x\subseteq y$ or $x\subseteq y\wedge y\subseteq x$ ; for any other conjunction of (non-trivial) unary inclusion atoms $\mathsf{MSM}(\phi)$ is complete for polynomial time. This result gives a complete characterization of the maximal subteam membership problem for conjunctions of unary inclusion atoms. Based on it we also prove complexity results for model checking of quantifier-free inclusion logic formulae. For instance, for any non-trivial quantifier-free $\phi$ in which $x,y,z$ do not occur, model checking of $\phi\vee x\subseteq y$ is $\mathbf{NL}$ -hard, while that of $\phi\vee(x\subseteq z\wedge y\subseteq z)$ is $\mathbf{P}$ -complete.

We also present a safety game for the maximal subteam membership problem. Using this game we examine instances of the maximal subteam membership problem in which the inclusion atoms refer to a key, that is, all inclusion atoms are of the form $x\subseteq y$ where $y$ is a variable which uniquely determines all the remaining variables. We give example formulae for which the thresholds between $\mathbf{NL}$ and $\mathbf{P}$ drop down to $\mathbf{L}$ and $\mathbf{NL}$ under these assumptions.

We conclude the paper by presenting a fragment of inclusion logic that captures $\mathbf{NL}$ . Analogous fragments have previously been established at least for dependence logic. By relating to the Horn fragment of existential second-order logic, Ebbing et al. define a fragment of dependence logic that corresponds to $\mathbf{P}$ [8]. The fragment presented in this paper is constructed by restricting occurrences of inclusion atoms and universal quantifiers, and the correspondence with $\mathbf{NL}$ is shown by using the well-known characterization of $\mathbf{NL}$ in terms of transitive closure logic [18, 19].

2 Preliminaries

We generally use $x,y,z,\ldots$ for variables and $a,b,c,\ldots$ for elements of models. If $\overline{p}$ and $\overline{q}$ are two tuples, we write $\overline{p}\overline{q}$ for the concatenation of $\overline{p}$ and $\overline{q}$ .

Throughout the paper we assume that the reader has a basic familiarity with computational complexity. We use the notation $\mathbf{L}$, $\mathbf{NL}$, $\mathbf{P}$ and $\mathbf{NP}$ for the classes consisting of all problems computable in logarithmic space, non-deterministic logarithmic space, polynomial time and non-deterministic polynomial time, respectively.

2.1 Team Semantics

As is customary for logics in the team semantics setting, we assume that all formulae are in negation normal form (NNF). Thus, we give the syntax of first-order logic ( $\mathrm{FO}$ ) as follows:

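$$\phi ::= t=t^{\prime}\mid\neg t=t^{\prime}\mid R\overline{t}\mid\neg R\overline{t}\mid\phi\wedge\phi\mid\phi\vee\phi\mid\exists x\phi\mid\forall x\phi,$$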

where $t$ and $t^{\prime}$ are terms and $R$ is a relation symbol of the underlying vocabulary. For a first-order formula $\phi$ , we denote by ${\mathsf{Fr}(\phi)}$ the set of free variables of $\phi$ , defined in the usual way. The team semantics of $\mathrm{FO}$ is given in terms of the notion of a team. Let $\mathfrak{A}$ be a model with domain $A$ . An assignment $s$ of $A$ is a function from a finite set of variables into $A$ . We write $s(a/x)$ for the assignment that maps all variables according to $s$ , except that it maps $x$ to $a$ . For an assignment $s=\{(x_{i},a_{i})\mid 1\leq i\leq n\}$ , we may use a shorthand $s=(a_{1},\ldots,a_{n})$ if the underlying ordering $(x_{1},\ldots,x_{n})$ of the domain is clear from the context. A team $X$ of $A$ with domain $\mathrm{dom}\!\left(X\right)=\{x_{1},\ldots,x_{n}\}$ is a set of assignments from $\mathrm{dom}\!\left(X\right)$ into $A$ . For $V\subseteq\mathrm{dom}\!\left(X\right)$ , the restriction $X\upharpoonright V$ of a team $X$ is defined as $\{s\upharpoonright V\mid s\in X\}$ . If $X$ is a team, $V\subseteq\mathrm{dom}\!\left(X\right)$ , and $F:X\rightarrow\mathcal{P}(A)\setminus\{\emptyset\}$ , then $X[F/x]$ denotes the team $\{s(a/x)\mid s\in X,a\in F(s)\}$ . For a set $B$ , $X[B/x]$ is the team $\{s(b/x)\mid s\in X,b\in B\}$ . Also, if $s$ is an assignment, then by $\mathfrak{A}\models_{s}\phi$ we refer to Tarski semantics.

Definition 1.

For a model $\mathfrak{A}$ , a team $X$ and a formula in $\mathrm{FO}$ , the satisfaction relation $\mathfrak{A}\models_{X}\phi$ is defined as follows:

•

$\mathfrak{A}\models_{X}\alpha$ if $\forall s\in X:\mathfrak{A}\models_{s}\alpha$ , when $\alpha$ is a literal,

•

$\mathfrak{A}\models_{X}\phi\wedge\psi$ if $\mathfrak{A}\models_{X}\phi\textrm{ and }\mathfrak{A}\models_{X}\psi$ ,

•

$\mathfrak{A}\models_{X}\phi\vee\psi$ if $\mathfrak{A}\models_{Y}\phi\textrm{ and }\mathfrak{A}\models_{Z}\psi$ for some $Y,Z\subseteq X$ such that $Y\cup Z=X$ ,

•

$\mathfrak{A}\models_{X}\exists x\phi$ if $\mathfrak{A}\models_{X[F/x]}\phi$ for some $F:X\rightarrow\mathcal{P}(A)\setminus\{\emptyset\}$ ,

•

$\mathfrak{A}\models_{X}\forall x\phi$ if $\mathfrak{A}\models_{X[A/x]}\phi$ .

If $\mathfrak{A}\models_{X}\phi$ , then we say that $\mathfrak{A}$ and $X$ satisfy $\phi$ . If $\phi$ does not contain any symbols from the underlying vocabulary, in which case satisfaction of a formula does not depend on the model $\mathfrak{A}$ , we say that $X$ satisfies $\phi$ , written $X\models\phi$ , if $\mathfrak{A}\models_{X}\phi$ for all models $\mathfrak{A}$ with a suitable domain (i.e., a domain that includes all the elements appearing in $X$ ). If $\phi$ is a sentence, that is, a formula without any free variables, then we say that $\mathfrak{A}$ satisfies $\phi$ , and write $\mathfrak{A}\models\phi$ , if $\mathfrak{A}\models_{\{\emptyset\}}\phi$ , where $\{\emptyset\}$ is the team that consists of the empty assignment $\emptyset$ .
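The team operations used in the quantifier clauses of Definition 1 can be made concrete with a small sketch. It is only an illustration: the representation of an assignment as a frozen set of (variable, element) pairs and the function names `freeze`, `restrict`, `supplement` and `duplicate` are assumptions of this example, not notation from the paper.

```python
def freeze(assignment):
    """Turn a dict (variable -> element) into a hashable assignment."""
    return frozenset(assignment.items())


def restrict(team, variables):
    """X restricted to V: forget every variable outside V."""
    return {frozenset((v, a) for v, a in s if v in variables) for s in team}


def supplement(team, x, F):
    """X[F/x]: map x to each value in F(s), for every assignment s."""
    out = set()
    for s in team:
        values = F(dict(s))
        assert values, "F(s) must be a non-empty set"
        for a in values:
            t = dict(s)
            t[x] = a
            out.add(freeze(t))
    return out


def duplicate(team, x, domain):
    """X[A/x]: the special case where F(s) is the whole domain A."""
    return supplement(team, x, lambda s: domain)


X = {freeze({"x": 0}), freeze({"x": 1})}
Y = duplicate(X, "y", {0, 1})                # used for a universal quantifier
Z = supplement(X, "y", lambda s: {s["x"]})   # one possible existential choice
```

Here `Y` contains four assignments, while `Z` contains two, each mapping `y` to the value of `x`.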

We say that two sentences $\phi$ and $\psi$ are equivalent, written $\phi\equiv\psi$ , if $\mathfrak{A}\models\phi\iff\mathfrak{A}\models\psi$ for all models $\mathfrak{A}$ . For two logics $\mathcal{L}_{1}$ and $\mathcal{L}_{2}$ , we write $\mathcal{L}_{1}\leq\mathcal{L}_{2}$ if every $\mathcal{L}_{1}$ -sentence is equivalent to some $\mathcal{L}_{2}$ -sentence; the relations “ $\equiv$ ” and “ $<$ ” for $\mathcal{L}_{1}$ and $\mathcal{L}_{2}$ are defined in terms of “ $\leq$ ” in the standard way.

Satisfaction of a first-order formula reduces to Tarski semantics in the following way.

Proposition 2 (Flatness [25]).

For all models $\mathfrak{A}$ , teams $X$ , and formulae $\phi\in\mathrm{FO}$ ,

$$\mathfrak{A}\models_{X}\phi\text{ iff }\mathfrak{A}\models_{s}\phi\text{ for all }s\in X.$$

A straightforward consequence is that first-order logic is downwards closed.

Corollary 3 (Downward Closure).

For all models $\mathfrak{A}$ , teams $X$ , and formulae $\phi\in\mathrm{FO}$ ,

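$$\text{If }\mathfrak{A}\models_{X}\phi\text{ and }Y\subseteq X,\text{ then }\mathfrak{A}\models_{Y}\phi.$$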

2.2 Inclusion Logic

Inclusion logic ( $\mathrm{FO}(\subseteq)$ ) is defined as the extension of $\mathrm{FO}$ by inclusion atoms.

Inclusion atom. Let $\overline{x}$ and $\overline{y}$ be two tuples of variables of the same length. Then $\overline{x}\subseteq\overline{y}$ is an inclusion atom with the satisfaction relation:

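$$\mathfrak{A}\models_{X}\overline{x}\subseteq\overline{y}\text{ if }\forall s\in X\,\exists s^{\prime}\in X:s(\overline{x})=s^{\prime}(\overline{y}).$$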
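Read operationally, an inclusion atom holds in a team when every value taken by $\overline{x}$ is also taken by $\overline{y}$ somewhere in the team. The following sketch checks this directly; the function name `satisfies_inclusion` and the encoding of a team as a list of dicts are assumptions made for illustration.

```python
def satisfies_inclusion(team, xs, ys):
    """Check whether the team satisfies the inclusion atom xs ⊆ ys."""
    # All values that the tuple ys takes in the team.
    y_values = {tuple(s[y] for y in ys) for s in team}
    # Every value of xs must occur among them.
    return all(tuple(s[x] for x in xs) in y_values for s in team)


team = [{"x": 1, "y": 2}, {"x": 2, "y": 1}]
print(satisfies_inclusion(team, ["x"], ["y"]))                         # True
print(satisfies_inclusion(team + [{"x": 3, "y": 1}], ["x"], ["y"]))    # False
print(satisfies_inclusion([], ["x"], ["y"]))                           # True
```

The last call illustrates that the empty team satisfies every inclusion atom.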

Inclusion logic is local, meaning that satisfaction of a formula depends only on its free variables. Furthermore, the expressive power of inclusion logic is restricted by its union closure property which states that satisfaction of a formula is preserved under taking arbitrary unions of teams.

Proposition 4 (Locality [9]).

Let $\mathfrak{A}$ be a model, $X$ a team, $\phi\in\mathrm{FO}(\subseteq)$ a formula, and $V$ a set of variables such that $\mathrm{Fr}(\phi)\subseteq V\subseteq\mathrm{dom}\!\left(X\right)$ . Then

$$\mathfrak{A}\models_{X}\phi\iff\mathfrak{A}\models_{X\upharpoonright V}\phi.$$

Proposition 5 (Union Closure [9]).

Let $\mathfrak{A}$ be a model, $\mathcal{X}$ a set of teams, and $\phi\in\mathrm{FO}(\subseteq)$ a formula. Then

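$$\forall X\in\mathcal{X}:\mathfrak{A}\models_{X}\phi\implies\mathfrak{A}\models_{\bigcup\mathcal{X}}\phi.$$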

Note that union closure implies the empty team property, that is, $\mathfrak{A}\models_{\emptyset}\phi$ for all inclusion logic formulae $\phi$ .

The starting point for our investigations is the result by Galliani and Hella [10] characterizing the expressivity of inclusion logic in terms of positive greatest fixed point logic. The latter logic is obtained from greatest fixed-point logic (the dual of least fixed point logic) by restricting to formulae in which fixed point operators occur only positively, that is, within a scope of an even number of negations. In finite models this positive fragment captures the full fixed point logic (with both least and greatest fixed points), and hence it follows from the famous result of Immerman [17] and Vardi [27] that inclusion logic captures polynomial time in finite ordered models.

Theorem 6 ([10]).

Every inclusion logic sentence is equivalent to some positive greatest fixed point logic sentence, and vice versa.

Theorem 7 ([10]).

A class $\mathcal{C}$ of finite ordered models is in $\mathbf{P}$ iff it can be defined in $\mathrm{FO}(\subseteq)$ .

2.3 Transitive Closure Logic

In Section 6 we will explore connections between inclusion logic and transitive closure logic, and hence we next give a short introduction to the latter. A $2k$-ary relation $R$ is said to be transitive if $(\overline{a},\overline{b})\in R$ and $(\overline{b},\overline{c})\in R$ imply $(\overline{a},\overline{c})\in R$ for $k$-tuples $\overline{a},\overline{b},\overline{c}$. The transitive closure of a $2k$-ary relation $R$, written $\mathrm{TC}(R)$, is defined as the intersection of all $2k$-ary relations $S\supseteq R$ that are transitive. The transitive closure of $R$ can be alternatively defined as $R_{\infty}=\bigcup_{i=0}^{\infty}R_{i}$ for $R_{i}$ defined recursively as follows: $R_{0}=R$ and $R_{i+1}=R\circ R_{i}$ for $i\geq 0$; here $A\circ B$ denotes the composition of two relations $A$ and $B$. Note that $(\overline{a},\overline{b})\in R_{i}$ iff there is an $R$-path of length $i+1$ from $\overline{a}$ to $\overline{b}$.
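On a finite relation the recursive definition can be run until no new pairs appear. The sketch below follows the iteration $R_{0}=R$, $R_{i+1}=R\circ R_{i}$; the names `compose` and `transitive_closure`, and the encoding of a $2k$-ary relation as a set of pairs of $k$-tuples, are assumptions of this example.

```python
def compose(r, s):
    """Composition: all (a, c) with (a, b) in r and (b, c) in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def transitive_closure(r):
    """Union of R_0, R_1, ... where R_0 = R and R_{i+1} = R composed with R_i."""
    closure = set(r)   # R_0
    layer = set(r)
    while True:
        layer = compose(r, layer)   # next R_i
        if layer <= closure:        # nothing new, so later layers add nothing
            return closure
        closure |= layer


r = {((1,), (2,)), ((2,), (3,)), ((3,), (4,))}
tc = transitive_closure(r)
print(((1,), (4,)) in tc)   # True
```

Stopping as soon as a layer contributes nothing new is safe because every later layer is then contained in the pairs already collected.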

An assignment $s$ , a model $\mathfrak{A}$ , and a formula $\psi(\overline{x},\overline{y},\overline{z})$ , where $\overline{x}$ and $\overline{y}$ are $k$ -ary, give rise to a $2k$ -ary relation defined as follows:

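$$R_{\psi,\mathfrak{A},s}=\{\overline{a}\overline{b}\in A^{2k}\mid\mathfrak{A}\models\psi(\overline{a},\overline{b},s(\overline{z}))\}.$$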

We can now define transitive closure logic. Given a term $t$ , a model $\mathfrak{A}$ , and an assignment $s$ , we write $t^{\mathfrak{A},s}$ for the interpretation of $t$ under $\mathfrak{A},s$ , defined in the usual way.

Definition 8 (Transitive Closure Logic).

Transitive closure logic ( $\mathrm{TC}$ ) is obtained by extending first-order logic with transitive closure formulae $[\mathrm{TC}_{\overline{x},\overline{y}}\psi(\overline{x},\overline{y},\overline{z})](\overline{t}_{0},\overline{t}_{1})$ where $\overline{t}_{0}$ and $\overline{t}_{1}$ are $k$ -tuples of terms, and $\psi(\overline{x},\overline{y},\overline{z})$ is a formula where $\overline{x}$ and $\overline{y}$ are $k$ -tuples of variables. The semantics of the transitive closure formula is defined as follows:

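$$\mathfrak{A}\models_{s}[\mathrm{TC}_{\overline{x},\overline{y}}\psi(\overline{x},\overline{y},\overline{z})](\overline{t}_{0},\overline{t}_{1})\text{ iff }(\overline{t}_{0}^{\mathfrak{A},s},\overline{t}_{1}^{\mathfrak{A},s})\in\mathrm{TC}(R_{\psi,\mathfrak{A},s}).$$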

Thus, $[\mathrm{TC}_{\overline{x},\overline{y}}\psi(\overline{x},\overline{y},\overline{z})](\overline{t}_{0},\overline{t}_{1})$ is true if and only if there is a $\psi$ -path from $\overline{t}_{0}$ to $\overline{t}_{1}$ . It is well known that transitive closure logic captures non-deterministic logarithmic space in finite ordered models. In particular, this can be achieved by using only one application of the $\mathrm{TC}$ operator. We use below the notation $\min$ for the least element of the linear order, and $\overline{\min}$ for the tuple $(\min,\ldots,\min)$ . Similarly, $\overline{\max}$ denotes the tuple $(\max,\ldots,\max)$ , where $\max$ is the greatest element.

Theorem 9 ([18, 19]).

A class $\mathcal{C}$ of finite ordered models is in $\mathbf{NL}$ iff it can be defined in $\mathrm{TC}$ . Furthermore, every $\mathrm{TC}$ -sentence is equivalent in finite ordered models to a sentence of the form

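$$[\mathrm{TC}_{\overline{x},\overline{y}}\alpha(\overline{x},\overline{y})](\overline{\min},\overline{\max})$$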

where $\alpha$ is first-order.

3 Maximal Subteam Membership

In this section we define the maximal subteam membership problem and discuss some of its basic properties. We also define a safety game for quantifier-free inclusion logic formulae. This game will be used later to facilitate some proofs regarding the complexity of the maximal subteam membership problem.

3.1 Introduction

For a model $\mathfrak{A}$ , a team $X$ , and an inclusion logic formula $\phi$ , we define $\nu(\mathfrak{A},X,\phi)$ as the unique subteam $Y\subseteq X$ such that $\mathfrak{A}\models_{Y}\phi$ , and $\mathfrak{A}\not\models_{Z}\phi$ if $Y\subsetneq Z\subseteq X$ . Due to the union closure property $\nu(\mathfrak{A},X,\phi)$ always exists and it can be alternatively defined as the union of all subteams $Y\subseteq X$ such that $\mathfrak{A}\models_{Y}\phi$ . If $\phi$ does not contain any symbols from the underlying vocabulary, then we may write $\nu(X,\phi)$ instead of $\nu(\mathfrak{A},X,\phi)$ . The maximal subteam membership problem is now given as follows.

Definition 10.

Let $\phi\in\mathrm{FO}(\subseteq)$ . Then $\mathsf{MSM}(\phi)$ is the problem of determining whether $s\in\nu(\mathfrak{A},X,\phi)$ for a given model $\mathfrak{A}$ , a team $X$ and an assignment $s\in X$ .

Grädel proved that for any $\mathrm{FO}(\subseteq)$ -formula $\phi$ , there is a formula $\psi$ of positive greatest fixed point logic such that for any model $\mathfrak{A}$ and assignment $s$ , $\mathfrak{A}\models_{s}\psi$ if and only if $s$ is in the maximal team of $\mathfrak{A}$ satisfying $\phi$ (see Theorem 24 in [12]). An easy adaptation of the proof shows that $\nu(\mathfrak{A},X,\phi)$ is also definable in positive greatest fixed point logic. Thus, it follows that every maximal subteam membership problem is polynomial time computable.

Lemma 11.

For every formula $\phi\in\mathrm{FO}(\subseteq)$ , $\mathsf{MSM}(\phi)$ is in $\mathbf{P}$ .
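For the special case where $\phi$ is a conjunction of inclusion atoms, the polynomial bound can be seen directly: repeatedly delete every assignment that lacks a witness for some atom until nothing changes. The sketch below implements this deletion procedure. It is not the fixed point construction used in the proof, and the name `maximal_subteam` and the encoding of atoms as pairs of variable tuples are assumptions of this example.

```python
def maximal_subteam(team, atoms):
    """Greatest subteam of `team` satisfying every inclusion atom (xs, ys)."""
    current = list(team)
    while True:
        kept = []
        for s in current:
            # s survives if each atom xs ⊆ ys has a witness t in the current team.
            has_witnesses = all(
                any(tuple(s[x] for x in xs) == tuple(t[y] for y in ys)
                    for t in current)
                for xs, ys in atoms
            )
            if has_witnesses:
                kept.append(s)
        if len(kept) == len(current):   # fixed point reached
            return kept
        current = kept


team = [{"x": 1, "y": 2}, {"x": 2, "y": 1}, {"x": 3, "y": 1}]
atom = (("x",), ("y",))
print(maximal_subteam(team, [atom]))
# [{'x': 1, 'y': 2}, {'x': 2, 'y': 1}]
```

Each round removes at least one assignment or stops, so the loop runs at most $|X|$ times. Deciding $\mathsf{MSM}(\phi)$ for such a $\phi$ then amounts to checking whether the given assignment is in the returned list.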

In this section we will restrict our attention to maximal subteam problems for quantifier-free formulae. Before proceeding to our findings we need to present some auxiliary concepts and results. The following lemmata will be useful below.

Lemma 12.

Let $\alpha,\beta\in\mathrm{FO}(\subseteq)$ , and let $X$ be a team of a model $\mathfrak{A}$ . Then $\nu(\mathfrak{A},X,\alpha\vee\beta)=\nu(\mathfrak{A},X,\alpha)\cup\nu(\mathfrak{A},X,\beta)$ .

Proof.

For “ $\subseteq$ ”, note that by definition there are $Y,Z\subseteq X$ such that $Y\cup Z=\nu(\mathfrak{A},X,\alpha\vee\beta)$ , $Y\models\alpha$ and $Z\models\beta$ , and hence $Y\subseteq\nu(\mathfrak{A},X,\alpha)$ and $Z\subseteq\nu(\mathfrak{A},X,\beta)$ . For “ $\supseteq$ ”, note that $\nu(\mathfrak{A},X,\alpha)\cup\nu(\mathfrak{A},X,\beta)$ satisfies $\alpha\vee\beta$ , so it must be contained in $\nu(\mathfrak{A},X,\alpha\vee\beta)$ . ∎

As an easy corollary we obtain the following lemma.

Lemma 13.

Let $\alpha,\beta\in\mathrm{FO}(\subseteq)$ , and assume that $\mathsf{MSM}(\alpha)$ and $\mathsf{MSM}(\beta)$ both belong to a complexity class $C\in\{\mathbf{L},\mathbf{NL}\}$ . Then $\mathsf{MSM}(\alpha\vee\beta)$ is in $C$ .

The maximal subteam problem for a single inclusion atom $\overline{x}\subseteq\overline{y}$ can be naturally represented using directed graphs. In this representation each assignment forms a vertex, and an assignment $s$ has an outgoing edge to another assignment $s^{\prime}$ if $s(\overline{x})=s^{\prime}(\overline{y})$. Over finite teams an assignment then belongs to the maximal subteam for $\overline{x}\subseteq\overline{y}$ if and only if there is a path from it to a cycle. (Footnote 1: We are grateful to Phokion Kolaitis, who pointed out this fact to the second author in a private discussion in 2016.)

Lemma 14.

Let $\mathfrak{A}$ be a model, $X$ a finite team, $\overline{x}$ and $\overline{y}$ two tuples of the same length from $\mathrm{dom}\!\left(X\right)$ , $s$ an assignment of $X$ , and $\alpha$ a first-order formula. Let $G=(X,E)$ be a directed graph where $(s,s^{\prime})\in E$ iff $s(\overline{x})=s^{\prime}(\overline{y})$ and $\mathfrak{A}\models_{\{s,s^{\prime}\}}\alpha$ . Then

(a)

$s\in\nu(\mathfrak{A},X,\overline{x}\subseteq\overline{y}\wedge\alpha)\iff G$ contains a path from $s$ to a cycle,

(b)

$s\in\nu(\mathfrak{A},X,\overline{x}\subseteq\overline{y}\wedge\overline{y}\subseteq\overline{x}\wedge\alpha)\iff G$ contains a path from one cycle to another via $s$.

Proof.

Assume for the first statement that $s\in\nu(\mathfrak{A},X,\overline{x}\subseteq\overline{y}\wedge\alpha)$. Then there is a subteam $Y\subseteq X$ such that $s\in Y$ and $\mathfrak{A}\models_{Y}\overline{x}\subseteq\overline{y}\wedge\alpha$. Thus for each $s^{\prime}\in Y$ there exists $s^{\prime\prime}\in Y$ such that $s^{\prime}(\overline{x})=s^{\prime\prime}(\overline{y})$. Moreover, $\mathfrak{A}\models_{\{s^{\prime},s^{\prime\prime}\}}\alpha$ by downward closure (Corollary 3), whence $(s^{\prime},s^{\prime\prime})\in E$. In particular there is an infinite path in $G$ starting from $s$. Since $X$ is finite, this path necessarily enters a cycle. Conversely, assume $G$ contains a path from $s$ to a cycle. Then $\mathfrak{A}\models_{Y}\overline{x}\subseteq\overline{y}\wedge\alpha$ where $Y$ consists of all assignments in the path and cycle. Hence, $s\in\nu(\mathfrak{A},X,\overline{x}\subseteq\overline{y}\wedge\alpha)$.

For the second statement note that, by the argument above, $s\in\nu(\mathfrak{A},X,\overline{y}\subseteq\overline{x}\wedge\alpha)$ if and only if $G^{\prime}=(X,E^{-1})$ contains a path from $s$ to a cycle. But clearly an $E^{-1}$-path from $s$ to an $E^{-1}$-cycle is an $E$-path from an $E$-cycle to $s$. Statement (b) follows by combining the two characterizations. ∎
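Statement (a) of Lemma 14, for the case where $\alpha$ is trivially true, can be checked on small teams with a plain reachability search. The sketch below builds the graph $G$ and tests whether a given assignment has a path to a cycle; the function names and the use of list indices as vertices are assumptions of this example, and no attempt is made to meet the logarithmic space bound.

```python
def edges(team, xs, ys):
    """Adjacency lists of G: edge i -> j iff team[i](xs) == team[j](ys)."""
    return {
        i: [j for j, t in enumerate(team)
            if tuple(s[x] for x in xs) == tuple(t[y] for y in ys)]
        for i, s in enumerate(team)
    }


def reachable(graph, start):
    """Vertices reachable from `start` by a path of length at least one."""
    seen, stack = set(), list(graph[start])
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(graph[v])
    return seen


def reaches_cycle(team, xs, ys, i):
    """True iff G contains a path from team[i] to a cycle."""
    graph = edges(team, xs, ys)
    # A vertex lies on a cycle iff it can reach itself in one or more steps.
    on_cycle = {v for v in graph if v in reachable(graph, v)}
    return i in on_cycle or bool(reachable(graph, i) & on_cycle)


team = [{"x": 1, "y": 2}, {"x": 2, "y": 1}, {"x": 3, "y": 1}]
print([reaches_cycle(team, ["x"], ["y"], i) for i in range(3)])
# [True, True, False]
```

In this team the first two assignments form a cycle, while the third has no outgoing edge because no assignment maps $y$ to 3.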

3.2 Safety Game

In this section we present a version of a safety game for the maximal subteam problem of inclusion logic. Our presentation is also related to the safety games for inclusion logic examined in [12]. We present a safety game for a quadruple $(\mathfrak{A},X,s,\phi)$ , written $\mathsf{safety}(\mathfrak{A},X,s,\phi)$ , where $s$ is an assignment of a team $X$ , and $\phi$ is a quantifier-free formula. The main result of the section shows that the maximal subteam problem $\mathsf{MSM}(\phi)$ over $X$ and $s$ can be characterized in terms of this game.

We assume that the reader is familiar with basic terminology on trees. We associate each quantifier-free $\phi\in\mathrm{FO}(\subseteq)$ with a labeled rooted tree $T_{\phi}$ such that the root of the tree is labeled by $\phi$ and each node labeled by $\psi_{0}\vee\psi_{1}$ or $\psi_{0}\wedge\psi_{1}$ has two children labeled by $\psi_{0}$ and $\psi_{1}$ . Notice that two different nodes may have the same label. The safety game for $(\mathfrak{A},X,s,\phi)$ can now be interpreted as a pebble game where assignments of a team $X$ form a stack of pebbles of which one at a time is placed on a node of $T_{\phi}$ . Legal moves of the game then consist of moving the pebble up and down through the tree, removing the pebble from a leaf, and placing a new pebble on a leaf. The starting position is to have $s$ placed on the root, and the winning condition for Player I is to arrive at a position where the game terminates. If no such position is ever reached, Player II wins.

Definition 15 (Safety Game).

Let $\phi\in\mathrm{FO}(\subseteq)$ be quantifier-free, and let $s_{0}$ be an assignment in a team $X$ of a model $\mathfrak{A}$ . The safety game $\mathsf{safety}(\mathfrak{A},X,s_{0},\phi)$ has two players I and II, and the game moves consist of positions $(s,n)$ and $(n,s)$ where $s\in X$ and $n$ is a node of $T_{\phi}$ . The game starts with the position $(s_{0},r)$ , where $r$ is the root, and given a position $(s,n)$ , the game proceeds as follows:

(i)

If $n$ is labeled by a conjunction, then Player I selects a position $(s,n^{\prime})$ where $n^{\prime}$ is a child of $n$.

(ii)

If $n$ is labeled by a disjunction, then Player II selects a position $(s,n^{\prime})$ where $n^{\prime}$ is a child of $n$.

(iii)

If $n$ is labeled by a literal $\psi$, then the game ends if $\mathfrak{A}\not\models_{s}\psi$. Otherwise, Player I selects a position $(s,n^{\prime})$ such that $n$ is a descendant of $n^{\prime}$.

(iv)

If $n$ is labeled by $\overline{x}\subseteq\overline{y}$, then the game ends if there is no $s^{\prime}\in X$ such that $s(\overline{x})=s^{\prime}(\overline{y})$. Otherwise, Player I either

•

selects a position $(s,n^{\prime})$ such that $n$ is a descendant of $n^{\prime}$ , or

•

selects the position $(n,s)$ .

Given a position $(n,s)$ , the game proceeds as follows:

(v)

Player II selects a position $(s^{\prime},n)$ such that $s(\overline{x})=s^{\prime}(\overline{y})$ .

Player I wins if the game ends after a finite number of moves by the players. Otherwise, Player II wins.

A strategy for a Player is a mapping $\pi$ on positions such that

•

$\pi((s,n))\in\{(s,n^{\prime})\mid n^{\prime}\text{ is a child of }n\}$ , for a non-leaf $n$ ,

•

$\pi((s,n))\in\{(s,n^{\prime})\mid n\text{ is a descendant of }n^{\prime}\}$ , for a leaf $n$ labeled by a literal,

•

$\pi((s,n))\in\{(s,n^{\prime})\mid n\text{ is a descendant of }n^{\prime}\}\cup\{(n,s)\}$ , for a leaf $n$ labeled by $\overline{x}\subseteq\overline{y}$ .

•

$\pi((n,s))\in\{(s^{\prime},n)\mid s^{\prime}\in X,s(\overline{x})=s^{\prime}(\overline{y})\}$ , for a leaf $n$ labeled by $\overline{x}\subseteq\overline{y}$ .

Player $A\in\{\text{I,II}\}$ has a winning strategy for $\mathsf{safety}(\mathfrak{A},X,s_{0},\phi)$ if there is a strategy $\pi_{A}$ such that $A$ wins every game that she plays according to $\pi_{A}$ . That is, $A$ wins any game where she selects the position $\pi_{A}(p)$ on her moves on $p$ .

Note that if $\phi$ does not contain any symbols from the underlying vocabulary, the outcome of $\mathsf{safety}(\mathfrak{A},X,s,\phi)$ is independent of $\mathfrak{A}$ , and thus we write $\mathsf{safety}(X,s,\phi)$ instead.
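The rules above suggest a direct way to decide who wins on a finite team: Player II wins exactly from the positions that survive when losing positions are deleted repeatedly (a greatest fixed point). The following Python sketch illustrates this under stated assumptions and is not taken from the paper. The encoding of formulas as nested tuples (`"and"`, `"or"`, `"lit"`, `"inc"`), the representation of assignments as dictionaries, and all function names are hypothetical. "Descendant" is read as proper descendant, and a position in which Player I has no available move is treated as safe for Player II; reading it reflexively would not change the result of the fixed-point computation.

```python
def flatten(formula):
    """Turn a nested-tuple formula into a node list; index 0 is the root."""
    nodes = []

    def visit(f, parent):
        idx = len(nodes)
        nodes.append({"kind": f[0], "data": f[1:], "parent": parent, "children": []})
        if f[0] in ("and", "or"):
            for sub in f[1:]:
                nodes[idx]["children"].append(visit(sub, idx))
        return idx

    visit(formula, None)
    return nodes


def ancestors(nodes, n):
    """Proper ancestors of node n, i.e. the nodes Player I may jump back to."""
    out, p = [], nodes[n]["parent"]
    while p is not None:
        out.append(p)
        p = nodes[p]["parent"]
    return out


def safe_assignments(team, formula):
    """Indices i such that Player II can keep the game from (team[i], root) running."""
    nodes = flatten(formula)
    idxs = range(len(team))
    # an holds positions (s_i, n); na holds positions (n, s_i) at inclusion leaves.
    an = {(i, n) for i in idxs for n in range(len(nodes))}
    na = {(n, i) for i in idxs for n, nd in enumerate(nodes) if nd["kind"] == "inc"}

    def witnesses(n, i):
        xs, ys = nodes[n]["data"]
        val = tuple(team[i][v] for v in xs)
        return [j for j in idxs if tuple(team[j][v] for v in ys) == val]

    def ok_an(i, n):
        nd = nodes[n]
        if nd["kind"] == "and":   # Player I chooses the child
            return all((i, c) in an for c in nd["children"])
        if nd["kind"] == "or":    # Player II chooses the child
            return any((i, c) in an for c in nd["children"])
        back = all((i, m) in an for m in ancestors(nodes, n))
        if nd["kind"] == "lit":   # the game ends if the literal fails
            return nd["data"][0](team[i]) and back
        return back and (n, i) in na  # inclusion leaf

    changed = True
    while changed:
        new_an = {(i, n) for (i, n) in an if ok_an(i, n)}
        new_na = {(n, i) for (n, i) in na
                  if any((j, n) in an for j in witnesses(n, i))}
        changed = (new_an, new_na) != (an, na)
        an, na = new_an, new_na
    return {i for i in idxs if (i, 0) in an}
```

For instance, with `team = [{"x": 1, "y": 2}, {"x": 2, "y": 1}, {"x": 3, "y": 1}]` and the formula `("inc", ("x",), ("y",))`, the third assignment is unsafe because no assignment has the value 3 for `y`. The computation runs in time polynomial in the size of the team and the formula tree.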

Next we show that the safety game above gives rise to a characterization of the maximal subteam problem.

Theorem 16.

Let $\phi\in\mathrm{FO}(\subseteq)$ be quantifier-free, and let $s$ be an assignment in a team $X$ of a model $\mathfrak{A}$ . Then $s\in\nu(\mathfrak{A},X,\phi)$ iff Player II has a winning strategy in $\mathsf{safety}(\mathfrak{A},X,s,\phi)$ .

Proof.

For the “only-if” direction, we define top-down recursively for each node $n\in T_{\phi}$ a team $X_{n}$ such that

• $X_{r}:=\nu(\mathfrak{A},X,\phi)$ for the root $r$,

• $X_{n}:=\nu(\mathfrak{A},X_{n^{\prime}},\psi)$, for a child $n$ of a node $n^{\prime}$ where $n$ is labeled by $\psi$.

It follows that $X_{n}\models\psi$ for $n$ with label $\psi$ ; $X_{n}=X_{n_{0}}=X_{n_{1}}$ for conjunction-labeled $n$ with children $n_{0},n_{1}$ ; and $X_{n}=X_{n_{0}}\cup X_{n_{1}}$ for disjunction-labeled $n$ with children $n_{0},n_{1}$ . The strategy of Player II is now the following. If $n$ is labeled by disjunction, then $(s,n)$ is mapped to some $(s,n_{i})$ where $n_{i}$ is a child of $n$ such that $s\in X_{n_{i}}$ , and if $n$ is labeled by $\overline{x}\subseteq\overline{y}$ , then $(n,s)$ is mapped to some $(s^{\prime},n)$ such that $s(\overline{x})=s^{\prime}(\overline{y})$ and $s^{\prime}\in X_{n}$ . We leave it to the reader to check that this strategy is well-defined and winning.

For the “if” direction, assume Player II has a winning strategy $\pi$. For a node $n$ of $T_{\phi}$, we let $X_{n}$ be the set of all assignments $s\in X$ for which there exists a game where Player II plays according to $\pi$ and position $(s,n)$ is played at some point of the game. Consider any assignment $s$ from $X_{n}$ for a node $n$ labeled by $\overline{x}\subseteq\overline{y}$. This means there is a game where position $(s,n)$ is played, and thus also a game where $(n,s)$ and then $\pi((n,s))=(s^{\prime},n)$ are played. Consequently, an assignment $s^{\prime}\in X_{n}$ exists such that $s(\overline{x})=s^{\prime}(\overline{y})$. By analogous reasoning we obtain that $X_{n}\models\psi$ for all other types of nodes $n$ with label $\psi$, too. Furthermore, $X_{n}=X_{n_{0}}=X_{n_{1}}$ for conjunction-labeled $n$ with children $n_{0},n_{1}$, and $X_{n}=X_{n_{0}}\cup X_{n_{1}}$ for disjunction-labeled $n$ with children $n_{0},n_{1}$. In particular, $X_{r}\models\phi$ and $s\in X_{r}$ for the starting assignment $s$, and hence $s\in\nu(\mathfrak{A},X,\phi)$. ∎

Given that $X$ is finite, it makes sense to consider bounded length restrictions of the safety game. We let $\mathsf{safety}_{k}(\mathfrak{A},X,s,\phi)$ denote the version of $\mathsf{safety}(\mathfrak{A},X,s,\phi)$ in which, starting position $(s,r)$ excluded, positions of the form $(s,n)$, i.e., pairs whose left element is an assignment and right element a node, are played at most $k$ times. Player I wins $\mathsf{safety}_{k}(\mathfrak{A},X,s,\phi)$ if the game terminates before such assignment-node pairs have been played $k$ times. Otherwise, if exactly $k$ such positions are played, Player II wins. The next lemma will be useful later.

Lemma 17.

Let $\phi\in\mathrm{FO}(\subseteq)$ be quantifier-free and such that $T_{\phi}$ has $k$ nodes, and let $s$ be an assignment of a team $X$ that is of size $l$ . Then Player II has a winning strategy for $\mathsf{safety}(\mathfrak{A},X,s,\phi)$ iff she has a winning strategy for $\mathsf{safety}_{k\cdot l}(\mathfrak{A},X,s,\phi)$ .

Proof.

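The “only if” direction is immediate, since a strategy that keeps the unbounded game running forever also survives $k\cdot l$ assignment-node positions. For the converse, suppose Player II has a winning strategy for $\mathsf{safety}_{k\cdot l}(\mathfrak{A},X,s,\phi)$. By the end of $\mathsf{safety}_{k\cdot l}(\mathfrak{A},X,s,\phi)$, positions of the form $(s^{\prime},n)$, the root position included, have occurred $k\cdot l+1$ times. Since there are at most $k\cdot l$ distinct such positions, some position $(s^{\prime},n)$ has occurred twice. Every time such a repetition is encountered, we may assume that we continue the game from the first occurrence of $(s^{\prime},n)$. Since the strategy of Player II is safe for $k\cdot l$ assignment-node moves, we conclude that $\mathsf{safety}(\mathfrak{A},X,s,\phi)$ never terminates. Hence, Player II wins. ∎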

4 Complexity of Maximal Subteam Membership

Next we examine the computational complexity of maximal subteam membership. First, in Section 4.1, we investigate this problem over arbitrary teams; we then restrict attention to inputs in which the inclusion atoms refer to a key. Finally, we discuss the implications of our results for consistent query answering.

4.1 Arbitrary Teams

We give a complete characterization of the maximal subteam problem for arbitrary conjunctions of unary inclusion atoms. A unary inclusion atom is an atom of the form $x\subseteq y$ where $x$ and $y$ are single variables. The characterization is given in terms of inclusion graphs.

Definition 18.

Let $\Sigma$ be a set of unary inclusion atoms over variables in $V$ . Then the inclusion graph of $\Sigma$ is defined as $G_{\Sigma}=(V,E)$ such that $(x,y)\in E$ iff $x\neq y$ and $x\subseteq y$ appears in $\Sigma$ .

We will now prove the following theorem.

Theorem 19.

Let $\Sigma$ be a finite set of unary inclusion atoms, and let $\phi$ be the conjunction of all atoms in $\Sigma$ . Then $\mathsf{MSM}(\phi)$ is

(a) trivially true if $G_{\Sigma}$ has no edges,

(b) $\mathbf{NL}$-complete if $G_{\Sigma}$ has an edge $(x,y)$ and no other edges except possibly for its inverse $(y,x)$,

(c) $\mathbf{P}$-complete otherwise.
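The trichotomy of Theorem 19 depends only on the shape of the inclusion graph, so it can be read off mechanically from $\Sigma$. The short Python sketch below illustrates this. Representing a unary atom $x\subseteq y$ as the pair `(x, y)` and the function names `inclusion_graph` and `msm_complexity` are assumptions made for the example.

```python
def inclusion_graph(atoms):
    """Edge set of G_Sigma: one edge (x, y) per atom x ⊆ y with x != y."""
    return {(x, y) for (x, y) in atoms if x != y}


def msm_complexity(atoms):
    """Classify MSM(phi) for the conjunction phi of the given unary atoms,
    following cases (a)-(c) of Theorem 19."""
    edges = inclusion_graph(atoms)
    if not edges:
        return "trivial"                 # case (a): only atoms of the form x ⊆ x
    x, y = next(iter(edges))
    if edges <= {(x, y), (y, x)}:
        return "NL-complete"             # case (b): one edge, possibly with its inverse
    return "P-complete"                  # case (c): everything else
```

For example, `msm_complexity([("x", "y"), ("y", "x")])` falls under case (b), while `msm_complexity([("x", "y"), ("u", "v")])` and `msm_complexity([("x", "y"), ("y", "z")])` both fall under case (c).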

The first statement above follows from the observation that $\mathsf{MSM}(\phi)$ is true for all inputs if $\phi$ is a conjunction of trivial inclusion atoms $x\subseteq x$. The second statement is shown by relating the problem to graph reachability. Given a directed graph $G=(V,E)$ and two vertices $a$ and $b$, the problem REACH is to determine whether $G$ contains a path from $a$ to $b$. This problem is a well-known complete problem for $\mathbf{NL}$, and it will also be applied later, where the complexity of $\mathsf{MSM}(x\subseteq y\wedge u\subseteq v)$ over teams with keys $y$ and $v$ is examined.

Lemma 20.

$\mathsf{MSM}(x\subseteq y)$ and $\mathsf{MSM}(x\subseteq y\wedge y\subseteq x)$ are $\mathbf{NL}$-complete.

Proof.

Hardness. We give a logarithmic space many-one reduction from REACH. Let $G=(V,E)$ be a graph, and let $a,b\in V$ . W.l.o.g. we can assume $G$ has no cycles. Define $E^{\prime}$ as the extension of $E$ with an extra edge $(b,a)$ . Then $b$ is reachable from $a$ in $G$ if and only if $a$ belongs to a cycle in $G^{\prime}=(V,E^{\prime})$ . We reduce from $(G,a,b)$ to a team $X=\{s_{d,c}\mid(c,d)\in E^{\prime}\}$ where $s_{u,v}$ maps $(x,y)$ to $(u,v)$ (see Fig. 4.1). By Lemma 14, $b$ is reachable from $a$ if and only if $s_{a,b}\in\nu(X,\phi)$ , where $\phi$ is either $x\subseteq y$ or $x\subseteq y\wedge y\subseteq x$ .

Membership. By Lemma 14 $\mathsf{MSM}(x\subseteq y)$ and $\mathsf{MSM}(x\subseteq y\wedge y\subseteq x)$ reduce to reachability variants that are clearly in $\mathbf{NL}$ . ∎
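The hardness reduction can be tried out on small inputs. The Python sketch below builds the team $X$ from an acyclic graph and computes the maximal subteam for $x\subseteq y$ by repeatedly deleting assignments whose $x$-value is not the $y$-value of any remaining assignment. This is an illustration only: the function names are hypothetical, assignments are encoded as pairs `(value of x, value of y)`, and the deletion loop is a polynomial-time procedure chosen for readability, not the logarithmic-space algorithm behind the membership claim.

```python
def reach_to_team(edges, a, b):
    """Team X = { s_{d,c} | (c, d) in E' } with E' = E ∪ {(b, a)}.
    The assignment s_{u,v} is encoded as the pair (u, v) = (s(x), s(y))."""
    extended = set(edges) | {(b, a)}
    return {(d, c) for (c, d) in extended}


def maximal_subteam(team):
    """Largest subteam satisfying x ⊆ y, computed by iterated deletion."""
    current = set(team)
    while True:
        y_values = {y for (_, y) in current}
        kept = {(x, y) for (x, y) in current if x in y_values}
        if kept == current:
            return current
        current = kept


def reachable_via_msm(edges, a, b):
    """Decide REACH on an acyclic graph by testing whether s_{a,b} survives."""
    return (a, b) in maximal_subteam(reach_to_team(edges, a, b))
```

On the path graph with edges `{(1, 2), (2, 3), (3, 4)}`, the call `reachable_via_msm(edges, 1, 3)` returns `True` and `reachable_via_msm(edges, 3, 1)` returns `False`, matching reachability in the graph. The sketch relies on the assumption from the proof that the input graph has no cycles.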

