Data Complexity and Rewritability of Ontology-Mediated Queries in Metric Temporal Logic under the Event-Based Semantics (Full Version)
Vladislav Ryzhikov, Przemyslaw Andrzej Walega, Michael Zakharyaschev

TL;DR
This paper explores the data complexity of answering ontology-mediated queries in metric temporal logic under event-based semantics, providing complexity classifications, rewritings, and lower bounds.
Contribution
It introduces complexity classifications, rewritings to first-order logic, and establishes lower bounds for ontology-mediated query answering in metric temporal logic.
Findings
Queries can be answered in AC0, NC1, L, NL, P, and coNP depending on the class.
Rewritings to first-order logic and extensions are provided.
Lower bounds for data complexity are established.
Abstract
We investigate the data complexity of answering queries mediated by metric temporal logic ontologies under the event-based semantics assuming that data instances are finite timed words timestamped with binary fractions. We identify classes of ontology-mediated queries answering which can be done in AC0, NC1, L, NL, P, and coNP for data complexity, provide their rewritings to first-order logic and its extensions with primitive recursion, transitive closure or datalog, and establish lower complexity bounds.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSemantic Web and Ontologies · Advanced Database Systems and Queries · Logic, Reasoning, and Knowledge
Data Complexity and Rewritability of Ontology-Mediated Queries in Metric Temporal Logic under the Event-Based Semantics
Vladislav Ryzhikov
Birkbeck, University of London, UK
&Przemyslaw Andrzej Walega
University of Oxford, UK and University of Warsaw, Poland
&Michael Zakharyaschev
Birkbeck, University of London, UK
Abstract
We investigate the data complexity of answering queries mediated by ontologies given in metric temporal logic MTL under the event-based semantics assuming that data instances are finite timed words with binary fractions as timestamps. We identify classes of ontology-mediated queries answering which can be done in , , L, NL, P, and coNP for data complexity, provide their rewritings to first-order logic and its extensions with primitive recursion, transitive closure or datalog, and establish lower complexity bounds.
1 Introduction
In this paper, we are concerned with the following problem: given a formula of metric temporal logic MTL and an atomic proposition , is it possible to construct a query in some standard query language such that, for any data instance of atoms timestamped with binary fractions and any timestamp from , we have iff is true in ?
MTL was originally designed for modelling and reasoning about real-time systems [24, 2]; for a survey see [12]. Recently, combinations of MTL with description logics have been suggested as temporal ontology languages [22, 7]. Datalog with MTL-operators was used by [13, 27] for practical ontology-based access to temporal log data aiming to facilitate detection and monitoring complex events in asynchronous systems based on sensor measurements. For example, a Siemens turbine has a coast down if the rotor speed was below 1500 in the previous 30 seconds, while no more than 2 minutes before that the speed was above 6600 for 30 seconds. The event ‘coast down’ can be encoded by the following MTL-formula, where () is true at a timestamp if holds at some (respectively, all) with :
[TABLE]
To find when a coast down occurred, a Siemens engineer can now simply execute the query mediated by an MTL-ontology with formulas such as the one above, whose atoms are related to sensor data by appropriate mappings. Answering datalogMTL queries in the streaming setting was considered by [32].
The underpinning idea of classical ontology-based data access (OBDA) [15, 33] is a reduction of ontology-mediated query (OMQ) answering to standard database query evaluation. As known from descriptive complexity [23], the existence of such reductions, or rewritings, is closely related to the data complexity of OMQ answering, which is by now well understood for atemporal OMQs both uniformly (for all OMQs in a given language) and non-uniformly (for individual OMQs) [19, 10, 11, 26].
Temporal ontology and query languages have attracted attention of datalog and description logic communities since the 1990s; see [8, 17, 25, 5] for surveys. In recent years, the proliferation of temporal data from various sources and its importance for analysing the behaviour of complex systems and decision making in all economic sectors have intensified research into formalisms that can be used for querying temporal databases and streaming data [31, 9, 30]. OBDA with atemporal ontologies and query languages with linear temporal logic LTL operators has been in use since [6, 29]. Rewritability and data complexity of OMQs in the description logics DL-Lite and extended with LTL operators were considered in [4, 21].
Here, we investigate the (uniform) rewritability and data complexity problems for basic OMQs given in metric temporal logic MTL, assuming that data instances are finite sets of atoms timestamped by dyadic rationals and that MTL is interpreted under the event-based semantics where atoms refer to events (state changes) rather than to states themselves [28]. MTL is more succinct, expressive, and versatile compared to LTL, being able to model both synchronous (discrete) and asynchronous (real-time) settings.
First, we observe that answering arbitrary MTL-OMQs is coNP-complete for data complexity (in contrast to -completeness for LTL-OMQs). OMQs in the Horn fragment hornMTL are P-complete and rewritable to datalog(FO), which extends datalog with FO-formulas built from EDB predicates; in fact, we establish P-hardness already for the fragment of hornMTL with binary rules (like in OWL 2 QL) and box operators only. OMQs in coreMTL turn out to be FO(TC)-rewritable (FO with transitive closure) and NL-hard. We then classify MTL-OMQs by the type of ranges constraining their temporal operators ϱ and : infinite and , punctual , and arbitrary non-punctual . We show that OMQs of the first type are FO-rewritable and can be answered in . OMQs of the second type are FO(RPR)-rewritable (FO with relational primitive recursion) and -complete. For the third type, we obtain an NL upper bound with rewritability to FO(TC) and lower bound; for hornMTL-OMQs of this type, the results are improved to L with rewritability to FO(DTC) (FO with deterministic closure).
2 MTL Ontology-Mediated Queries
In the context of event monitoring, we consider a ‘past’ variant of MTL, which is a propositional modal logic with constrained operators ϱ ‘sometime in the past within range ’ and ‘always in the past within range ,’ interpreted over finite timed words under the event-based semantics. We assume that timestamps in timed words are given as non-negative dyadic rational numbers (finite binary fractions), the set of which is denoted by . The ranges in ϱ and are non-empty intervals with end-points in .
An MTL-program, , is a finite set of rules of the form
[TABLE]
where each takes the form , , or , for an atomic proposition . We denote the empty by (truth) and empty by (falsehood). Using fresh atoms, every MTL-formula can be transformed to an equivalent (in the sense of giving the same answers to queries) MTL-program.
An MTL-program is called a hornMTL-program if, in all of its rules (1), and is an atom. As usual, is called the head of the rule and its body. A hornMTL-program is a coreMTL-program if . An MTL- (hornMTL- or coreMTL-) ontology-mediated query (OMQ) takes the form , where is an MTL- (resp., hornMTL- or coreMTL-) program and an atom.
Intuitively, a data instance, , can be thought of as a word with timestamps , , where each is the set of atoms that are true at . Formally, we represent as the FO-structure
[TABLE]
with domain ordered by , timestamps , , and subsets . The ternary predicates and are such that, for any and , there are unique with and . These predicates give the value of every timestamp : iff and hold for all . We assume that if . For any , we can define an FO-formula that holds in iff and , its variants , , etc.; see Appendix A for details. Using these, we can further define FO-formulas for , for ‘ is an immediate successor of in ’, and FO-expressible constants and .
An event-based interpretation over is a structure
[TABLE]
where the Boolean connectives are interpreted as usual and
[TABLE]
An interpretation over is a model of an MTL-program and if, for any rule (1) in and any , whenever for all , , then for some , . We call and consistent if there is a model of and .
Henceforth, we write for the set of timestamps in (2) and often informally identify with its value . We call (and so ) a certain answer to over if for every model of and . The OMQ answering problem for is to decide, given and , whether is a certain answer to over . To illustrate, consider , and . Then is a certain answer to over , but there are no certain answers to over :
B\ {\color[rgb]{.5,.5,.5}\definecolor[named]{pgfstrokecolor}{rgb}{.5,.5,.5}\pgfsys@color@gray@stroke{.5}\pgfsys@color@gray@fill{.5}B^{\prime}}[math]B\ {\color[rgb]{.5,.5,.5}\definecolor[named]{pgfstrokecolor}{rgb}{.5,.5,.5}\pgfsys@color@gray@stroke{.5}\pgfsys@color@gray@fill{.5}B^{\prime}}$$\frac{1}{2}$$C\ {\color[rgb]{.5,.5,.5}\definecolor[named]{pgfstrokecolor}{rgb}{.5,.5,.5}\pgfsys@color@gray@stroke{.5}\pgfsys@color@gray@fill{.5}A}$$\frac{3}{2}
We are interested in the data complexity of OMQ answering, that is, regard as the only input to the problem and assume to be fixed.
Let be a query language over FO-structures (2). An OMQ is said to be -rewritable if there is an -query , called an -rewriting of , such that, for any data instance , a timestamp is a certain answer to over iff . Our target query languages include:
- –
FO and its extension FO with the predicate PLUS (e.g., says that is odd); evaluating such queries is in for data complexity;
- –
FO(RPR), i.e., FO with relational primitive recursion, which is in [18];
- –
FO(TC) and FO(DTC), i.e., FO with transitive and deterministic transitive closure, which are in NL and L, respectively [23];
- –
datalog(FO), i.e., datalog queries with additional FO-formulas built from EDB predicates in their rule bodies, which are in P [20].
All of them save datalog(FO) can be implemented in SQL. -rewritability of an OMQ means that answering is in the same data-complexity class as evaluation of -queries.
Given a hornMTL-program and a data instance , we define a set of pairs of the form that contains all answers to OMQs with over . We start by setting and denote by the result of applying exhaustively and non-recursively the following rules to :
- –
if is in and , for all , , then we add to ;
- –
if occurs in , , and holds for some , then we add to ;
- –
if occurs in , and for all with , then we add to .
It should be clear that there is some polynomially depending on and such that . We then set . We can regard as a (minimal) model of and with domain in which iff The proof of the following is standard:
Theorem 1**.**
For a hornMTL-OMQ , is inconsistent with iff ; a timestamp is a certain answer to a hornMTL-OMQ over iff either or is inconsistent with .
Note in passing that, as a consequence, we obtain the following reduction of -rewritability of more general hornMTL-OMQs with positive FO-queries (built from atoms, , , , and ) to -rewritability of atomic OMQs we deal with in this paper:
Corollary 2**.**
Let be a hornMTL-OMQ with a positive FO-query . If has an -rewriting , for every atom , then is an -rewriting of , where is an atom not occurring in and any data instance and is the result of replacing every atom of the form in with .
Proof.
Observe first that, since is positive, we have, for any consistent and and any , that iff for all models of and . Indeed, to show , we use the fact that there is a homomorphism from onto (as is a minimal model of and ), and positive formulas are preserved under homomorphic images [16]. On the other hand, one can show by induction on the construction of that iff . It remains to observe that and are inconsistent iff . ∎
3 OMQs with Arbitrary Ranges
We begin by establishing (non-)rewritability and data complexity of answering OMQs in various classes where arbitrary ranges in temporal operators are allowed. We denote by (coreMTL) the restriction of coreMTL to the language with operators (respectively, ϱ) only.
Theorem 3**.**
* Answering MTL-OMQs is coNP-complete for data complexity; hornMTL-OMQs are datalog(FO)-rewritable, with -OMQs being P-hard; coreMTL-OMQs are FO(TC)-rewritable and NL-hard.*
Proof.
The membership in coNP is trivial. We establish coNP-hardness by reduction of NP-complete circuit satisfiability [3]. Let be a Boolean circuit with -many (two-input) AND, OR and (one-input) NOT gates enumerated by consecutive numbers starting from 0 so that if there is an edge from to , then . Take the minimal and a data instance with the facts
- –
, if is a gate and ;
- –
, if is an input gate;
- –
, if is a NOT gate;
- –
, if is an OR gate;
- –
, if is an AND gate;
- –
, if is a NOT gate with input gate ;
- –
and , if is an OR or AND gate with input gates and .
Let be an MTL-program with the following rules:
[TABLE]
Then is satisfiable iff the maximal number in is not a certain answer to over . An example of and an initial part of a model of , is shown below:
23410A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$\frac{0}{8}$$\frac{1}{8}$$\frac{2}{8}$$\frac{3}{8}$$\frac{4}{8}$$\frac{16}{8}$$\frac{17}{8}$$\frac{18}{8}$$\frac{19}{8}$$\frac{20}{8}$$\frac{32}{8}$$\frac{33}{8}$$\frac{34}{8}$$\frac{35}{8}$$\frac{36}{8}$$\frac{48}{8}$$\frac{49}{8}$$\frac{50}{8}$$\frac{51}{8}$$\frac{52}{8}$$X$$X$$I_{1}$$I_{2}$$D$$I_{1}$$I_{2}$$C$$T$$T$$F$$T$$F$$T$$T$$F$$T$$F
We construct a datalog(FO) rewriting of a hornMTL-OMQ . To begin with, we add to the rule for each in . The other rules in are obtained from the rules in by the following transformations. We replace every atom not under the scope of a temporal operator with and every with
[TABLE]
and similarly for other types of ranges in . Intuitively, iff , for each from . We replace every in the body of a rule with
[TABLE]
and similarly for other types of ranges. Finally, we add the following rules to the resulting program:
[TABLE]
Note that the obtained datalog program contains FO-definable EDB predicates such as and in rule bodies. Clearly, is a certain answer to over any given data instance iff is an answer to over .
We show P hardness of -OMQs by reduction of path system accessibility (PSA). Let be a hypergraph with vertices enumerated by consecutive natural numbers starting from 0 so that if is a hyperedge, then . Let be the lexicographical order of hyperedges. Suppose the problem is to check whether a vertex is accessible from a set of vertices , i.e., whether or there are vertices accessible from and is a hyperedge. Let comprise the atoms , for and a vertex , together with
- –
, , , and , for a hyperedge ;
- –
, for and .
For example, for the vertices , hyperedge , , and , looks as follows:
A$$A$$A$$A$$A$$A$$A$$A$$A$$\frac{5}{4}$$A$$A$$A$$\frac{0}{4}$$\frac{1}{4}$$\frac{2}{4}$$\frac{3}{4}$$\frac{8}{4}$$\frac{9}{4}$$\frac{10}{4}$$\frac{16}{4}$$\frac{17}{4}$$\frac{18}{4}$$\frac{19}{4}$$R$$R$$R$$R$$R^{\prime}$$R^{\prime}$$R^{\prime\prime}$$Rby :hyperedge
Let be a program with the rules:
[TABLE]
Then is a certain answer to over iff is accessible from in .
The upper bound can be shown by reduction to FO(TC) via linear datalog(FO). Without loss of generality, we assume that, in the disjointness constraints occurring in the given coreMTL-OMQ , the are atomic. First, we straightforwardly translate with the disjointness constraints removed from to linear datalog(FO). Then, we transform the result into an FO(TC)-query [20]. Now, for every disjointness constraint in , we take the sentence and, finally, form a disjunction of with those sentences, which is obviously an FO(TC)-rewriting of .
We prove NL-hardness by reduction of the reachability problem in acyclic digraphs. Let be such a digraph with vertices enumerated by consecutive natural numbers starting from 0 so that, if there is an edge from to , then . Let be the lexicographical order of edges. Take the minimal for . Suppose we want to check whether a vertex is accessible from . Let consist of the atoms , for and a vertex ; , , for every edge ; , for . An example of and an initial part of is shown below:
••••3=t$$A$$A$$A$$A$$A$$A$$A$$A$$A$$A$$\frac{0}{4}$$\frac{1}{4}$$\frac{2}{4}$$\frac{3}{4}$$\frac{8}{4}$$\frac{10}{4}$$\frac{16}{4}$$\frac{17}{4}$$\frac{18}{4}$$\frac{19}{4}$$R$$R$$R^{\prime}$$R^{\prime\prime}$$R$$G:by :edge
Let be a coreMTL program with the following rules:
[TABLE]
Then is a certain answer to over iff is reachable from in . ∎
To obtain finer complexity results, we classify MTL-OMQs by the type of ranges in their operators ϱ and : infinite, punctual, and non-punctual. Let be one of or , and let be one of or .
4 OMQs with Ranges
First, consider OMQs with ⟨r,∞) and , which resemble LTL-operators ‘sometime’ and ‘always in the past’. Using partially-ordered automata, it was shown in [4] that LTL-OMQs with these operators are FO-rewritable. Although such automata are not applicable now, we establish the same complexity by characterising the structure of models. In the constructions below, it will be convenient to regard as an abbreviation for with Boolean negation and only consider, without loss of generality, OMQs with occurring in .
Theorem 4**.**
MTL*-OMQs with temporal operators of the form ⟨r,∞) and only are -rewritable.*
Proof.
Let be an MTL-OMQ as specified above. A simple literal, , for takes the form or , where is an atom in ; a temporal literal, , for is of the form or provided that or occurs in and is the atom in . Let and be the sets of simple and temporal literals for , respectively. A type for is any maximal set consistent with . The number of different types is .
Given a model of and some with , denote by the type of in . As the ranges in are of the form , the model has the following monotonicity property:
- –
implies for all in ;
- –
implies for all in .
We call in an osteo-type if there is such that , for all . Thus, if in , there is an osteo-type with . All osteo-types in are pairwise distinct, so the number of them does not exceed . Non-osteo-types are called fluff-types. By monotonicity, any fluff-type has the same temporal literals as its nearest osteo-type , for . For example, in the model of the program , , shown below, there are three fluff-types: , , and .
[math]\neg{}_{\varrho}P$$\neg{}_{\varrho}\neg P$$\neg P$$\frac{1}{2}$$\neg{}_{\varrho}P$$\neg{}_{\varrho}\neg P$$P$$\frac{3}{4}$$\neg{}_{\varrho}P$$\neg{}_{\varrho}\neg P$$\neg P$$1$$\neg{}_{\varrho}P$${}_{\varrho}\neg P$$\neg P$$\frac{9}{8}$$\neg{}_{\varrho}P$${}_{\varrho}\neg P$$P$$\frac{5}{4}$$\neg{}_{\varrho}P$${}_{\varrho}\neg P$$P$$\frac{3}{2}$${}_{\varrho}P$${}_{\varrho}\neg P$$\neg Pfluff-types fluff-type
We now define an FO-sentence such that any given data instance is consistent with iff holds in the FO-structure . Let be the set of sequences , , of distinct types for that satisfy the monotonicity property and such that implies for some ; for minimal such , we write . We write if , and , for some . Denote by the set of types for sharing the same temporal literals with and such that, for every , there is with . Finally, for any type , let (which is true at in iff, for every in , whenever then ). Now, we set
[TABLE]
where says that is the nearest predecessor of , which is different from .
Suppose is a model of , and , for , are all the osteo-types in . This -tuple of types is in and the are true in by definition. The also hold for because is the first type in witnessing the relevant . Similarly, does not hold in for . Finally, let be any timestamp in with . By construction, is a fluff-type in and holds in . If , we have and , and so cannot hold in . Thus, .
Conversely, suppose holds in , assigning timestamps to the and associating types with every . Define an interpretation by setting
[TABLE]
for every atom . We prove that is a model of and . As all the are types for , it suffices to show that
[TABLE]
Suppose . If , for some , then , for some , and so and holds in . If , then , and we can use the previous argument as implies .
Conversely, suppose . Then . Consider first the case . Suppose with . Then for some , and so and , whence and . Now, let . Then for some (because of ). Suppose with . If , then for some , and so and , whence . If , then, by the last conjunct of , we have . Finally, if , then , for some , and we are done again.
An FO-rewriting of is the FO formula , where is obtained from by replacing with , which is if and otherwise. Clearly, holds in iff there is a model of and satisfying in . ∎
We also mention in passing one more FO-rewritability result (which does not fit our classification). To formulate it, we require a few definitions.
Normal form. Until the end of this section, we assume that our MTL programs and OMQs are in normal form. Namely, a program is said to be in normal form if its rules have one of the forms:
[TABLE]
where the are from the data alphabet (like EDB predicates in datalog) and do not occur in the rule heads, while the do not occur in data instances, and for any (although there may be ). Every hornMTL program can be transformed to a program in normal form with the same answers. We illustrate this claim by an example.
Example 5**.**
Let , where the are in the data alphaet. By introducing fresh atoms , , we convert to
[TABLE]
To get rid of , we further transform the program to
[TABLE]
Now, in the first rule is not in the scope of ϱ ( can be regarded as a shorthand for ). So we transform the rule using obvious derivations to obtain the following program in normal form:
[TABLE]
A hornMTL query is in normal form if is in normal form and is not in the data alphabet. Clearly, every query can be converted to a one in normal form and having the same answers.
Metric automata for hornMTL. Our technical tool for studying the data complexity of linear hornMTL queries is automata with metric constraints that are defined for programs in normal form. These automata can be viewed as a primitive version of standard timed automata for MTL [1] as we only have one clock , the clock reset happens at every transition, and the clock constraints are of the simple form .
A (nondeterministic) metric automaton is a quadruple , where is a set of states, a tape alphabet, a transition relation, and is a nonempty set of pairs of the form , where , . The transition relation is a set of instructions of the form with , and a range . The automaton takes as input timed words , where the are timestamps with . A run over is a sequence such that , is in and , for .
Let be a linear hornMTL program in normal form. We denote the conjunctions (with data atoms ) that occur in by , possibly with subscripts. Thus, since is linear, rules (4) in are of the form . Let be the set of all such occurring in . We define a metric automaton for as follows. The set of its states comprises the head concept names in , and . The transition relation comprises such that is in and . Finally, is the set of all pairs such that a rule of the form (3) is in .
Example 6**.**
For , the metric automaton is depicted below, where , , , , and .
P_{0}$$P_{1}$$E_{0}\ (1,2)$$E_{2}\ (1,2)$$\emptyset\ (1,3)$$E_{0}\ (1,3)$$E_{1}\ (1,3)$$E_{2}\ (1,3)
We represent any data instance as a timed word . For occurring in , let be the maximal set of from that hold at in , and let \sigma_{\mathcal{D}}=\big{(}(E(t_{1}),t_{1}),\dots,(E(t_{n}),t_{n})\big{)}.
Example 7**.**
A data instance and its representation as are shown below:
[math]Q^{\prime}$$1$$P_{1}^{\prime}$$1.5$$P_{0}^{\prime}$$4$$P_{1}^{\prime}$$4.5$$P_{1}^{\prime}$$5$$Q^{\prime}$$6.5$$\mathcal{D}:$$\emptyset$$E_{1}$$E_{0}$$\emptyset$$E_{2}$$E_{2}$$\emptyset$$\sigma_{\mathcal{D}}:
Theorem 8**.**
For any linear hornMTL OMQ , a timestamp is a certain answer over a data instance iff there exist a subword of with the last timestamp and a run of over that ends with .
Example 9**.**
Let be an OMQ with from Example 6. Then, for from Example 7, we have the run on
[TABLE]
and so is a certain answer to the query over from Example 7.
One could define metric automata as classical timed automata; however, Theorem 8 does not use them in the standard way as it requires runs on subwords. Whether and how such runs can be captured by timed automata remains to be clarified. We now use the obtained automaton characterisation of certain answers for linear queries to give better complexity bounds for the case of restricted temporal ranges than the NL bound of Theorem 3 .
Call an MTL-program range-uniform if all of its temporal operators have the same constraining range.
Theorem 10**.**
Range-uniform coreMTL-OMQs with ranges of the form ⟨0,r⟩ are FO-rewritable.
Proof.
We illustrate the proof by a concrete example. Consider the OMQ with
[TABLE]
For such a the automaton is shown in the picture below on the right-hand side. Using it, we construct the following FO-rewriting of :
[TABLE]
where
- –
;
- –
;
- –
;
- –
;
- –
;
- –
, for .
Intuitively, to derive at , we need a point with in the data and a sequence of points between and without gaps of length . An example of such a data instance is given below.
Note how we maintain the ‘stack of states’ with the elements at its bottom alternating in a cycle between , and . Note also that the states go in decreasing order when we scan the stack from bottom to top. So we use the formulas to express that is inferred at on level of the stack. The formula says that the height of the stack increases by because of a cluster of points within the segment of size ending with . The formulas and express two ways of increasing the height of the stack from 1 to 3. It is to be emphasised that properties of and such as can be expressed by FO-formulas using the predicate or , which gives a binary representation of every object in the domain of an FO-structure [23], whereas FO with only is not enough. For example, is expressed by the formula
[TABLE]
We leave further details to the reader. ∎
5 OMQs with Punctual Ranges
Operators of the form [r,r] resemble the LTL previous time operator . To illustrate an essential difference, consider the program and the data instance below. In LTL, we always derive at
[math]P$$\frac{1}{4}$$P$$\frac{3}{4}$$P$$\frac{7}{8}$$P\ Q$$\frac{7}{4}$$Q$$\frac{15}{8}$$3$${\color[rgb]{.5,.5,.5}\definecolor[named]{pgfstrokecolor}{rgb}{.5,.5,.5}\pgfsys@color@gray@stroke{.5}\pgfsys@color@gray@fill{.5}P}\ Q$$\frac{13}{4}
if holds at . In our example, at implies at , which together with at imply at , and eventually the latter with at implies at ; independently, at implies at .
Theorem 11**.**
MTL*-OMQs with temporal operators of the form [r,r] and only are FO(RPR)-rewritable; answering such OMQs is -complete for data complexity.*
Proof.
-hardness is proved by reduction of hornMTL-OMQs with rules of the form , answering which is NC1-complete [4].
To show FO(RPR)-rewritability of a given OMQ , we assume w.l.o.g. that does not contain ranges . Let be the set of numbers occurring as endpoints of ranges in . We set , , for , . Thus, in our example above, , , . We define to be the set of simple and temporal literals with atoms from and operators i such that and n occurs in . By a type for we now mean any maximal subset of consistent with . For types , and , we write if
- –
iff , for any ;
- –
iff , for , .
We say that is a run from to on a data instance of the form (2) if , for , and
- –
;
- –
for all ;
- –
and if then , for any ;
- –
and .
Call initial if , for all . The next lemma follows directly from the given definitions:
Lemma 12**.**
* is consistent iff, for every , there exists a run on from some initial to ; A timestamp is not a certain answer to over iff is consistent and there is a run from initial to on and .*
We first show how to express the existence of a run from to specified in by an FO(RPR)-formula over . First, as divisibility of binary integers by a given number is recognisable by a finite automaton, we can define an FO(RPR)-formula that is true iff , for some (see Appendix B). We also have an FO-formula saying that is minimal among with , for some . Let be the set of all types for , and let comprise with , for all . We define as the FO(RPR)-formula
[TABLE]
where , for , is a relation variable and the formula is a disjunction of the three formulas below if and a disjunction of the last two of them if :
[TABLE]
where is the immediate predecessor of in .
To illustrate, in the context of the example above, the formulas say that holds for the types
[TABLE]
Then holds for
[TABLE]
for the same as , for , and so on.
Thus, we obtain the following FO(RPR)-rewriting of
[TABLE]
where checks the consistency condition of Lemma 12 and can be constructed similarly to . ∎
6 OMQs with Non-Punctual Ranges
Unlike the proof of Theorem 11, where the derived facts at were determined by the data at and the derived facts at the nearest with , for non-punctual ranges the derived facts at depend on an unbounded number of timestamps . In the proof of Theorem 13 below, we show that to construct derivations in this case, we can actually keep track of a fixed number (depending only on the given OMQ) of moments where each was derived.
Theorem 13**.**
* MTL-OMQs whose operators ϱ and have non-punctual are FO(TC)-rewritable; answering them is in NL and -hard; hornMTL-OMQs of this kind are FO(DTC)-rewritable; answering them is in L and -hard.*
Proof.
In both cases, -hardness can be established as in the proof of Theorem 11 by encoding with (0,1].
Let be the given OMQ. For with , let and ; if , and are undefined. Let be the set of all with in , for some . For , let () be the intersection (union) of the defined () with in ; if there are no such , and are undefined. To illustrate, consider the hornMTL-program with the rules
[TABLE]
Then , , and , are undefined.
For a data instance , a trace of length for is a sequence of intervals where either (meaning that this interval is undefined) or , , and , assuming that , for any . Thus, for the data instance below,
P$$\frac{1}{2}$$P$$\frac{5}{4}$$Q$$\frac{5}{2}$$\frac{15}{4}$$5$$\frac{25}{4}$$10
is a trace for . Intuitively, such a trace stores the most recent intervals preceding where a simple literal holds at some point, with storing the very first point where the literal holds. A tuple is an extended type for if
- –
is a type for (as in the proof of Theorem 4);
- –
is a trace for of length , where and denote the end-points of these intervals; if one of the intervals is undefined, ;
- –
iff , for some in ,
where is true iff and . In our example, , , and the following triples are extended types for :
[TABLE]
Intuitively, an extended type records the simple and temporal literals that hold at (the type ) and also some history of the validity of (the traces) justifying the presence of in . As follows from Lemma 14 below, to make correct derivations, this history should keep intervals. Note that this bound does not apply if punctual intervals are present in , which explains the increase of complexity in Theorem 3.
Lemma 14**.**
Let be all the timestamps in . Then and are consistent iff there exists a sequence of extended types for , , satisfying the following conditions for :**
- –
;
- –
*if , all in are ; if , then and for ; *
- –
if and , then ; if , and when and otherwise, then if , else .
Proof.
For a model of and , we define with as follows. If there is a minimal with , we set in ; otherwise . Consider a maximal interval , , such that and, for any , there is with and . Suppose there are such intervals. Let . We define by making its last intervals equal to , making its [math]-th interval equal to , and making all the remaining intervals equal to . One can check that are as required.
() Given a sequence , , we construct an interpretation by making true at in iff for each simple literal from . The conditions on these extended types ensure that is a model of and . ∎
We use the characterisation of Lemma 14 to construct an FO(TC)-sentence that is true in iff and are consistent, for any data instance . contains tuples of variables , for , where and for intervals in traces ; is the same as but with primed variables:
[TABLE]
Here, is an FO-formula saying that holds in the first timestamp () of and represents for all by encoding as the empty interval . The formula under the transitive closure TC says that there is an extended type for with the trace given by , that is the immediate successor of in , and there is an extended type for whose trace is given by . We define it as
[TABLE]
with saying that if is an extended type for with given by , then can be the next extended type with given by :
[TABLE]
The formula defines an extended type for in :
[TABLE]
Finally, is if there is and otherwise it is
[TABLE]
saying that the intervals in the initial extended type are set correctly. That is as required follows from Lemma 14.
One can now modify to obtain an FO(TC)-rewriting of . As before, to obtain a rewriting , we need a formula that holds true on iff there exists a model of such that is true at in this model. We define as a conjunction of and
[TABLE]
The negation of is the required rewriting .
It will be convenient to assume a restricted version of our normal form (1), where ϱ operators do not occur with . Every hornMTL program can be converted to this form by replacing by and then expressing, e.g., by a pair of rules . (For each we substitute it by .) Let be a trace for and be the set from such that , for some in . Let be a set of from . We call a type for minimal for with respect to and if every from is in iff is in the closure of and under rules (1). We say that a type is minimal initial with respect to if it is minimal for some (every) with respect to an empty trace with all in are for all .
Lemma 15**.**
Let be the timestamps in . Then and are consistent iff there exists a sequence of extended types for , , satisfying the conditions of Lemma 14 and such that: is minimal initial w.r.t. , is the minimal for w.r.t. and .
Note that each type in the lemma above is uniquely determined by and so is the trace . Then type is uniquely determined by and and so is , etc. Therefore, we can replace in the formula by such that for given values of and , there are unique values of and for which holds. ∎
7 Conclusion
In this paper, we made a first step towards understanding the data complexity of answering queries mediated by ontologies with MTL operators and their rewritability into standard database query languages. By imposing natural restrictions on the ranges constraining the operators ϱ and , and by distinguishing between arbitrary, Horn and core ontologies, we identified classes of MTL-OMQs that are rewritable to FO, FO, FO(RPR), FO(DTC), FO(TC), and datalog(FO). Unrestricted MTL-OMQs were shown to be coNP-hard. The rewritability results look encouraging, though much remains to be done to make our rewritings practical, especially in the presence of more expressive atemporal (description logic or datalog) ontologies.
We can extend our language with constrained operators since . In this case, hornMTL remains P-complete (but coreMTL becomes P-hard) and Theorem 13 holds, too. We believe that our hornMTL can also be extended with in the rule heads (cf. [14]): Theorems 3 and 13 also hold in this case, but so far we have not managed to prove Theorem 13 for such rules. Extending MTL with future-time operators is also interesting, in which case Theorems 3 and 4 remain to hold. Finally, we are looking into MTL-OMQs under the continuous (state-based) semantics, where the techniques developed above do not apply directly.
Acknowledgements This work was supported by EPSRC UK grant EP/S032282, NCN Poland grant 2016/23/N/HS1/ 02168, and the Foundation for Polish Science (FNP). We would like to thank Stanislav Kikot for his help with the proof of Theorem 5.
Appendix A and Related Formulas
We show that for every we can define an FO-formula that holds in iff and . Let and be such that . Then, we define:
[TABLE]
where states that is the -th bit of the integer part of , and states that is the -th bit of the fractional part of . We define and inductively by means of the following FO()-formulas, where is the last (maximal) element of the domain of , , and (which can can be easily defined using ):
[TABLE]
[TABLE]
The formulas for and are defined by modifications of . Using these, we can further define FO-formulas and .
Appendix B Divisibility and
We will show how to define an FO(RPR)-formula that is true in iff and .
Let be an arbitrary FO-structure. First, we will define FO-formulas , , , and such that:
- –
is true in iff when using the column method to subtract from , the -th bit from the fractional part of is borrowed;
- –
is true in iff when using the column method to subtract from , the -th bit from the integral part of is borrowed;
- –
is true in iff the -th bit of the fractional part of is ;
- –
is true in iff the -th bit of the integral part of is .
Let the binary representations of fractional parts of and , be and , respectively. Let indicate whether a bit is borrowed from when subtracting from using the column method. Clearly, and the value of for can be determined as follows:
[TABLE]
Using the equivalence above, we define as follows:
[TABLE]
In what follows we will denote the binary representation of the fractional part of as , which can be defined as follows (for [math] and treated as truth-values):
[TABLE]
Thus, we can define as:
[TABLE]
In a similar way we can define the formulas and which are about the integral parts of and . In particular, we define as:
[TABLE]
where is (a constant for) the last element of . The formula is defined analogously to .
Next, we will make use of the integer divisibility automaton , which is an NFA taking as an input an inverted binary representation of an integer number and reaching the accepting state iff is divisible by . It is known that for any integer we can construct such an automaton. Recall that is the fractional part of , i.e., iff is true in . Analogously, we denote the integral part of by , i.e., iff is true in .
Next, we claim that for any , , and a state of the divisibility automaton , we can construct an FO-formula which is true in iff and either:
- –
, for , and has a run from to on ; or
- –
and has a run from to on
To construct one needs to consider paths of length bounded by in , whose number is finite, and therefore the formula is constructible in FO (we leave details to the reader).
Let be the number of significant bits in the fractional part of the binary representation of (e.g., for ). Then, we can prove the following result:
Lemma 16**.**
Let , , and let be the divisibility automaton for . Then, for any data instance and , the value of is divisible by iff there exists such that is true in , and has a run from to on , where is the binary representation of the integral part of .
Now, let , , and be as stated in the lemma. For every we will introduce an FO(RPR)-formula expressing that is true in iff either:
- –
and there exists such that and ; or
- –
and there exists such that is true in and .
This formula, denoted by , is as follows:
[TABLE]
Finally, we define by means of the following FO(RPR)-formula, where are all states in :
[TABLE]
Intuitively, the formula uses simultaneous recursion to check whether the accepting state is reached on the input .
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Rajeev Alur and David L. Dill. A theory of timed automata. Theoretical Computer Science , 126(2):183 – 235, 1994. http://www.sciencedirect.com/science/article/pii/0304397594900108 .
- 2[2] Rajeev Alur and Thomas A. Henzinger. Real-time logics: Complexity and expressiveness. Inf. Comput. , 104(1):35–77, 1993.
- 3[3] S. Arora and B. Barak. Computational Complexity: A Modern Approach . Cambridge University Press, New York, USA, 1st edition, 2009.
- 4[4] Alessandro Artale, Roman Kontchakov, Alisa Kovtunova, Vladislav Ryzhikov, Frank Wolter, and Michael Zakharyaschev. First-order rewritability of temporal ontology-mediated queries. In Proc. of the 24th Int. Joint Conf. on Artificial Intelligence, IJCAI 2015 , pages 2706–2712. IJCAI/AAAI, 2015.
- 5[5] Alessandro Artale, Roman Kontchakov, Alisa Kovtunova, Vladislav Ryzhikov, Frank Wolter, and Michael Zakharyaschev. Ontology-mediated query answering over temporal data: A survey (invited talk). In TIME , volume 90 of LIP Ics , pages 1:1–1:37. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017.
- 6[6] F. Baader, S. Borgwardt, and M. Lippmann. Temporalizing ontology-based data access. In Proc. of the 24th Int. Conf. on Automated Deduction, CADE-24 , volume 7898 of LNCS , pages 330–344. Springer, 2013.
- 7[7] Franz Baader, Stefan Borgwardt, Patrick Koopmann, Ana Ozaki, and Veronika Thost. Metric temporal description logics with interval-rigid names. In Frontiers of Combining Systems - 11th International Symposium, Fro Co S 2017, Brasília, Brazil, September 27-29, 2017, Proceedings , pages 60–76, 2017.
- 8[8] Marianne Baudinet, Jan Chomicki, and Pierre Wolper. Temporal deductive databases. In Temporal Databases , pages 294–320. 1993.
