This paper introduces datalogMTL, a logic framework that enables efficient ontology-based querying of temporal log data, supporting complex temporal concepts with proven computational properties and demonstrated real-world scalability.
Contribution
It presents datalogMTL, a novel extension of Horn Datalog with metric temporal logic, providing a decidable and scalable approach for temporal data access.
Findings
01
datalogMTL is ExpSpace-complete with punctual intervals
02
Nonrecursive datalogMTL is PSpace-complete for combined complexity
03
Experiments show efficient querying on large datasets
Abstract
We propose a novel framework for ontology-based access to temporal log data using a datalog extension datalogMTL of a Horn fragment of the metric temporal logic MTL. We show that datalogMTL is ExpSpace-complete even with punctual intervals, in which case full MTL is known to be undecidable. We also prove that nonrecursive datalogMTL is PSpace-complete for combined complexity and in AC0 for data complexity. We demonstrate by two real-world use cases that nonrecursive datalogMTL programs can express complex temporal concepts from typical user queries and thereby facilitate access to temporal log data. Our experiments with Siemens turbine data and MesoWest weather data show that datalogMTL ontology-mediated queries are efficient and scale on large datasets.
Tables9
Table 1. Table 4: Siemens data for one turbine.
# of months
32
64
96
128
159
191
223
255
287
320
# of rows (approx.)
13 M
26 M
39 M
52 M
65 M
77 M
90 M
103 M
116 M
129 M
size (GB) in CSV
0.57
1.2
1.7
2.3
2.9
3.4
4.0
4.5
5.1
5.7
Table 2. Table 6: NY weather stations from 2005 to 2014.
# of years
1
2
3
4
5
6
7
8
9
10
# of stations
229
306
370
441
484
542
595
643
807
874
# of rows (approx.)
4 M
11 M
19 M
27 M
36 M
49 M
63 M
79 M
99 M
124 M
size (GB) in CSV
0.2
0.6
1.1
1.6
2.1
2.9
3.8
4.8
5.9
7.4
Table 3. Table 7: Weather data for 1–19 states in 2012.
states
DE,
+NY
+MD
+NJ,
+MA,
+LA,
+ME,
+NH,
+MS,SC,
+KY,
GA
RI
CT
VT
WV
NC
ND
SD
# of states
2
3
4
6
8
10
12
14
17
19
# of stations
408
659
1120
1476
1875
2305
2669
3019
3508
4037
# of rows (approx.)
17 M
32 M
41 M
52 M
67 M
81 M
93 M
106 M
121 M
141 M
size (GB) in CSV
0.9
1.9
2.5
3.1
4.0
4.8
5.5
6.4
7.2
8.3
Table 4. Table 10: Number of the results returned from the Siemens queries.
32
64
96
128
159
191
223
255
287
320
ActivePowerTrip
324
648
970
1294
1618
1940
2264
2588
2912
3236
NormalStop
648
1296
1940
2588
3236
3880
4528
5176
5824
6472
NormalStart
162
324
485
647
809
970
1132
1294
1456
1618
NormalRestart
0
0
0
0
0
0
0
0
0
0
Table 5. Table 11: Number of the results returned from the NY weather stations from 2005 to 2014.
1
2
3
4
5
6
7
8
9
10
ShoweryPatternCounty
530
1221
1802
2647
3609
4349
5204
5912
6639
7655
HurricaneAffectedState
2
4
5
5
5
8
9
801
1523
1533
HeatAffectedCounty
0
5
7
14
21
33
39
51
57
59
CyclonePatternState
914
1574
1617
1851
1936
2139
2246
2307
2333
2359
Table 6. Table 12: Number of the results returned from the Weather data for 1–19 states in 2012.
1
2
4
6
8
10
12
14
17
19
ShoweryPatternCounty
3769
4481
4928
10349
12709
13681
14470
14933
16381
16883
HurricaneAffectedState
2
784
789
789
790
790
798
811
813
813
HeatAffectedCounty
53
65
81
84
88
98
100
117
142
224
CyclonePatternState
9109
9179
9593
17577
30203
38421
40769
43662
54199
56303
Table 7. Table 13: Siemens data for one turbine.
# of months
32
64
96
128
159
191
223
255
287
320
# of rows
12,935,
25,871,
38,726,
51,662,
64,597,
77,453,
90,389,
103,324,
116,260,
129,195,
538
076
765
303
841
530
068
606
144
682
CSV
size (GB)
0.57
1.2
1.7
2.3
2.9
3.4
4.0
4.5
5.1
5.7
PostgreSQL
raw size (GB)
0.7
1.4
2.2
2.9
3.7
4.4
5.2
5.9
6.7
7.4
total size (GB)
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
10.0
Parquet
size (GB)
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Table 8. Table 14: NY weather stations from 2005 to 2014.
# of years
1
2
3
4
5
6
7
8
9
10
# of stations
229
306
370
441
484
542
595
643
807
874
# of rows
3,969,
10,959,
18,614,
26,622,
35,862,
49,115,
63,469,
79,032,
99,221,
124,001,
455
978
686
218
560
307
733
846
419
260
CSV
size (GB)
0.2
0.6
1.1
1.6
2.1
2.9
3.8
4.8
5.9
7.4
PostgreSQL
raw size (GB)
0.3
0.8
1.4
2.0
2.7
3.7
4.9
6.1
7.7
11.0
total size (GB)
0.4
1.1
2.0
2.9
3.9
5.4
7.1
8.9
11.0
14.0
Parquet
size (GB)
0.03
0.08
0.15
0.2
0.3
0.4
0.5
0.6
0.8
0.9
Table 9. Table 15: Weather data for 1–19 states in 2012.
M,t⊨ν\leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱA iff M,s⊨νA for some s with s−t∈ϱ,
M,t⊨ν\leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱAiffM,s⊨νA for some s with t−s∈ϱ,
M,t⊨ν\leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱAiffM,s⊨νA for some s with t−s∈ϱ,
M,t⊨νAUϱA′iffM,t′⊨νA′ for some t′ with t′−t∈ϱ and M,s⊨νA for all s∈(t,t′),
M,t⊨νASϱA′iffM,t′⊨νA′ for some t′ with t−t′∈ϱ and M,s⊨νA for all s∈(t′,t),
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Database Systems and Queries · Semantic Web and Ontologies · Data Management and Algorithms
Free University of Bozen-Bolzano, Italy
\AND\nameVladislav Ryzhikov \[email protected]
\addrDepartment of Computer Science and Information Systems
Birkbeck, University of London, UK
\AND\nameGuohui Xiao \[email protected]
\addrKRDB Research Centre
Faculty of Computer Science
Free University of Bozen-Bolzano, Italy
\AND\nameMichael Zakharyaschev \[email protected]
\addrDepartment of Computer Science and Information Systems
Birkbeck, University of London, UK
Abstract
We propose a novel framework for ontology-based access to temporal log data using a datalog extension datalogMTL of the Horn fragment of the metric temporal logic MTL. We show that datalogMTL is ExpSpace-complete even with punctual intervals, in which case full MTL is known to be undecidable. We also prove that nonrecursive datalogMTL is PSpace-complete for combined complexity and in AC0 for data complexity. We demonstrate by two real-world use cases that nonrecursive datalogMTL programs can express complex temporal concepts from typical user queries and thereby facilitate access to temporal log data. Our experiments with Siemens turbine data and MesoWest weather data show that datalogMTL ontology-mediated queries are efficient and scale on large datasets.
1 Introduction
In this paper, we present a new ontology-based framework for querying temporal log data. We begin by outlining this framework in the context of data gathering and analysis at Siemens, a leading manufacturer and supplier of systems for power generation, power transmission, medical diagnosis, and industry automation.
1.1 Data gathering at Siemens
For the Siemens equipment, analytics services are usually delivered by remote diagnostic centres that store data from the relevant industrial sites or individual equipment around the globe.
The analytics provided at these centres falls into three categories: descriptive, predictive, and prescriptive. Descriptive analytics describes or quantifies in
detail what has happened after an event. Predictive analytics aims to
anticipate events before they occur and provide a window of
opportunity for countermeasures. Prescriptive analytics aims to automate the process of suggesting underlying reasons for the predicted events and carrying out appropriate countermeasures.
All these types of analytics heavily rely on the ability to recognise interesting events using sensor measurements or other machine data such as the power output of a gas turbine, its maximum rotor speed, average exhaust temperature, etc.
For example, a service engineer at a Siemens remote diagnostic centre could be interested in active power trips of the turbine, that is, events when
(ActivePowerTrip)
the active power was above 1.5MW
for a period of at least 10 seconds, maximum 3 seconds after which
there was a period of at least one minute where the active power was
below 0.15MW.
Under the standard workflow, when facing the task of finding the
active power trips of the turbine, the engineer would call an IT
expert who would then produce a specific script (in a proprietary
signal processing language developed by Siemens) such as
[TABLE]
for the turbine aggregated data stored in a table TB_Sensor, which looks as follows:
[TABLE]
The result of running the script is a log with records such as
“2015-04-04 12:22:17 activepowertrip tb0”
where information about all the events is accumulated.
When facing the same task but for a different turbine, the engineer may have to call the IT expert once again because different models of turbines and sensors may have different log/database formats. Moreover, the storage platform for the sensor data often changes (thus, currently Siemens are pondering over migrating certain data to a cloud-based storage). Maintaining a set of scripts, one for each data source, does not provide an efficient solution since a query such as ‘find all the turbines that had an active power trip in May 2017’ would require an intermediate database with integrated data of active power trips.
Another difficulty is that the definitions of events the engineer is interested in can also change. Some changes are minor, say the pressure threshold or the number of seconds in the active power trip definition, but some could be more substantial, such as ‘find the active power trips that were followed by a high pressure within 3 minutes that lasted for 30 seconds’. This modification would require rewriting the script above into a much longer one rather than using it as a module in the new definition.
The permanent involvement of an IT expert familiar with database technology incurs high costs for Siemens, and data gathering accounts for a major part of the time the service engineers spend at Siemens remote diagnostic centres, most of which due to the indirect access to data.
1.2 Ontology-based data access
Ontology-based data access (OBDA for short) offers a different workflow that excludes the IT middleman from data gathering (?); consult also the recent survey by ? (?). In a nutshell, the OBDA workflow in the Siemens context looks as follows. Domain experts develop and maintain an ontology that contains terms for the events the engineers may be interested in. IT experts develop and maintain mappings that relate these terms to the database schemas. The engineer can now use familiar terms from the ontology and a graphical tool such as OptiqueVQS (?) to construct and run queries such as ActivePowerTrip(tb0)@x. The task of the OBDA system such as Ontop (?, ?) will be, using the mappings, to rewrite the engineer’s ontology-mediated query into an SQL query over the database and then execute it returning the time intervals x where the turbine with the ID tb0 had active power trips.
Unfortunately, the ontology and query languages designed for OBDA and standardised by the W3C—the OWL 2 QL profile of OWL 2 and SPARQL—are not suitable for the Siemens case because they were not meant to deal with essentially temporal data, concepts and properties. There have been several attempts to develop temporal OBDA.
One approach is to use the same OWL 2 QL as an ontology language, assuming that ontology axioms hold at all times, and extend the query language with various temporal operators (?, ?, ?, ?, ?, ?, ?). Unfortunately, OWL 2 QL is not able to define the temporal feature of ‘active power trip’, and so the engineer would have to capture it in a complex temporal query (or call an expert in temporal logic). Another known approach is to allow the temporal operators of the linear-time temporal logic LTL in both queries and ontologies (?, ?, ?). For more details and further references, consult the recent survey by ? (?)111Surveys of early developments in temporal deductive databases are given by ? (?) and ? (?)..
However, standard LTL over a discrete timeline such as (N,≤) or (Z,≤) is not able to adequately represent the temporal data and knowledge in the Siemens use case because measurements are taken and sent asynchronously by multiple sensors at irregular time intervals, which can depend on the turbine model, sensor type, etc. To model measurements and events using discrete time, one could take a sufficiently small time unit (quantum), say 1 second, and encode ‘active power was below 0.15MW for a period of one minute’ by an LTL-formula of the form {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle P}p\land{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle P}^{2}p\land\dots\land{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle P}^{60}p, where {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle P} is the previous-time operator. One problem with this encoding is that it is clearly awkward, not succinct, and only works under the assumption that the active power is measured each and every second. If, for some reason, a measurement is missing as in TB_Sensor, the formula becomes inadequate.
This problem can be solved by using the (more succinct) metric temporal logic MTL with operators like ⊟[1,60] interpreted as ‘at every time instant within the previous minute when a measurement was taken’. The satisfiability problem for the description logic ALC extended with such operators over (N,≤) was investigated by Gutiérrez-Basulto, Jung, and Ozaki (?).
A more fundamental issue with modelling turbine events using discrete time is that it only applies to data complying with the chosen quantum and requires amendments every time the quantum has to be set to a different value because of a new equipment or because asynchronous sensor measurements start to happen more frequently.
Thus, a better way of modelling the temporal data and events under consideration is by means of a suitable fragment of MTL interpreted over dense time such as the rationals (Q,≤) or reals (R,≤). This would allow us to capture, for example, that one event, say a sharp temperature rise, happened just before (maybe a fraction of a quantum), and so possibly caused another event, say an emergency shutdown, which is a typical feature of an asynchronous behaviour of real-time systems where the actual time of event occurrences cannot be predicted at the modelling stage.
1.3 Metric Temporal Logic
The metric temporal logic MTL
was originally designed for modelling and reasoning about real-time systems (?, ?). MTL is equipped with two alternative semantics, pointwise and continuous (aka interval-based). In both semantics, the timestamps are taken from a dense timeline (T,≤) such as (Q,≤) or (R,≤). Under the pointwise semantics, an interpretation is a timed word, that is, a finite or infinite sequence of pairs (Σi,ti), where Σi is a subset of propositional variables that are assumed to hold at ti∈T and ti<tj for i<j. Under the continuous semantics, an interpretation is an assignment of a set of propositional variables to eacht∈T. MTL allows formulas such as ⊞[1.5,3]φ (or \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[1.5,3]φ) that holds at a moment t if and only if φ holds at every (respectively, some) moment in the interval [t+1.5,t+3]. However, under the pointwise semantics, t must be a timestamp from the timed word and φ must only hold at every (respectively, some) ti with 1.5≤ti−t≤3. Thus, ⊞[1,1]⊥ is satisfiable under the pointwise semantics, for example, by a timed word with ti+1−ti>1, but not under the continuous semantics.
In the Siemens case, we assume that the real-time system is being continuously monitored, the result of the next measurement of a sensor is only recorded
when it exceeds the previous one by some fixed margin, and events such as active power trip can happen between measurements. This makes the continuous semantics a natural choice for temporal modelling.
The satisfiability problem for MTL under this semantics turns out to be undecidable (?) and ExpSpace-complete if the punctual operators such as \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[1,1] are disallowed (?); see also the work by ? (?, ?). Note that, under the pointwise semantics, MTL is decidable over finite timed words, though not primitive recursive (?).
1.4 Our contribution
Having analysed two real-world scenarios of querying asynchronous real-time systems (to be discussed in Section 6), we came to a conclusion that a basic ontology language for temporal OBDA should contain datalog rules with MTL operators in their bodies. In this language, for example, the event of active power trip can be defined by the rule
[TABLE]
The variables of the predicates in such rules range over a (non-temporal) object domain. Thus, the intended domain for v in (1) comprises turbines, their parts, sensors, etc. The underlying (dense) timeline is implicit: we understand (1) as saying that ActivePowerTrip(v) holds at any given time instant t if the pattern shown in the picture below has occurred before t:
Unlike model-checking liveness properties (that some events eventually
happen) in transition systems, our task is to query historical data
for events that have already happened and are actually implicitly
recorded in the data. As a consequence, we do not need ontology axioms
with eventuality operators in the head such as
\leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[0,3s]ShutDown(v)←ActivePowerTrip(v) saying that an active power trip must be
followed by a shutdown within 3 seconds. OWL 2 QL allows existential
quantification in the head of rules such as
∃uhasRotor(v,u)←Turbine(v)
stating that every turbine has a rotor. Although axioms of this sort are
present in the Siemens turbine configuration ontology (?), we opted not to include ∃ in the head of rules in our language. On the one hand, we have not found meaningful queries in the use cases for which such axioms would provide more answers. On the
other hand, it is known that existential axioms may considerably increase the
combined complexity of both
atemporal (?, ?) and temporal
ontology-mediated query
answering (?). For these reasons, we
do not allow existential rules in our ontology language and leave
their investigation for future work.
The resulting temporal ontology language can be described as a datalog extension of the Horn fragment of MTL (without diamond operators in the head of rules). We denote this language by datalogMTL and prove in Section 3 that answering ontology-mediated queries of the form (Π,Q(v)@x) is ExpSpace-complete for combined complexity, where Π is a datalogMTL program, Q(v) a goal with individual variables v, and x a variable over time intervals during which Q(v) holds. On the other hand, we show that hornMTL becomes undecidable if the diamond operators are allowed in the head of rules.
We also prove that answering propositional datalogMTL queries is P-hard for data complexity. To compare, recall that answering ontology-mediated queries with propositional (not necessarily Horn) LTL ontologies is NC1-complete for data complexity (?).
From the practical point of view, most interesting are nonrecursive datalogMTL queries. We show in Section 4 that answering such queries is in AC0 for data complexity (assuming that data timestamps and the ranges of the temporal operators in datalogMTL programs are represented as finite binary fractions) and PSpace-complete for combined complexity (even NP-complete if the arity of predicates is bounded). In this case, we develop a query answering algorithm that can be implemented in standard SQL with window functions. We also present in Section 5 a framework for practical OBDA with nonrecursive datalogMTL queries and temporal log data stored in databases as shown above. Finally, in Section 6, we evaluate our framework on two use cases. We develop a datalogMTL ontology for temporal concepts used in typical queries at Siemens (e.g., NormalStop that takes place if events ActivePowerOff, MainFlameOff, CoastDown6600to1500, and CoastDown1500to200 happen in a certain temporal pattern). We also create a weather ontology defining standard meteorological concepts such as Hurricane (HurricaneForceWind, wind with the speed above 118 km/h, lasting at least 1 hour).
Using Siemens sensor databases and MesoWest historical records of the weather stations across the US, we experimentally demonstrate that our algorithm is efficient in practice and scales on large datasets of up to 8.3GB. We used two systems, PostgreSQL and Apache Spark, to evaluate our SQL programs. To our surprise, Apache Spark achieved tenfold better performance on the weather data than PostgreSQL. This effect can be attributed to the capacity of Spark to parallelise query execution as well as to the natural ‘modularity’ of weather data by location.
An extended abstract of this paper was presented at AAAI-17 (?).
2 DatalogMTL
In the standard metric temporal logic MTL (?), the temporal domain is the real numbers R, while the intervals ϱ in the constrained temporal operators such as \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ (sometime in the future within the interval ϱ from now) have natural numbers or ∞ as their endpoints. In the context of the applications of MTL we deal with in this paper, it is more natural to assume that the endpoints of ϱ are non-negative dyadic rational numbers—finite binary fractions222In other words, a dyadic rational is a number of the form n/2m, where n∈Z and m∈N. such as 101.011—or ∞. We denote the set of dyadic rationals by Q2 and remind the reader that Q2 is dense in R and, by Cantor’s theorem, (Q2,<) is isomorphic to (Q,<).
By an interval, ι, we mean any nonempty subset of Q2 of the form [t1,t2], [t1,t2), (t1,t2] or (t1,t2), where t1,t2∈Q2∪{−∞,∞} and t1≤t2. We identify (t,∞] with (t,∞), [−∞,t] with (−∞,t], etc.
A range, ϱ, is an interval with non-negative endpoints. The temporal operators of MTL take the form ⊞ϱ, \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ and Uϱ, which refer to the future, and ⊟ϱ, \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ and Sϱ, which refer to the past.
The end-points of intervals and ranges are assumed to be represented in binary.
An individual term, τ, is an individual variable, v, or a constant, c. As usual, we assume that there is a countably-infinite list of predicate symbols, P, with assigned arities. A datalogMTL program, Π, is a finite set of rules of the form
[TABLE]
where k≥1, each Ai(1≤i≤k) is either an inequality (τ=τ′) or defined by the grammar
[TABLE]
and A+ is given by the same grammar but without any ‘non-deterministic’ operators \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ, \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ, Uϱ, Sϱ. The atoms A1,…,Ak constitute the body of the rule, while A+ or ⊥ its head. As usual, we assume that every variable in the head of a rule also occurs in its body.
A data instance, D, is a finite set of facts of the form P(c)@ι, where P(c) is a ground atom (with a tuple c of individual constants) and ι an interval. The fact P(c)@ι states that P(c) holds throughout the interval ι. We denote by num(D) the set of numbers (excluding ±∞) that occur in D, and by num(Π,D) the set of number occurring in Π or D.
An interpretation, M, is based on a domainΔ=∅ for the individual variables and constants. For any m-ary predicate P, m-tuple a from Δ, and moment of time t∈R, the interpretation M specifies whether P is true on a at t, in which case we write M,t⊨P(a). Let ν be an assignment of elements of Δ to the individual terms. To simplify notation, we adopt the standard name assumption according to which ν(c)=c, for every individual constant c.
We then set inductively:
[TABLE]
[TABLE]
The picture below illustrates the semantics of the ‘future’ operators for ϱ=[d,e]:
We say that Msatisfies a datalogMTL program Π under an assignment ν if, for allt∈R and all the rules A←A1∧⋯∧Ak in Π, we have
[TABLE]
We call M a model of
Π and D and write M⊨(Π,D) if M satisfies Π under every assignment, and
M,t⊨P(c) for any P(c)@ι in
D and any t∈ι. Π and D are consistent if
they have a model.
Note that ranges ϱ in the temporal operators can be punctual [r,r], in which case ⊞[r,r]A is equivalent to \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[r,r]A, and ⊟[r,r]A to \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[r,r]A. We also observe that ⊤SϱA is equivalent to \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱA (that is, M,t⊨ν⊤SϱA iff M,t⊨ν\leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱA for all M, t and ν), and ⊤UϱA is equivalent to \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱA.
A datalogMTL query takes the form (Π,q(v,x)), where Π is a datalogMTL program and q(v,x)=Q(τ)@x, for some predicate Q, v is a tuple of all individual variables occurring in the terms τ, and x an interval variable. A certain answer to (Π,q(v,x)) over a data instance D is a pair (c,ι) such that c is a tuple of constants from D of the same length as v, ι an interval and, for any t∈ι, any model M of Π and D, and any assignment ν mapping v to c, we have M,t⊨νQ(τ). In this case, we write M,t⊨q(c).
If the tuple v is empty (that is, Q(τ) does not have any individual variables), then we say that ι is a certain answer to (Π,q(x)) over D.
Example 1**.**
Suppose that Π has one rule (1) and D
consists of the facts
[TABLE]
Then any subinterval of the interval
[13:01:17,13:01:18) is a certain answer to the datalogMTL query
(Π,ActivePowerTrip(tb0)@x).
Example 2**.**
We illustrate the importance of the operators S (since) and U (until) using an example inspired by the ballet moves ontology (?). Suppose we want to say that SupportBending is a move spanning from the beginning to the end of RightAndLeftSupportLowPlace provided that it is preceded by RightAndLeftSupportMiddlePlace, which ends within 3s from the beginning of the RightAndLeftSupportLowPlace, as shown below:
We can define the SupportBending move using the following rule:
[TABLE]
(note that a definition of SupportBending in datalogMTL would be problematic if only the □ and ◊ operators were available).
By answering datalogMTL queries we understand the problem of
checking whether a given pair (c,ι) is a certain answer
to a given datalogMTL query (Π,q(v,x)) over a given data
instance D. The consistency (or satisfiability) problem is to check whether a given datalogMTL program Π is consistent with a given data
instance D. As usual in database theory (?) and ontology-mediated query answering, we distinguish between the combined complexity and the data complexity of these problems: the former regards all the ingredients—Π, q(c,ι) and D—as input, while the latter one assumes that Π and q are fixed and only D and (c,ι) are the input.
Proposition 3**.**
Answering datalogMTL queries and consistency checking are polynomially reducible to the complement of each other.
Proof.
Suppose first that we want to check whether (c,ι) is a certain answer to (Π,q(v,x)) over D, where q(v,x)=Q(τ)@x and ι=[−t1,t2), t1,t2∈Q2≥0; other types of ι are considered analogously. Consider the following program Π′ and data instance D′:
[TABLE]
where P is a fresh predicate. It is readily seen that (c,ι) is a certain answer to (Π,q(v,x)) over D iff Π′ is not consistent with D′. Conversely, Π and D are consistent iff [0,0] is not a certain answer to (Π,P@x) over D, where P is a fresh [math]-ary predicate, that is, a propositional variable.
∎
We conclude this section by reminding the reader that, over the integer numbers (Z,<), MTL is as expressive as the linear temporal logic LTL with the operators {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F} (at the next moment), U (until), □F (always in the future), ◊F (some time in the future) and their past counterparts {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle P}, S, □P and ◊P. For example, the LTL-formula
{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}A is equivalent to \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[1,1]A and AUB under the irreflexive semantics to AU(0,∞)B; conversely, \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture[2,3]A is clearly equivalent to the LTL-formula {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}A\lor{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}A. However, MTL operators are more succinct, which explains why MTL-satisfiability over (Z,<) is ExpSpace-complete (?, ?) whereas LTL-satisfiability is PSpace-complete (?).
In the next section, we show that consistency checking for datalogMTL programs is ExpSpace-complete for combined complexity. It follows from Proposition 3 that answering datalogMTL queries is ExpSpace-complete as well. On the other hand, we also prove that answering propositional datalogMTL queries is P-hard for data complexity, and that the extension of datalogMTL with in the head of rules leads to undecidability.
3 Complexity of answering datalogMTL queries
Observe first that every datalogMTL program Π can be transformed (using polynomially-many fresh predicates) to a datalogMTL program in normal form that only contains rules such as
[TABLE]
and gives the same certain answers as Π over any data instance. (In particular, datalogMTL programs in normal form do not contain occurrences of the diamond operators.) For example, we can replace the rule ⊞ϱ′P(τ)←P1(τ1)∧⊟ϱP2(τ2) in Π with three rules
[TABLE]
where P′ is a fresh predicate of the same arity as P and P2′ a fresh predicate of the same arity as P2.
Moreover, we can only consider those programs and data instances where intervals take one of the following two forms:
–
(t1,t2) with t1,t2∈Q2∪{−∞,∞},
–
[t,t] with t∈Q2; such intervals are called punctual.
For example, a data instance D=D′∪{P(c)@(t1,t2]} is equivalent to the data instance
[TABLE]
in the sense that is gives the same certain answers as D, the rule P(v)←⊟(r1,r2]P′(v) is equivalent to P(v)←⊟(r1,r2)P′(v)∧⊟[r2,r2]P′(v), whereas the rule P(v)←P1(v)U(r1,r2]P2(v) is equivalent to the pair of rules
[TABLE]
We use the following notations. We assume that ⟨ is one of ( and [, while ⟩ is one of ) and ]. Given an interval ι=⟨ιb,ιe⟩ and a range ϱ, we set
[TABLE]
In other words,
ι+oϱ={t+k∣t∈ι and k∈ϱ} and
ι−oϱ={t−k∣t∈ι and k∈ϱ}. We also set
[TABLE]
We assume that ι−cϱ and ι+cϱ are only defined if r2−r1≤ιe−ιb, in which case we write ϱ⊑ι. Thus, ι−cϱ is defined if there is t′ such that t′+k∈ι, for all k∈ϱ. Symmetrically, ι+cϱ is defined if there is t′ such that t′−k∈ι. The picture below illustrates the intuition behind ι+oϱ and ι+cϱ, for non-punctual ϱ, and the difference between them:
⋂i∈Iιi=∅ to say that the intersection of the intervals ιi, for i∈I, is non-empty;
–
⋂i∈Iιi for the intersection of the intervals ιi provided that ⋂i∈Iιi=∅; otherwise ⋂i∈Iιi is undefined;
–
⋃i∈Iιi for the union of the intervals ιi provided that ⋃i∈Iιi is a single interval; otherwise ⋃i∈Iιi is undefined;
–
ιc for the closure of an interval ι, that is ιc=[ιb,ιe] for any ι=⟨ιb,ιe⟩.
Suppose now that we are given a datalogMTL program Π (in normal form)
and a data instance D. We define a (possibly infinite) set CΠ,D of atoms of the form P(c)@ι or \bot@ι that contains all answers to datalogMTL queries with Π over D. The construction is essentially the standard chase procedure from database theory (?) adapted to time intervals and the temporal operators by mimicking their semantics. The only new chase rule is coalescing (coal) that merges—possibly infinitely-many—smaller intervals into the lager one they cover. Because of this rule, our chase construction requires transfinite recursion; see also the work by ? (?) and ? (?).
Let C be some set of atoms of the form
P(c)@ι or \bot@ι from Π and D. Denote by
cl(C) the result of applying exhaustively and non-recursively the following rules
to C:
(coal)
if P(c)@ιi∈C, for all i∈I with a possibly infinite set I, and ⋃i∈Iιi is defined, then we add P(c)@⋃i∈Iιi to C;
(horn)
if P(c)←⋀i∈IPi(ci) is an instance of a rule in Π with all Pi(ci)@ιi in C and ⋂i∈Iιi=∅, then we add P(c)@⋂i∈Iιi to C; if ⊥←⋀i∈IPi(ci) is an instance of a rule in Π, then we add \bot@⋂i∈Iιi to C;
(Sϱ)
if P(c)←P1(c1)SϱP2(c2) is an instance of a rule in Π with Pi(ci)@ιi∈C for i∈{1,2}, ι1c∩ι2=∅, and ((ι1c∩ι2)+oϱ)∩ι1c=∅, then we add P(c)@((ι1c∩ι2)+oϱ)∩ι1c to C; see the picture below, where ϱ=(r1,r2);
if P(c)←⊞ϱP1(c1) is an instance of a rule in Π with P1(c1)@ι∈C and ϱ⊑ι, then we add P(c)@(ι−cϱ) to C;
(Uϱ)
if P(c)←P1(c1)UϱP2(c2) is an instance of a rule in Π with Pi(ci)@ιi∈C, ι1c∩ι2=∅ and ((ι1c∩ι2)−oϱ)∩ι1c=∅, then we add P(c)@((ι1c∩ι2)−oϱ)∩ι1c to C;
(⊟ϱ)
if P(c)←⊟ϱP1(c1) is an instance of a rule in Π with P1(c1)@ι∈C and ϱ⊑ι, then we add P(c)@(ι+cϱ) to C.
We set cl0(D)=D∪{⊤(−∞,∞)} and, for any successor ordinal ξ+1 and limit ordinal ζ,
[TABLE]
where ω1 is the first uncountable ordinal (as clω1(D) is countable, there is an ordinal α<ω1 such that clα(D)=clβ(D), for all β≥α). We regard CΠ,D as both a set of atoms of the form P(c)@ι or \bot@ι and an interpretation where, for any t∈R, any P (different from ⊥), and any tuple c of individual constants, we have CΠ,D,t⊨P(c) iff P(c)@ι∈CΠ,D and t∈ι. The domain of CΠ,D is the set ind(D)∪ind(Π) that comprises the individual constants occurring in D and Π.
We illustrate the definition above by a simple example:
Example 4**.**
Let Π have two rules P←⊟[1,1]P and Q←⊞(0,∞)P, and let D={P(0,1]}. The first ω steps of the construction of CΠ,D will produce, using the rules (⊟ϱ) and (coal), the atoms P(n,n+1] and P(0,n+1], for n<ω. In the step ω+1, (coal) will give P(0,∞) and then (⊞ϱ) will return Q@[0,∞).
Lemma 5**.**
Let Π be a datalogMTL program and D a data instance. Then, for any predicate symbol P from Π and D, any tuple c of constants from D and Π, and any interval ι,
(i)
P(c)@ι∈CΠ,D* implies M,t⊨P(c), for all t∈ι and all models M of Π and D;*
(ii)
if \bot@ι∈/CΠ,D for any ι, then CΠ,D⊨(Π,D); otherwise, Π and D are inconsistent.
Proof.
(i) Suppose that M is a model of Π and D, and that P(c)@ι∈CΠ,D. Let ξ be the smallest ordinal such that P(c)@ι∈clξ(D). We show that M,t⊨P(c) for all t∈ι by induction of ξ. If ξ=0, then P(c)@ι∈D, and since M satisfies every assertion in D, we are done. If ξ=ξ′+1 then P(c)@ι was obtained from clξ′(D) by applying one of the construction rules for CΠ,D. Suppose P(c)@ι is P(c)@⋃i∈Iιi obtained by (coal). By the induction hypothesis, M,t⊨P(c) for all t∈ιi and i∈I. Clearly, M,t⊨P(c) for all t∈⋃i∈Iιi, and so for all t∈ι. The case of (horn) is similar (with intersection in place of union).
Suppose P(c)@ι is obtained by (Sϱ) from Pi(ci)@ιi, i∈{1,2}. By the induction hypothesis, M,t⊨Pi(ci) for every t∈ιi. Take an arbitrary t∈((ι1c∩ι2)+oϱ)∩ι1c. Then there exists t′∈ι1c∩ι2 such that t−t′∈ϱ and M,t⊨P2(c2). Moreover, we have M,s⊨P1(c1) for all s∈(t′,t). Therefore, M,t⊨P1(c1)SP2(c2). If P(c)@ι is obtained by (⊞ϱ) from P1(c1)@ι, the proof is analogous by considering t∈ι−cϱ. The remaining rules are treated similarly.
(ii) Suppose \bot@ι∈/CΠ,D for any ι. By definition, D⊆CΠ,D, and so CΠ,D⊨P(c)@ι for every P(c)@ι∈D. To show that all the rules in Π are satisfied by CΠ,D, we take an assignment ν, a rule P(τ)←⋀i∈IPi(τi) from Π, and suppose that CΠ,D,t⊨νPi(τi), for all i∈I. By the definition of CΠ,D, it follows that CΠ,D,t⊨Pi(ν(τi)) and Pi(ν(τi))∈CΠ,D, for some ιi∋t. Moreover, there are ordinals ξi, i∈I, such that Pi(ν(τi))@ιi∈clξi(D). By the rule (horn), we then have P(ν(τ))@⋂i∈Iιi∈clmax{ξi∣i∈I}+1(D), from which P(ν(τ))@⋂i∈Iιi∈CΠ,D, and so CΠ,D,t⊨P(ν(τ)). Now, consider a rule ⊥←⋀i∈IPi(τi) and suppose that CΠ,D,t⊨νPi(τi), for all i∈I. By the argument above, we then should have \bot@⋂i∈Iιi∈CΠ,D, which is a contradiction. For a rule P(τ)←P1(τ1)SϱP2(τ2), take an arbitrary t and suppose that CΠ,D,t2⊨νP2(τ2) for some t2 with t−t2∈ϱ and CΠ,D,t1⊨νP1(τ1) for all t2∈(t2,t). By the construction of CΠ,D, it follows that P2(ν(τ2))@ι2∈CΠ,D for some ι2∋t2. Moreover, there are finitely many intervals ιi′, i∈I, such that (t2,t)⊆⋃i∈Iιi′ and P1(ν(τ1))@ιi′∈CΠ,D. By the rule (coal), P1(ν(τ1))@ι1∈CΠ,D for ι1=⋃i∈Iιi′. It follows then that t2,t∈ι1c, and so ι1c∩ι2=∅ and t∈((ι1c∩ι2)+oϱ)∩ι1c. Thus, by the rule (Sϱ), we have P(ν(τ))@((ι1c∩ι2)+oϱ)∩ι1c∈CΠ,D. Therefore, CΠ,D,t⊨νP(τ). The remaining rules are considered in the same manner.
That \bot@ι∈CΠ,D, for some ι, implies inconsistency of D and Π follows from (i). ∎
If \bot@ι∈/CΠ,D, we call CΠ,D the canonical (or minimal) model of Π and D. We now establish an important property of CΠ,D that will allow us to reduce consistency checking for datalogMTL programs and data to the satisfiability problem for formulas in the linear temporal logic LTL over (Z,<).
Recall that the greatest common divisor of a finite set N⊆Q (at least one of which is not 0) is the largest number gcd(N)>0 such that every n∈N is divisible by gcd(N) (in the sense that n/gcd(N)∈Z). It is known that gcd(N) always exists and gcd(N)≤∏n∈N∣n∣. It is easy to see that, for any a finite set N⊆Q2 (at least one of which is not 0), we have gcd(N)=2m, where m is the maximal natural number such that n/2m∈N is an irreducible fraction. Thus, gcd(N) can be computed and stored using space polynomial in ∣N∣ (the size of the binary encoding of N). To make further definitions simpler, it will be convenient to assume that gcd(N)=1 if N={0}.
Given a datalogMTL program Π and a data instance D, we take d=gcd(num(Π,D)). Denote by secΠ,D the set of all the intervals of the form [kd,kd] and ((k−1)d,kd), for k∈Z. Clearly, secΠ,D is a partition of Q2. We represent secΠ,D as
[TABLE]
where σ0=[0,0], σ1=(0,d), σ2=[d,d], σ3=(d,2d), σ−1=(−d,0), etc. Thus, σi is punctual if i is even and non-punctual if i is odd. We refer to the σi as sections of secΠ,D.
Lemma 6**.**
For every atom P(c) and every σ∈secΠ,D, we either have CΠ,D,t⊨P(c) for all t∈σ, or CΠ,D,t⊨P(c) for all t∈σ.
Proof.
It suffices to show that every interval ι such that P(c)@ι∈CΠ,D takes one of the following forms: (−∞,∞), ⟨dk,∞), (−∞,dk⟩, ⟨dk,dk′⟩, where k,k′∈Z. This can readily be done by induction on the construction of CΠ,D. Indeed, when applied to a set of atoms of this form, the operator cl also results in a set of such atoms.
∎
Our aim now is to encode the structure of CΠ,D given by Lemma 6 by means of an LTL-formula φΠ,D that is satisfiable over (Z,<) iff Π and D are consistent.
The LTL-formula φΠ,D contains propositional variables of the form Pc, where P is a predicate symbol from Π and D of arity m and c an m-tuple of individual constants from D and Π, as well as two additional propositional variables odd and even. We define φΠ,D as a conjunction of the following clauses, where ν is any assignment of the individual constants from D and Π to the terms in Π, and □ψ is a shorthand for □Pφ∧φ∧□Fφ:
□(Pν(τ)←i∈I⋀Piν(τi)), for every rule P(τ)←⋀i∈IPi(τi) in Π;
–
□(⊥←i∈I⋀Piν(τi)), for every rule ⊥←⋀i∈IPi(τi) in Π;
–
for every rule P(τ)←P1(τ1)SϱP2(τ2) in Π with ϱ=[r,r], we require two clauses:
[TABLE]
where {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}^{n}\varphi=\underbrace{{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle F}\dots\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}_{\!\scriptscriptstyle F}}_{n}\varphi if n>0, {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}^{0}\varphi=\varphi, and {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}^{n}\varphi=\underbrace{{\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}_{\!\scriptscriptstyle P}\dots\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}_{\!\scriptscriptstyle P}}_{|n|}\varphi if n<0;
–
for every rule P(τ)←P1(τ1)SϱP2(τ2) in Π with ϱ=(r1,r2), we require four clauses:
[TABLE]
–
for every rule P(τ)←P1(τ1)SϱP2(τ2) in Π with ϱ=(r1,∞),
[TABLE]
(recall that PSQ holds at i iff there exists k<i, such that Q holds at k and P holds at all j with k<j<i);
–
similar clauses for the rules of the form P(τ)←P1(τ1)UϱP2(τ2) (here we need the ‘until’ operator U), P(τ)←⊟ϱP1(τ1) and P(τ)←⊞ϱP1(τ1) in Π;
–
for every fact P(c)@ι in D, we need the clauses:
[TABLE]
Lemma 7**.**
(Π,D)* is consistent iff φΠ,D is satisfiable.*
Proof.
(⇒) If CΠ,D is a model of (Π,D), we define an LTL-interpretation M by taking
–
M,i⊨Pc iff CΠ,D,t⊨P(c), for all t∈σi and i∈Z, all tuples of individual constants c, and predicates P;
–
M,i⊨even, for even i∈Z;
–
M,i⊨odd, for odd i∈Z.
It is routine to check that M,0⊨φΠ,D,
taking into account that CΠ,D,t⊨P1(c1)SϱP2(c2) for some (= all) t∈σi iff the following conditions hold:
Case ϱ=[r,r]:
CΠ,D,t′⊨P2(c2), for some t′∈σi−2r/d, and CΠ,D,s⊨P1(c1) for all s∈σj such that
[TABLE]
Case ϱ=(r1,r2):
there exists σk with CΠ,D,t′⊨P2(c2), for some t′∈σk, and CΠ,D,s⊨P1(c1) for all s∈σj such that
[TABLE]
and similarly for the other temporal operators in φΠ,D.
(⇐) Suppose now φΠ,D is satisfiable. Take the canonical model M of φΠ,D with M,0⊨φΠ,D; see the work by ? (?) for details. Using the observations above, it is not hard to check that M,i⊨Pc iff CΠ,D,t⊨P(c), for all t∈σi and i∈Z, all tuples of individual constants c and predicates P. Details are left to the reader.
∎
We are now in a position to prove our first complexity result:
Theorem 8**.**
Consistency checking for datalogMTL programs is ExpSpace-complete. The lower bound holds even for propositional datalogMTL.
Proof.
We first show the upper bound. By the two lemmas above, a datalogMTL program Π is consistent with a data instance D iff the LTL formula φΠ,D is satisfiable. Thus, a consistency checking ExpSpace algorithm can first construct φΠ,D, which requires exponential time in the size of Π and D.
Indeed, the greatest common divisor of the set num(Π,D) can be computed in polynomial time. The LTL formula φΠ,D contains exponentially many clauses (as there are exponentially many assignments ν) of at most exponential size (as they contain 2t/d conjuncts or disjuncts, where t is a number from Π or D).
After that we can run a standard PSpace satisfiability checking algorithm for LTL; see, e.g., the work by ? (?).
We establish the matching lower bound by reduction of the non-halting problem for deterministic Turing machines with an exponential tape.
Let M a deterministic Turing machine that requires 2f(m) cells of the tape given an input of length m, for some polynomial f. Let n=f(m). Without loss of generality, we can assume that M never runs outside the first 2n cells.
Suppose M=(Q,Γ,#,Σ,δ,q0,qh), where Q is a finite set of states, Γ a tape alphabet, #∈Γ the blank symbol, Σ⊆Γ a set of input symbols, δ:(Q∖{qh})×Γ→Q×Γ×{L,R} a transition function, and q0,qh∈Q are the
initial and halting states, respectively. Let a=a1…am be an input for M. We construct a propositional datalogMTL program Π and a data instance D such that they are not consistent iff M accepts a.
In our encoding, we employ the following propositional variables, where a∈Γ, q∈Q:
–
Hq,a indicating that a cell is read by the head, the current state of the machine is q, and the cell contains a;
–
Na indicating that a cell is not read by the head and contains a,
–
first and last marking the first and last cells of a configuration, respectively.
The program Π consists of the following rules, for a,a′,a′′∈Γ, q,q′∈Q:
[TABLE]
where ⊞r is an abbreviation for ⊞[r,r] and similarly for ⊟r.
Let D contain the following facts:
[TABLE]
The program represents the computation of M on a as a sequence of configurations. The initial one is spread over the time instants 1,…,2n, from which the first m instants represent a and the remaining ones are #. The second configuration uses the next 2n instants (i.e., 2n+1,…,2n+2n), etc.
It is routine to check that M halts on a iff Π and D are inconsistent.
∎
Note that datalogMTL allows punctual intervals of the form [r,r] as ranges of temporal operators, and that full propositional MTL with such intervals is undecidable (?).
Now we turn to the data complexity of datalogMTL and show the following result:
Theorem 9**.**
Consistency checking and answering propositional datalogMTL queries is P-hard for data complexity (under LogSpace reductions).
Proof.
We establish this lower bound by reduction of the monotone circuit value problem, which is known to be P-complete (?). Let C be a monotone circuit with input gates having fan-in 1 and all other gates fan-in 2. We assume that the gates are enumerated by consecutive positive integers, so that if there is an edge from n to m then n<m. Let N=2k, for some k∈N, be the minimal number that is greater than or equal to the maximal gate number. We encode the computation of C on an input α by a data instance DC with the following punctual facts, where [n] stands for [n,n]:
–
V[2n+n/N], if n is an input gate and α(n)=V∈{T,F};
–
D[2n+n/N], if n is an OR gate;
–
C[2n+n/N], if n is an AND gate;
–
I0[2n+m/N],I1[2n+k/N], if n is a gate with input gates m and k.
Let ΠC be a datalogMTL program with the rules
[TABLE]
Suppose n is the output gate. Then it is straightforward to check that the value of C on α is T iff (Π,D)⊨T[2n+n/N]. This immediately implies the required hardness for the query answering problem. An example of a circuit C with an assignment α, and an initial part of the canonical model of (ΠC,DC) are shown below, with the black symbols above the timestamps indicating what is given in DC and the grey ones what is implied by ΠC:
To show P-hardness of the consistency problem, it suffices to add the fact P[2n+n/N] to DC, for a fresh P, and the axiom ⊥←P∧T to ΠC.
∎
The exact data complexity of answering propositional datalogMTL queries remains open. It is worth noting that answering ontology-mediated queries with propositional LTL ontologies is NC1-complete for data complexity (?), while answering propositional datalog queries with the Halpern-Shoham operators is P-complete for data complexity (?).
The diamond operators \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ and \leavevmodeto13.55pt\vboxto14.42pt\pgfpicture\makeatletter\lower-7.208ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.0-3.44443pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.03.44444pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt-3.87498pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt3.87495pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@moveto0.0pt3.87495pt\pgfsys@lineto3.44444pt0.0pt\pgfsys@lineto0.0pt-3.87498pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@lineto0.0pt3.87495pt\pgfsys@stroke\pgfsys@invoke\pgfsys@moveto3.44444pt0.0pt\pgfsys@lineto-3.44443pt0.0pt\pgfsys@stroke\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpictureϱ are disallowed in the head of datalogMTL rules. Denote by \textsldatalogMTL◊ the extension of datalogMTL that allows both box and diamond operators in the head of rules. We show now that this language has much more expressive power and can encode 2-counter Minsky machines, which gives the following theorem; cf. the work by ? (?):
Theorem 10**.**
Consistency checking for propositional \textsldatalogMTL◊ programs is undecidable.
Proof.
We use some ideas of ? (?), where a non-Horn fragment of MTL was shown to be undecidable. The proof is by reduction of the undecidable non-halting problem for Minksy machines: given a 2-counter Minsky machine, decide whether it does not halt starting from [math] in both counters.
Suppose we are given a Minsky machine with counters C1 and C2 that has n−1 instructions of the form
[TABLE]
where i, j, j1 and j2 are instruction indexes, k=1,2, and the n-th instruction is
[TABLE]
We encode successive configurations of the machine using the sequence [0,4),[4,8),[8,12),… of time intervals. The current instruction index is represented by a propositional variable Pi, for 1≤i≤n, that holds at the first point, say 4m, of the interval [4m,4m+4). The current value, say k1, of the counter C1 is encoded by exactly k1 moments of time in the interval (4m+1,4m+2) where the propositional variable C holds true. Similarly, the value k2 of C2 is encoded by exactly k2 moments in the interval (4m+3,4m+4) where the propositional variable C holds true.
The initial configuration is encoded by the following data instance D, where the variable Z indicates that both counters are 0:
[TABLE]
For every i(1≤i≤n) we require the rules
[TABLE]
saying, in particular, that C cannot hold true outside the intended intervals (here N is an auxiliary variable).
To simplify notation, we use the following abbreviations: À=⊞[1,1], Á=⊞[3,3], and {\raisebox{1.07639pt}{\text{\scriptsize\bigcirc}}}=\boxplus_{[4,4]}.
The machine instructions are encoded as follows (the instructions for C2 are obtained by replacing À with Á):
[TABLE]
Here the variable CP means copying of the counter value, DC means decrementing it by 1, and IC incrementing it by 1. To achieve this, we require the following rules:
[TABLE]
We explain the intuition behind the most complex rules (8)–(10) that are used to model the increment of the counters. The rules (8) mark a new time-point with the variable N in a block located after the last C-time-point in this block (or, according the first axiom, N is placed anywhere in the block if the current value of a counter is [math]). The rules (9) insert C in the next block, where in the current block we have either C or N. The rules (10) transfer Z from the current block to the next one excluding the time-point where N holds.
Finally, we add the rule
[TABLE]
It is not hard to check that the program and data instance above are consistent iff the given 2-counter Minsky machine does not halt.
∎
The diamond operators in the head of rules can encode disjunction and thereby ruin ‘Horness’. Thus, the temporalised description logic EL with such rules is undecidable (?); cf. also the work by ? (?). The addition of diamonds in the heads to the Horn fragment of the propositional Halpern-Shoham logic HS can make a P-complete logic undecidable (?). A distinctive feature of these formalisms is their two-dimensionality (?), while propositional datalogMTL is one-dimensional. Diamonds in the head of rules also ruin FO-rewritability of answering ontology-mediated queries with temporalised DL-Lite ontologies by increasing their data complexity to coNP (?). The same construction actually shows that nonrecursive datalogMTL with binary predicates and diamonds in the heads is coNP-hard.
4 Nonrecursive datalogMTL
As none of the datalogMTL programs required in our use cases is recursive, we now consider the class datalognrMTL of nonrecursive datalogMTL programs. We first show that consistency checking (and so query answering) for datalognrMTL programs is PSpace-complete for combined complexity. Then we regard a given datalognrMTL program as fixed and reduce these problems to evaluating a (data-independent) FO(<)-formula over any given data, thereby establishing that datalognrMTL is in AC0 for data complexity.
More precisely, for a program Π, let ⋖ be the dependence relation on the predicate symbols in Π: we have P⋖Q iff Π contains a clause with P in the head and Q in the body. Π is called nonrecursive if P⋖+P does not hold for any predicate symbol P in Π, where ⋖+ is the transitive closure of ⋖. We denote by depthΠ(P) the maximal number l such that P0⋖P1⋖⋯⋖Pl=P. (Note that depthΠ(P)=0 iff either P does not occur in Π or P occurs only in the body of some rules.) The maximal depthΠ(P) over all predicates P is denoted by depth(Π).
It should be clear that, for any nonrecursive Π and any data instance D, there exists some n∈N such that cln+1(D)=cln(D)=CΠ,D. Therefore, CΠ,D is finite.
Denote by minD and maxD the minimal and, respectively, maximal finite numbers that occur in the intervals from D.
Let K be the largest number occurring in Π. We then set
[TABLE]
Let d=gcd(num(Π,D)). The next lemma will be required for our PSpace algorithm checking consistency of datalognrMTL programs.
Lemma 11**.**
Let Π be a datalognrMTL program. Then every interval ι such that P(c)@ι∈CΠ,D or ⊥(c)@ι∈CΠ,D takes one of the following forms: (−∞,∞), ⟨dk,∞), (−∞,dk⟩, ⟨dk,dk′⟩, where k,k′∈Z and Ml≤dk≤dk′≤Mr.
Proof.
That every interval in CΠ,D is of the form (−∞,∞), ⟨dk,∞), (−∞,dk⟩, ⟨dk,dk′⟩, where k,k′∈Z, was observed in the proof of Lemma 6. Thus, we only need to establish the bounds on dk and dk′. For each P, let hi(P) and lo(P) be the maximal and, respectively, minimal number dk∈Q such that P(c)@ι∈CΠ,D and dk is an end-point of ι. Note that hi(P) and lo(P) can be undefined. We are going to show that hi(P) is either undefined or hi(P)≤maxD+depthΠ(P)K. (That lo(P) is either undefined or lo(P)≥minD−depthΠ(P)K is left to the reader.) Clearly, this fact implies the required bounds on dk and dk′.
The proof is by induction on the construction of CΠ,D. Let hin(P) be the maximal dk∈Q2 such that P(c)@ι∈cln(D) and dk is an end-point of ι. We show by induction on n that either hin(P) is undefined or hin(P)≤maxD+KdepthΠ(P).
For the basis of induction, if hi0(P) is defined and P(c)@ι∈cl0(D) is an atom mentioning hi0(P), then P(c)@ι∈D and hi0(P)≤maxD. Assume next that n=n′+1. Suppose hin(P) is defined and let P(c)@ι∈cln(D) be an atom mentioning hin(P). If P(c)@ι∈cln′(D), we are done by the induction hypothesis. Otherwise, we consider how P(c)@ι was obtained. Suppose it was obtained by (coal) with ι=⋃i∈Iιi. By the induction hypothesis, hin′(P)≤maxD+KdepthΠ(P), and so every number mentioned in {ιi∣i∈I} does not exceed maxD+KdepthΠ(P). Thus, we have hin(P)≤maxD+KdepthΠ(P). Now suppose that P(c)@ι was obtained by (horn) from Pi(ci)@ιi, i∈I. Observe that depthΠ(Pi)<depthΠ(P) and, by the induction hypothesis, hin′(Pi)≤maxD+KdepthΠ(Pi). Since ι=⋂i∈Iιi, the maximal number mentioned in ι cannot exceed maxD+KdepthΠ(P). Thus, hin(P)≤maxD+KdepthΠ(P).
Consider now the case when P(c)@ι was obtained by applying (Sϱ) to Pi(ci)@ιi, i∈{1,2}. By the induction hypothesis, the largest number mentioned in ιi does not exceed maxD+KdepthΠ(Pi). On the other hand, depthΠ(Pi)<depthΠ(P) and the maximal number in ι cannot be larger that the maximal number in {ιi∣i∈{1,2}} plus K. Thus, the maximal number in ι does not exceed
[TABLE]
and so hin(P)≤maxD+KdepthΠ(P). The remaining temporal rules are similar and left to the reader.
∎
Suppose we are given a datalognrMTL program Π and a data instance D. If Π and D are inconsistent then, by Lemmas 5 and 11, we have \bot@ι∈CΠ,D, for some ι of the form (−∞,∞), ⟨dk,∞), (−∞,dk⟩, ⟨dk,dk′⟩, where k,k′∈Z and Ml≤dk≤dk′≤Mr. Thus, there is a derivation of \bot@ι from Π and D, that is, a tree whose root is \bot@ι, whose leaves are some atoms from D, and whose every non-leaf vertex results from applying one of the rules (coal), (horn), (Sϱ), (⊞ϱ), (Uϱ), (⊟ϱ) to the immediate predecessors of this vertex.
Lemma 12**.**
If \bot@ι∈CΠ,D then there is a derivation of \bot@ι from Π and D such that
(i)
the length of any branch in the derivation does not exceed 2∣Π∣;
(ii)
for some polynomial p, every non-leaf vertex, corresponding to the application of (coal) in the derivation, has at most 2p(∣Π∣,∣D∣) immediate predecessors.
Proof.
To show (i), it suffices to recall that Π is non-recursive (and so none of the rules in Π can be applied twice in the same branch of the derivation) and observe that we can always replace multiple successive applications of the rule (coal) with a single application.
Consistency checking for datalognrMTL programs is PSpace-complete for combined complexity. The lower bound holds even for propositional datalognrMTL.
Proof.
The upper bound is established by a standard algorithm (?, ?) using Lemma 12 and Savitch’s theorem according to which \textscNPSpace=\textscPSpace. In essence, the NPSpace algorithm guesses branches of the derivation one by one and keeps only last two branches in memory. By Lemma 12(i), each branch contains ≤2∣Π∣ atoms of the form P(c)@ι, where ι is as in Lemma 11, and so is stored in polynomial space. In addition, we store the axioms in Π that created these atoms, or (coal) if the atom was obtained by coalescing. In the latter case, we also need to guess a number k indicating how many distinct intervals are coalesced to obtain ι. By Lemma 12(ii), k≤2p(∣Π∣,∣D∣), and so it can be stored in polynomial space.
The lower bound is proved by reduction of the satisfiability problem for quantified Boolean formulas (QBFs), which is known to be PSpace-complete. Let φ=Qnpn…Q0p0φ0 be a QBF, where each Qi is either ∀ or ∃, and φ0=c0∧⋯∧cm is a propositional formula in CNF with ci=l0∨⋯∨lk, with each li being either a variable pj or its negation ¬pj, for 0≤j≤n. In our datalognrMTL program, we use the following propositional variables:
–
P0,…,Pn (to represent p0,…,pn from φ);
–
Pˉ0,…,Pˉn (to represent ¬p0,…,¬pn);
–
P00,…,P0n for p0; P11,…,P1n for p1, etc.; Pnn for pn, and similarly for ¬pi;
–
F0,…,Fn+1;
–
C0,…,Cm (to represent c0,…,cm).
We first take a data instance D with the following facts:
[TABLE]
Starting from this data, we can generate all the truth-assignments for the variables p0,…,pn using the following rules, where 0≤i≤n:
[TABLE]
The canonical model for D and the rules above for the variables p0,p1,p2 (thus, n=2) is shown in Fig. 1.
We then need the rules:
[TABLE]
for 0≤i≤m, 0≤j≤n. Note that F0 will hold at the moments of time corresponding to the assignments that make φ0 true. Further, we consider the formula φi=Qi−1pi−1…Q0p0φ0, for 1≤i≤n+1 (note that φn+1=φ), and provide rules that make Fi true precisely at the moments of time corresponding to the assignments that make φi true. We take
[TABLE]
for 0≤i≤n, and, finally,
[TABLE]
All the rules above form the required datalognrMTL program Π. We now prove that Π is consistent with D iff φ is not satisfiable. By Lemma 5, it suffices to show that Fn+1@[0,2n+1)∈CΠ,D iff φ is satisfiable. For (⇒), suppose Fn+1@[0,2n+1)∈CΠ,D. If Qn=∃ then, in view of (14), either Fn@[0,2n),Pn@[0,2n)∈CΠ,D or Fn@[2n,2n+1),Pˉn@[2n,2n+1)∈CΠ,D. If the first option holds, we show that φn is satisfiable when pn is true; if the second option holds, we show that φn is satisfiable when pn is false. Similarly, if Qn=∀, then by (15), we have Fn@[0,2n),Pn@[0,2n)∈CΠ,D and Fn@[2n,2n+1),Pˉn@[2n,2n+1)∈CΠ,D. In this case, we show that φn is satisfiable when pn can be both false and true. To show that Fn@[0,2n),Pn@[0,2n)∈CΠ,D implies that φn is satisfiable when pn is true (the other case is analogous and left to the reader), suppose Qn−1=∃. By (14), either Fn−1@[0,2n−1),Pn−1@[0,2n−1)∈CΠ,D or Fn@[2n−1,2n),Pˉn−1@[2n−1,2n)∈CΠ,D. (If Qn−1=∀, by (14) both of these options hold.) Therefore, to show that φ is satisfiable, it now suffices to show that (i) Fn−1@[0,2n−1),Pn−1@[0,2n−1)∈CΠ,D implies that φn−1 is satisfiable when pn is true and pn−1 is true; (ii) Fn−1@[2n−1,2n),Pˉn−1@[2n−1,2n)∈CΠ,D implies that φn−1 is satisfiable when pn is true and pn−1 is false. We only consider (i), leaving (ii)
to the reader, and after applying the argument above n times, will need to show that (i) F0@[0,1),P0@[0,1)∈CΠ,D implies that φ0 is satisfiable when pn,…,p1 and p0 are all true; (ii) F0@[1,2),Pˉ0@[1,2)∈CΠ,D implies that φ0 is satisfiable when pn,…,p1 are true while p0 is false. That (i) holds follows from (11)–(13), and similarly for (ii). This concludes the proof of (⇒); the other direction is proved analogously.
∎
Using the techniques of ? (?), it can be shown that nonrecursive Horn fragment of LTL is P-complete. The same complexity can be derived from the work by ? (?) for the nonrecursive Horn fragment of the Halpern-Shoham logic HS.
As we have just seen, the combined complexity of query answering drops from ExpSpace for datalogMTL to PSpace for datalognrMTL. We now show that the data complexity drops to AC0, which is important for practical query answering using standard database systems. Note that this result is non-trivial in view of Theorem 9. The crux of the proof is encoding coalescing by FO-formulas with ∀ (which is typically not needed for rewriting atemporal ontology-mediated queries).
Theorem 14**.**
Consistency checking and answering datalognrMTL queries is in AC0 for data complexity.
Proof.
We only consider a propositional datalognrMTL program Π. The proof can be straightforwardly adapted to the case of arity ≥1 by adding more (object) variables to the predicates used below. Let N be a set of comprising numbers or ∞,−∞. We use N+r as a shorthand for {t+r∣t∈N} and similarly for N−r (we assume that t+∞=∞ and t−∞=−∞).
For a propositional variable P in Π, we define two sets le(P) and ri(P) as follows:
–
le(P)=ri(P)={0} if there is no P′ such that P⋖P′;
–
otherwise, le(P) is the union of:
–
⋃i∈Ile(Pi), for each P←⋀i∈IPi in Π,
–
le(P2)+r1∪ri(P1), for each P←P1S⟨r1,r2⟩P2 in Π,
–
le(P2)−r2∪le(P1), for each P←P1U⟨r1,r2⟩P2 in Π,
–
le(P1)+r2, for each P←⊟⟨r1,r2⟩P1 in Π,
–
le(P1)−r1, for each P←⊞⟨r1,r2⟩P1 in Π,
and ri(P) is the union of:
–
⋃i∈Iri(Pi), for each P(τ)←⋀i∈IPi in Π,
–
ri(P2)+r2∪ri(P1), for each P←P1S⟨r1,r2⟩P2 in Π,
–
ri(P2)−r1∪le(P1), for each P←P1U⟨r1,r2⟩P2 in Π,
–
ri(P1)+r1, for each P←⊟⟨r1,r2⟩P1 in Π,
–
ri(P1)−r2, for each P←⊞⟨r1,r2⟩P1 in Π.
Using an argument that is similar to the proof of Lemma 11, one can show the following:
Lemma 15**.**
For any datalognrMTL program Π, any data instance D, and any P@⟨t1,t2⟩∈CΠ,D,
–
t1=t1′+n1, for some n1∈le(P) and some t1′ such that P′[t1′,t1′]∈D or P′(t1′,s2)∈D,
–
t2=t2′+n2, for some n2∈ri(P) and some t2′ such that P′[t2′,t2′]∈D or P′(s1,t2′)∈D.
In view of Lemma 15, we can prove Theorem 14 by constructing FO-formulas φP⟨m,n⟩(x,y) with m∈le(P) and n∈ri(P) such that, for any data instance D,
[TABLE]
where AD is the FO-structure defined below.
To slightly simplify presentation (and without much loss of generality), we assume that all numbers in num(D) are positive, and set
[TABLE]
where
–
Δ is a set of (ℓ+1)-many elements strictly linearly ordered by <, ℓ is the maximum of the number of distinct timestamps in D and the number of bits in the longest binary fraction in D (excluding the binary point); for simplicity, we assume that Δ={0,…,ℓ}, < is the natural order, and denote by nˉ the nth fraction in (num(D),<), counting from 0;
–
Pi[](n,n) holds in AD iff Pi@[nˉ,nˉ]∈D and Pi()(n,m) holds in AD iff Pi@(nˉ,mˉ)∈D, for any Pi occurring in D;
–
for nˉ=∞, bitin(n,i,0) (bitfr(n,i,0)) holds in AD iff the ith bit of the integer (respectively, fractional) part of nˉ is [math], and bitin(n,i,1) (bitfr(n,i,1)), for i∈Δ, holds in AD iff the ith bit of the integer (respectively, fractional) part of nˉ is 1 (as usual, we start counting bits from the least significant one);
–
for nˉ=∞, bitin(n,i,1) and bitfr(n,i,1) for all i∈Δ.
For example, the data instance D={P[110.001,110.001],P(10000,∞)} is given as the FO structure
[TABLE]
where Δ={0,…,6}, P[]={(0,0)}, P()={(1,2)}, and
[TABLE]
To construct the required φP⟨m,n⟩(x,y),
suppose that we have FO-formulas
–
coalP⟨m,n⟩(x,y) saying that P@⟨x+m,y+n⟩ is added to CΠ,D by an application of the rule (coal);
–
ψP⟨m,n⟩(x,y) saying that
either P@⟨x+m,y+n⟩ is added to CΠ,D because it belongs to the given data instance (in which case we can assume that m=n=0, and ⟨⟩ is either () or []),
or P@⟨x+m,y+n⟩ is added to CΠ,D as a result of an application of one of the ‘logical’ rules.
In this case we can set
[TABLE]
Using the predicate isa,b, which is ⊤ if a=b and ⊥ otherwise, we can define ψP⟨m,n⟩(x,y) as a disjunction of the following formulas:
–
is⟨,[∧is⟩,]∧ism,0∧isn,0∧P[](x,y);
–
is⟨,(∧is⟩,)∧ism,0∧isn,0∧P()(x,y);
–
for every P←1≤i≤k⋀Pi in Π,
[TABLE]
where inter⌈1m1,n1⌉1,…,⌈kmk,nk⌉k⟨m,n⟩(x,y,x1,y1,…,xk,yk) says that ⟨x+m,y+n⟩ is an intersection of ⌈1x1+m1,y1+n1⌉1,…,⌈kxk+mk,yk+nk⌉k (this formula can easily be defined in terms of the predicates x+m=y+n and x+m<y+n given below);
–
for every P←P1SϱP2 in Π, the formula σϱ,P,P1,P2⟨m,n⟩(x,y) saying that ⟨x+m,y+n⟩ is ((ι1c∩ι2)−oϱ)∩ι1c for some ι1 and ι2, where P1 and P2 hold, respectively
(we give a definition of σϱ,P,P1,P2⟨m,n⟩(x,y) in the appendix);
–
analogous formulas encoding the relevant operations on intervals for the other temporal operators.
The formula coalP⟨m,n⟩(x,y) is defined as follows:
[TABLE]
where nogapP,⟨m,n⟩l(z,x,y) is the formula
[TABLE]
and sub⟨m,n⟩⌈m′,n′⌉(x′,y′,x,y) says that ⌈x′+m′,y′+n′⌉ is a subinterval of ⟨x+m,y+n⟩. Intuitively, nogapP,⟨m,n⟩l(z,x,y) says that around the time instant z+l (that is, to the left and right of it as well as at z+l itself), their is no subinterval of ⟨x+m,y+n⟩ that is not covered by P. The five cases considered in the formula nogapP,⟨m,n⟩l(z,x,y) are illustrated in Fig. 2.
When evaluating φ⟨m,n⟩(x,y) over AD, we need to compute the truth-values of x+m=y+n and x+m<y+n (for fixed m and n). We regard the former as a formula with the predicates bitin, bitfr and < that is true just in case x=y+(n−m) if n≥m, and y=x+(m−n) otherwise. We provide a definition of x=y+c, for a positive c, in the appendix. A formula expressing x+m<y+n is constructed similarly and left to the reader.
Finally, we show how the formulas φP⟨m,n⟩(x,y) defined above can be used to check whether an interval ι=⟨ιb,ιe⟩ is a certain answer to (Π,P@x) over D. As follows from Lemma 15, if \bot@⌈t1,t2⌉∈CΠ,D then, for some m∈le(⊥), n∈ri(⊥) and some numbers t1′,t2′∈num(D) such that t1′ (t2′) occurs as the left (right) end of some interval, we have t1=t1′+m and t2=t2′+n.
Take the structure ADι that extends AD with the numbers ιb and ιe. By (16), ι is a certain answer to (Π,P@x) over D iff the formula
[TABLE]
holds true in ADι.
∎
5 Implementing datalognrMTL
Unfortunately, the (data independent) FO-rewriting (23) turns
out to be impractical because of the universal quantifier used for
coalescing in (17). It is well known that ∀ is implemented in SQL as ¬∃¬ resulting in suboptimal performance in general.
Having experimented with a few different approaches, we decided to use a materialisation (bottom-up) technique. In this section, we first present a bottom-up algorithm whose worst-case running time is linear in the number of intervals of an input data instance D, under a practically motivated assumption that the order of occurrence of the intervals in D coincides with the natural temporal order on those intervals. Then we describe how our algorithm can be implemented in SQL (with views). In particular, we consider two alternative implementations of coalescing in SQL.
5.1 Bottom-up algorithm
We first introduce some notation and obtain a few results about temporal tablesT with column names attr1,…,attrm,lpar,ledge,redge,rpar. A temporal table with m=0 will be called purely temporal.
We refer to the i-th row of T as T[i], to the value of the column attrj in the i-th row as T[i,attrj], and set T[i,attrj,…,attrk]=(T[i,attrj],…,T[i,attrk]). We assume that the columns ledge and redge store timestamps or special values for ∞,−∞, lpar stores [ or (, and rpar stores ] or ). Define an order ≺ on intervals by taking ⟨t1,t2⟩≺⌈s1,s2⌉ iff one of the following conditions holds:
–
t1<s1;
–
t1=s1, ⟨ is [, and ⌈ is (;
–
t1=s1, ⟨ and ⌈ are the same, and t2<s2;
–
t1=s1, ⟨ and ⌈ are the same, t2=s2, ⟩ is ), and ⌈ is ].
It should be clear that ≺ is a strict linear order on the set of all intervals. For example, we have [3,8)≺[4,7)≺(4,6)≺(4,7)≺(4,7]. (In fact, the results of this section will work with any other linear order over intervals.) We write T[i,lpar,ledge,redge,rpar]≺T′[j,lpar,ledge,redge,rpar] to say that the interval defined by the ith row of a temporal table T≺-precedes the interval given by the jth row of a temporal table T′.
We make the following temporal ordering assumption (or TOA), for any temporal table T with m attributes:
[TABLE]
For a purely temporal table T, this assumption means that the rows of T respect ⪯.
Let T[attrj,…,attrk] be the projection of T on the columns attrj,…,attrk that keeps only distinct tuples. We define ∣T∣o to be the cardinality of T[attr1,…,attrm] and ∣T∣t to be the cardinality of T[lpar,ledge,redge,rpar]. The first measure estimates how large the table is in terms of individual constants, while the second measure concerns the number of timepoints. For the tables of extensional predicates in our use-cases, ∣T∣o is much smaller than ∣T∣t.
We say that a table T is coalesced if it does not contain distinct tuples (c1,…,cm,⟨,t1,t2,⟩) and (c1,…,cm,⌈,t1′,t2′,⌉) such that ⟨t1,t2⟩∩⌈t1′,t2′⌉=∅. For a tuple of individual constants (c1,…,cm), let Tc1,…,cm be the set of all intervals ⟨t1,t2⟩ such that (c1,…,cm,⟨,t1,t2,⟩) occurs in T. For a set I of intervals, we then denote by coalesce(I) the (minimal) set of intervals that results from coalescing I. Finally, a coalescing of T is a minimal table, T∗, with the same columns as T such that the following condition holds:
(coalesce)
for any (c1,…,cm) in T[attr1,…,attrm] and ⟨t1,t2⟩ in coalesce(Tc1,…,cm), there exists (c1,…,cm,⟨t1,t2⟩) in T∗.
Clearly, T∗ is a coalesced table.
Lemma 16**.**
Suppose a table T satisfies TOA. Then its coalescing T∗ satisfying TOA and such that ∣T∗∣o=∣T∣o and ∣T∗∣t≤∣T∣t can be computed in time O(∣T∣o2×∣T∣t).
Proof.
Consider first a purely temporal table S that satisfies temporal ordering. There is a simple linear-time algorithm to produce a coalesced table S∗ that also satisfies temporal ordering. Indeed, initially we set ⟨b,e⟩=S[0,lpar,ledge,redge,rpar]. In a loop, we take each ⌈t1,t2⌉=S[i,lpar,ledge,redge,rpar] (clearly, ⟨b,e⟩≺⌈t1,t2⌉). If ⌈t1,t2⌉ and ⟨b,e⟩ are disjoint, we add ⟨b,e⟩ to S∗ and set ⟨b,e⟩=⌈t1,t2⌉. If they are not disjoint, we set ⟨b,e⟩=⌈t1,t2⌉∪⟨b,e⟩ and move on. It is easily checked that the resulting table S∗ is as required. Below, we refer to this algorithm as an imperative coalescing algorithm.
It only remains to explain how the algorithm above can be applied to
T in order to obtain the required complexity. Note that
∣T∣≤∣T∣o×∣T∣t and we can construct ∣T∣o-many
separate tables Tc1,…,cm, for each (c1,…,cm),
in time ∣T∣×∣T∣o. Then, we can apply the algorithm
described above to each Tc1,…,cm in time ∣T∣t and
merge the results. Therefore, the overall running time is
∣T∣×∣T∣o+∣T∣t×∣T∣o=O(∣T∣o2×∣T∣t).
∎
Before presenting our query answering algorithm, we determine the
complexity of computing temporal joins. Let T be a table with
attributes attr1,…,attrm,lpar,ledge,redge,rpar and let T′ be a table with attributes attr1′,…,attrn′,lpar,ledge,redge,rpar. A temporal join of T and T′ is a table
T′′ with attributes
attr1′′,…,attrk′′,ledge,redge,rpar such that
[TABLE]
and
(c1′′,…,ck′′,⟨,t1′′,t2′′,⟩) is in T′′
iff there exist two tuples (c1,…,cm,⌈,t1,t2,⌉) from T and (c1′,…,cn′,⌊,t1′,t2′,⌋) from T′ satisfying
the following conditions:
–
ci′′=cj, for all i,j such that attri′′=attrj;
–
ci′′=cj′, for all i,j such that attri′′=attrj′;
–
⌈t1,t2⌉∩⌊t1′,t2′⌋=∅ and ⟨t1′′,t2′′⟩=⌈t1,t2⌉∩⌊t1′,t2′⌋.
Lemma 17**.**
If T, T′ satisfy TOA, then a temporal join T′′ of T and T′
satisfying TOA and such that ∣T′′∣o≤∣T∣o×∣T′∣o,
∣T′′∣t≤∣T∣t+∣T′∣t can be computed in time
O(∣T∣o2×∣T′∣o2×(∣T∣t+∣T′∣t)).
Proof.
We first give an algorithm for computing the temporal join of purely
temporal tables S and S′. We assume that these tables are
coalesced (which can be done in time O(∣S∣) and O(∣S′∣)). The
algorithm works starting from the first tuples S[i] and S′[i′] of the tables. If
S[i]∩S′[i′]=∅, we write S[i]∩S′[i′] to
the output table S′′. Then, if S[i+1]⪰S′[i′+1], we set
i′:=i′+1 (without changing i); otherwise, i:=i+1. We
iterate until we have considered all the tuples in both
tables. Clearly, computing the full S′′ requires time
O(∣S∣+∣S′∣).
The complete algorithm for the tables T and T′ will first, similarly to the argument of Lemma 16, produce ∣T∣o-many
purely temporal tables Tc1,…,cm, for each
(c1,…,cm) occurring in T. Note that
∣Tc1,…,cm∣≤∣T∣t for each of those
tables. In the same way, we produce ∣T′∣o purely temporal tables Tc1′,…,cn′′, for
each (c1′,…,cn′) occurring in T′. It remains to apply
the temporal join algorithm described above to all pairs of tables
Tc1,…,cm and Tc1′,…,cn′′, which can be
done in the required time.
∎
Another operation on temporal tables we need is projection. Let T be
a table with column names as above and let
{attr1′,…,attrn′}⊆{attr1,…,attrm}. A projection of T on
attr1′,…,attrn′ is a table with columns
attr1′,…,attrn′,lpar,ledge,redge,rpar containing all
(c1′,…,cn′,⟨t1,t2⟩) such that some
(c1,…,cm,⟨t1,t2⟩) is in T and ci′=cj
whenever attri′=attrj.
As we have to preserve the temporal order, our algorithm for computing projections requires some attention. To show that a naïve projection does not preserve the temporal order, consider a table T with two tuples (a,[,1,1,]) and (b,[,0,0,]), which satisfies our temporal order assumption. The projection of T that removes the first column results is the table with two tuples ([,1,1,]) and ([,0,0,]), which is not ordered.
Lemma 18**.**
If T satisfies TOA, then a projection of T satisfying TOA can be computed in time O(∣T∣o2×∣T∣t).
Now, consider the union operation on pairs of tables T and T′ with the same columns that returns a table with all the tuples from the set T∪T′.
Lemma 19**.**
For any pair of tables T and T′ satisfying TOA, their union table also satisfying TOA can be computed in time O((∣T∣o2+∣T′∣o2)×(∣T∣t+∣T′∣t)).
The proofs of Lemmas 18 and 19 can be found in the appendix.
We are now in a position to describe the bottom-up query answering algorithm. Suppose we are given a program Π in normal form. Suppose also that each extensional predicate P is given by a table TP satisfying TOA. (This assumption can be made in all of our use-cases. Indeed, both tables TB_Sensor and Weather are naturally ordered by the timestamp, and our mappings (see Section 6) can be easily written in a way to take advantage of this order and produce tables T satisfying TOA.) Thus, we can assume that the given data instance D is represented by a set of TP, where each TP contains all the tuples (c1,…,cm,⟨,t1,t2,⟩) such that P(c1,…,cm)@⟨t1,t2⟩∈D.
Consider a predicate P and suppose that we have computed temporal tables TPi satisfying TOA, for each Pi with P⋖Pi (see Section 4). We assume that the TPi have (non-temporal) columns (Pi,1),…,(Pi,m). For each rule α in Π with P in the head, we compute a table TPα satisfying TOA. If α is of the form (2), we first compute the temporal join T of TP1,…,TPI (we change the names so that TPi has columns (Pi,τ1,1),…,(Pi,τm,m), where τi=(τ1,…,τm), and so all the tables TPi have distinct column names). Then we select from T only those tuples (c1,…,cn,⟨,t1,t2,⟩) for which ci=cj in case the column names for ci and cj mention the same variable x, and the tuples for which ci=a in case the column name for ci mentions the constant a. These two steps can be done in time O(∏i∣TPi∣o2×∑i∣TPi∣t), and the size of the resulting table does not exceed ∏i∣TPi∣o×∑i∣TPi∣t. It remains to perform projection in the following way. Suppose P(τ) with τ=(x1,…xm) is the head of α (if τ also contains constants, the procedure below can be easily modified). Then we keep only one column among all the columns named (Pi,xj,k), for each variable xj. It remains to rename the remaining (Pi,xj,k) to (P,j), for each j. The total time required to compute TPα is O(∏i∣TPi∣o2×∑i∣TPi∣t).
If α is of the form (4), provided that TP1 is coalesced, computing TPα reduces to using arithmetic operations for ι+cϱ, ι−cϱ, and ϱ⊑ι as in the rules (⊞ϱ)/(⊟ϱ), and projection. Therefore, TPα satisfying TOA can be computed in time ∣TP1∣o2×∣TP1∣t. Computing TPα for rules of the form (3) can be done in time O(∣TP1∣o×∣TP2∣o×(∣TP1∣t+TP2∣t)). Indeed, to construct TPα for a rule α of the form P(τ)←P1(τ1)SϱP2(τ2)), we follow the rule (Sϱ) and first produce a table TP1c with the same columns as TP1, where for each tuple of TP1, we apply the operation ⋅c to its interval. We then compute the temporal join T of TP1c and TP2 after applying the renaming described above. Then we compute T+oϱ by applying the operation +oϱ to the interval columns of each tuple in T, after which we compute the temporal join of T+oϱ and TP1c (with renaming applied to the columns of TP1c). To produce TPα, it remains to perform projection and renaming as described above. Finally, to compute TP, it is sufficient to compute the union of all TPα satisfying TOA. Thus, we obtain the following, where the degree of the rule (2) is ∣I∣, of (3) is 2, and of (4) is 1:
Lemma 20**.**
Let Π be a program and P a predicate in it such that K-many rules have P in the head, with R being the maximal degree of those rules, m the maximum of ∣TP′∣t among P′ such that P⋖P′, and n the maximum of ∣TP′∣o among those P′. Then TP is of size at most nRmRK and can be computed in time O(n2RmRK).
To compute the table for the goal Q, we iterate the described
procedure as many times as the length of the longest chain of
predicates in the dependence relation ⋖ for Π. Thus, we obtain:
Theorem 21**.**
Let m be the maximum of ∣TP∣t among the extensional predicates P, and n the maximum of ∣TP∣o among those P. The overall time required to compute the goal predicate Q of Π is exponential in the size of Π, polynomial in n, and linear in m.
Note that if all TP are extracted from one table R, as in our use-cases, then n corresponds to the number of individual tuples in R, whereas m to the number of temporal intervals. It is to be emphasised that, in practice, programs tend to be small, and the number of individual constants is also small compared to the number of temporal intervals. The theorem above explains the linear patterns in
our experiments below, where the size of individual tuples is
fixed.
5.2 Implementation in SQL
Now, we show how to rewrite a given datalognrMTL query (Π,Q(τ)@x) with Π in normal form
(2)–(4) to an SQL query computing
the certain answers (c,ι) to the query with
maximal intervals ι. We illustrate the idea by a (relatively) simple example.
Consider the datalognrMTL query (Π,HeatAffectedCounty(county)@x), where
[TABLE]
is part of the meteorological ontology from Section 6. First, we transform Π to normal form:
[TABLE]
We regard TempAbove24, TempAbove41, LocatedInCounty as extensional predicates given by the tables TTempAbove24, TTempAbove41, TLocatedInCounty. The first two of these tables have columns station_id,ledge,redge, and the third one station_id,county,ledge,redge. To simplify presentation, we omit the columns lpar and rpar used in the previous section and assume that all the temporal intervals take the form (t,t′]; see Section 6.
For each predicate P in Π, we also create a view (temporary
table) VP∗
with the same columns as TP. We set
VP∗=coalesce(TP), where coalesce is a query
that implements coalescing in SQL333It should not be confused with the standard
coalesce function in SQL that returns the first of its
arguments that is not null, or null if all of the arguments are null. We explain the idea behind this query for a temporal table T (as
mentioned above, we omit columns lpar,rpar). For
a moment of time t occurring in T, we denote by b≥(T,t) the
number of i such that T[i,ledge]≥t, and by
e≥(T,t) the number of i such that
T[i,redge]≥t; the numbers b≤(T,t) and
e≤(T,t) are defined analogously. It can be readily seen that
every t in T[ledge] such that b≥(T,t)=e≥(T,t)
is the beginning of some interval in the coalesced table
T∗. Similarly, every t′ in T[redge] such that
b≤(T,t′)=e≤(T,t′) is the end of some interval in
T∗. The coalesced intervals of T∗ can be then obtained as pairs
(t,t′′], where t is as above and t′′ is the minimum over those
t′ defined above that are ≥t. Thus, to coalesce
TTempAbove24 we first use the query
[TABLE]
which extracts the pairs (n,t), where t is as described above and station_id=n. An analogous query can be used to produce Vr, a table of pairs (n,t′), where t′ is as described above and station_id=n. Finally, we set
[TABLE]
A more efficient variant of this algorithm that uses window functions with sorting and partitioning allows us to avoid joins used, e.g., in the query Vl (?). We will refer to this algorithm in Section 7 as a standard SQL algorithm. In contrast to the imperative algorithm described in the proof of Lemma 16, this algorithm can be implemented using standard SQL operators.
In addition, for each intensional predicate P of Π, we create a view VP defined by an SQL query that reflects the definitions of P in Π. For example, we set
[TABLE]
This query implements the ι+cϱ operation for ϱ=[0,24h], and the WHERE clause checks whether ϱ⊑ι holds, where ι=(VTempAbove24∗.ledge,VTempAbove24∗.redge]. We then set VY∗=coalesce(VY) and note that the query
[TABLE]
when evaluated over the tables TTempAbove24, TTempAbove41 and TLocatedInCounty, would produce the answers to the query (Π,Y(station_id,county)@x) with maximal intervals ι=(ιb,ιe], where ιb corresponds to ledge, and ιe to redge.
We now explain how to construct queries for the concepts whose definitions involve ∧ using the example of HeatAffectedCounty:
[TABLE]
[TABLE]
where MN (MX) is the function that returns the earliest (latest) of any two given date/time values (it can be implemented in SQL as a user-defined function, or using the CASE operator). Finally, we use a query similar to (24) over VHeatAffectedCounty∗ to produce the answers to (Π,q(county,x)).
We are mostly interested in the scenario where the tables TP are not available immediately, but extracted from raw timestamped data tables R by means of mappings. In this case, we use views VP instead of TP defined over R. For example, if the raw data is stored in the table Weather, we define the view:
[TABLE]
Our general rewriting algorithm is outlined in Fig. 3,
where the function ans produces an SQL query that computes
the certain answers to (Π,Q(τ)@x) (with maximal
intervals) by evaluating the query over the input database
D.
The algorithm is a variation of the standard translation of non-recursive Datalog to
relational algebra—see, e.g., the work by ? (?)—extended with the operations on temporal intervals described above (they are underlined in Fig. 3).
It is to be noted that the ‘views’
introduced by the algorithm do not require modifying the underlying
database. They can be implemented in different ways: for example, by
using subqueries, common table expressions (CTEs), or temporary
tables. For the experiments in Section 7, we use the
last approach, where temporary tables are generated on the fly
and exist only within a transaction.
6 Use Cases
We test the feasibility of OBDA with datalognrMTL by querying Siemens turbine log data and MesoWest weather data. In this section, we briefly describe these use cases; detailed results of our experiments will presented in Section 7.
6.1 Siemens
Siemens service centres store aggregated turbine sensor data in tables such as TB_Sensor. The data comes with (not necessarily regular) timestamps t1,t2,…, and it is deemed that the values remain constant in every interval [ti,ti+1).
Using a set of mappings, we extract from these tables a data instance containing ground facts such as
[TABLE]
For example, the first two of them are obtained from the table TB_Sensor
using the following SQL mapping M:
[TABLE]
In terms of the basic predicates above, we define more complex ones
that are used in queries posed by the Siemens engineers:
[TABLE]
6.2 MesoWest
The
MesoWest (http://mesowest.utah.edu/) project makes
publicly available historical records of the weather stations across
the US showing such parameters of meteorological conditions as
temperature, wind speed and direction, amount of precipitation,
etc. Each station outputs its measurements with some periodicity, with
the output at time ti+1 containing the accumulative (e.g., for
precipitation) or averaged (e.g., for wind speed) value over the
interval (ti,ti+1].
The data comes in a table Weather, which looks as follows:
[TABLE]
One more table, Metadata, provides some atemporal meta information about the stations:
[TABLE]
The monitoring and historical analysis of the weather involves answering queries such as ‘find showery counties, where one station observes precipitation at the moment, while another one does not, but observed precipitation 30 minutes ago’.
We use SQL mappings over the Weather table similar to those in the Siemens case to obtain ground atoms such as
[TABLE]
(according to the standard definition, the hurricane force wind is above 118 km/h). On the other hand, mappings to the Metadata table provide atoms such as
[TABLE]
Our ontology contains definitions of various meteorological terms:
[TABLE]
[TABLE]
7 Experiments
To evaluate the performance of the SQL queries produced by the datalognrMTL rewriting algorithm outlined in Section 5.2, we developed two benchmarks for our use
cases.
We ran the experiments on an HP Proliant server with 2 Intel Xeon
X5690 Processors (with 12 logical cores at 3.47GHz each), 106GB of RAM
and five 1TB 15K RPM HD. We used both PostgreSQL 9.6 and the SQL
interface (?) of Apache Spark
2.1.0. Apache Spark is a cluster-computing framework that provides
distributed task dispatching, scheduling and data parallelisation. For
each of these two systems, we provided two different implementations,
imperative and standard SQL, which diverge in the computation of
maximal intervals; see
Section 5.
We ran all the queries with a timeout of 30 minutes.
7.1 Siemens
Siemens provided us with a sample of data for one running
turbine, which we denote by tb0, over 4 days in the form of the
table TB_Sensor. The data table was rather sparse, containing a lot of nulls, because different sensors recorded data at
different frequencies. For example, ActivePower arrived most frequently with average periodicity of 7 seconds, whereas the values for the field MainFlame arrived most rarely, every 1 minute on average.
We replicated this sample to imitate the data for one turbine over 10
different periods ranging from 32 to 320 months. The statistics of the
data sets are given in Tables 4 and 13.
We evaluated four queries
ActivePowerTrip(tb0)@x,
NormalStart(tb0)@x,
NormalStop(tb0)@x and
NormalRestart(tb0)@x. The statistics of returned answers is given in Table 10.
The execution times for the Siemens use case are given in
Fig. 5.
Although Apache Spark was designed to perform
efficient parallel computations, it failed to take advantage of this
feature due to the fact that the Siemens data could not be partitioned by
mapping each part to a separate core.
PostgreSQL 9.6 also supports parallel query execution in some
cases. However, as many operators (e.g., scans of temporary tables) in our queries are classified either
‘parallel unsafe’ or ‘parallel restricted’ in the parallel safety
documentation (?),
the query planner failed to produce any parallel execution strategy in
our case.
The reason why PostgreSQL outperformed Apache Spark is that the latter does not provide a convenient way to define proper indexes over temporary tables, which leads to quadratically growing running times.
On the other hand, PostgreSQL shows
linear growth
in the size of
data
(confirming theoretical results since we deal with a single turbine).
Note that the normal restart (start) query timeouts on the data for
more than 18 (respectively, 21) years, which is more than enough for
the monitoring and diagnostics tasks at Siemens, where the two most common
application scenarios for sensor data analytics are daily monitoring (that is, analytics of high-frequency data of the previous 24 hours) and
fleet-level analytics of key-performance indicators over one year. In
both cases, the computation time of the results is far less a crucial
cost factor than the lead-time for data preparation.
7.2 MesoWest
In contrast to the Siemens case, the weather
tables contain very few nulls. Normally, the data values arrive with periodicity from 1 to 20 minutes. We tested the performance of our
algorithm by increasing (i) the temporal span (with some necessary increase of the spatial spread) and (ii) the
geographical spread of data.
For (i), we took the New
York state data for the 10 continuous periods between 2005 and
2014; see Tables 6 and 14. As each year around 70 new weather stations were added, our 10 data samples increase more than linearly in size.
For (ii), we fixed the time
period of one year (2012) and linearly increased the data from 1 to 19
states (NY, NJ, MD, DE, GA, RI, MA, CT, LA, VT, ME, WV, NH, NC, MS, SC, ND,
KY, SD); see Table 7 and 15.
In both cases, we executed four datalognrMTL queries
ShoweryCounty(v)@x,
HurricaneAffectedState(NY)@x,
HeatAffectedCounty(v)@x,
CyclonePatternState(NY)@x.
The statistics of the returned answers is shown in Tables 11 and 12.
The execution times are shown in Figures 8 and 9. All the four
queries can be answered within the time limit. The most expensive
one is the cyclone pattern state query because its definition
includes a join of four atoms for winds in four directions, each with
a large volume of instances.
All the graphs in Figures 9 and 8 exhibit linear
behaviour with respect to the size of data.
The nearly tenfold better performance of Spark over PostgreSQL can be explained by the fact that, unlike the data in the Siemens case, the MesoWest data is highly
parallelisable. Since it was collected from hundreds of
different weather stations, it can be partitioned by station id,
state, county, etc. to perfectly fit the MapReduce programming model
extended with resilient distributed datasets
(RDDs) (?). In this case,
Apache Spark is able to take advantage of the multi-core and large
memory hardware infrastructure,
to compute mappings and coalescing in parallel,
making it 10 times faster than
PostgreSQL; see Figures 8
and 9.
Overall, the results of the experiments look very encouraging: our
datalognrMTL query rewriting algorithm produces SQL queries that are
executable by a standard database engine PostgreSQL in acceptable
time, and by a cluster-computing framework Apache Spark in better than
acceptable time (in case data can be properly partitioned) over large
sets of real-world temporal data of up to 8.3GB in CSV format. The
relatively challenging queries such as NormalRestart and
CyclonePatternState require a large number of temporal
joins, which turn out to be rather expensive.
8 Conclusions and Future Work
To facilitate access to sensor temporal data with the aim of monitoring and diagnostics, we suggested the ontology
language datalogMTL, an extension of datalog with the Horn fragment of
the metric temporal logic MTL (under the continuous semantics). We showed that answering datalogMTL
queries is ExpSpace-complete for combined complexity, but
becomes undecidable if the diamond operators are allowed in the head
of rules. We also proved that answering nonrecursive datalogMTL queries is PSpace-complete for combined complexity and in AC0 for data complexity. We tested feasibility and efficiency of OBDA with datalognrMTL on two real-world use cases by querying Siemens turbine data and MesoWest weather data. Namely, we designed datalognrMTL ontologies defining typical concepts used by Siemens engineers and various meteorological terms, developed and implemented an algorithm rewriting datalognrMTL queries into SQL queries, and then executed the SQL queries obtained by this algorithm from our ontologies over the Siemens and MesoWest data, showing their acceptable efficiency and scalability.
(To the best of our knowledge, this is the first work on practical OBDA with temporal ontologies, and so no other systems with similar functionalities are available for comparison.)
Based on these encouraging results, we plan to include our temporal
OBDA framework into the Ontop platform (?, ?, ?); visit http://ontop.inf.unibz.it/ for more information on Ontop. Note also that datalogMTL presented here has been recently used to develop an ontology of ballet moves (see Example 2) that underlies a search engine of annotated sequences in ballet videos (?). This is a third use case for our framework (and we are aware of a few more emerging use cases), which makes an efficient and user-friendly implementation of the framework a top priority.
We are also working on the streaming data setting, where the challenge
is to continuously evaluate queries over the incoming data.
A rule-based language with window operators for analysing streaming data has been suggested by ? (?). This language is very expressive as it uses an abstract semantics for window operators (which does not have to guarantee decidability) and allows negation and disjunction in the rules. It would be interesting to identify and adapt a suitable fragment of this language in our temporal OBDA framework.
Acknowledgements
This work was supported by the UK EPSRC grant EP/M012670 ‘iTract: Islands of Tractability in Ontology-Based Data Access’ and by the OBATS project at the Free University of Bozen-Bolzano.
Guohui Xiao is the corresponding author of this article.
The formula σϱ,P,P1,P2⟨m,n⟩(x,y) is defined as follows:
[TABLE]
where plusoϱ,⌈4m4,n4⌉4⌈5m5,n5⌉5(x5,y5,x4,y4) is an (obvious) formula saying that ⌈5x5+m5,y5+n5⌉5 is the interval ⌈4x4+m4,y4+n4⌉4+oϱ.
The formula x=y+c, for a non-negative c, is defined as follows. For c=∞, we take the formula
[TABLE]
whereas for a constant c=h/2k, we can use
[TABLE]
where predicates bit+h/2kin(y,j,v), saying that v is the j-th bit of the integer part of y+h/2k, and bit+h/2kfr(y,j,v), saying that v is the j-th bit of the fractional part of y+h/2k, are defined inductively as follows:
If T satisfies TOA, then a projection of T satisfying TOA can be computed in time O(∣T∣o2×∣T∣t).
Proof.
We first partition T into a set of purely temporal tables
Tc1,…,cm and compute the set of all individual tuples
(c1′,…,cn′) that will appear in the projection T′. Let
(c1′,…,cn′) be one such tuple, and consider the tables
Tc11,…,cm1,…,Tc1k,…,cmk such
that the projection of each (c1i,…,cmi) is precisely
(c1′,…,cn′). Clearly, we have at most ∣T∣o such
tables. It is well-known that, for a pair of ordered tables S and
S′, we can construct an ordered table that contains all the tuples
S∪S′ in time ∣S∣+∣S′∣. We use this algorithm k times to
obtain an ordered table containing all the tuples of
Tc11,…,cm1∪⋯∪Tc1k,…,cmk
in time O(k∣T∣o). We then write the tuples of the form
(c1′,…,cn′,⟨,t1,t2,⟩), where
(⟨,t1,t2,⟩) is a tuple from the united table,
into the output table. It can be readily checked that the complete
output table can be produced in the required time.
∎
Lemma**.**
For any pair of tables T and T′ satisfying TOA, their union table also satisfying TOA can be computed in time O((∣T∣o2+∣T′∣o2)×(∣T∣t+∣T′∣t)).
Proof.
We first partition T and T′ into sets of purely temporal tables Tc1,…,cm and, respectively, Tc1,…,cm′. While doing this partition, we make sure that the tables Tc1,…,cm are stored sequentially with respect to some order on the tuples (c1,…,cm) (it can be done in time ∣T∣o2×∣T∣t). We do the same for the tables Tc1,…,cm′. It remains to go through all the tuples ⟨,t1,t2,⟩ and ⌈,t1′,t2′,⌉ in all the tables Tc1,…,cm and Tc1,…,cm′ to produce the union table by an algorithm similar to the one applied to the tables S and S′ in the proof of Lemma 18.
∎
Experimental Results
Here, CSV is the size of the data in CSV format; PostgreSQL (raw size) is the size of the data itself stored in PostgreSQL reported by the pg_relation_size function; PostgreSQL (total size) is the size of the total data (including the index)
stored in PostgreSQL reported by the pg_total_relation_size
function; and Parquet is the size of the data in the Apache Parquet format, used by Apache Spark.
Bibliography50
The reference list from the paper itself. Each links out to its DOI / PubMed record.
1Abiteboul, Hull, and Vianu Abiteboul, S., Hull, R., and Vianu, V. (1995). Foundations of Databases . Addison-Wesley.
2Alur, Feder, and Henzinger Alur, R., Feder, T., and Henzinger, T. A. (1996). The benefits of relaxing punctuality. J. ACM , 43 (1), 116–146.
3Alur and Henzinger Alur, R., and Henzinger, T. A. (1993). Real-time logics: Complexity and expressiveness. Inf. Comput. , 104 (1), 35–77.
4Armbrust, Xin, Lian, Huai, Liu, Bradley, Meng, Kaftan, Franklin, Ghodsi, and Zaharia Armbrust, M., Xin, R. S., Lian, C., Huai, Y., Liu, D., Bradley, J. K., Meng, X., Kaftan, T., Franklin, M. J., Ghodsi, A., and Zaharia, M. (2015). Spark SQL: relational data processing in spark. In Sellis, T. K., Davidson, S. B., and Ives, Z. G. (Eds.), Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31 - June 4, 2015 , pp. 1383–13
5Arora and Barak Arora, S., and Barak, B. (2009). Computational Complexity: A Modern Approach (1st edition). Cambridge University Press, New York, USA.
6Artale, Kontchakov, Wolter, and Zakharyaschev Artale, A., Kontchakov, R., Wolter, F., and Zakharyaschev, M. (2013). Temporal description logic for ontology-based data access. In Proc. of the 23rd Int. Joint Conf. on Artificial Intelligence, IJCAI 2013 . IJCAI/AAAI.
7Artale, Kontchakov, Kovtunova, Ryzhikov, Wolter, and Zakharyaschev Artale, A., Kontchakov, R., Kovtunova, A., Ryzhikov, V., Wolter, F., and Zakharyaschev, M. (2015). First-order rewritability of temporal ontology-mediated queries. In Proc. of the 24th Int. Joint Conf. on Artificial Intelligence, IJCAI 2015 , pp. 2706–2712. IJCAI/AAAI.
8Artale, Kontchakov, Kovtunova, Ryzhikov, Wolter, and Zakharyaschev Artale, A., Kontchakov, R., Kovtunova, A., Ryzhikov, V., Wolter, F., and Zakharyaschev, M. (2017). Ontology-mediated query answering over temporal data: A survey (invited talk). In TIME , Vol. 90 of LIP Ics , pp. 1:1–1:37. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik.