A benchmark library for parametric timed model checking
Andr\'e \'Etienne (LIPN)

TL;DR
This paper introduces a comprehensive benchmark library for parametric timed model checking, facilitating fair comparison and evaluation of various tools and techniques in verifying real-time systems with uncertain timing constraints.
Contribution
It provides a curated collection of benchmarks, including academic, industrial, and challenging unsolvable cases, to support the development and assessment of parametric timed model checking tools.
Findings
Benchmark library includes diverse real-world and theoretical cases.
Enables systematic comparison of different verification techniques.
Highlights existing limitations and challenges in the field.
Abstract
Verification of real-time systems involving hard timing constraints and concurrency is of utmost importance. Parametric timed model checking allows for formal verification in the presence of unknown timing constants or uncertainty (e.g. imprecision for periods). With the recent development of several techniques and tools to improve the efficiency of parametric timed model checking, there is a growing need for proper benchmarks to test and compare fairly these tools. We present here a benchmark library for parametric timed model checking made of benchmarks accumulated over the years. Our benchmarks include academic benchmarks, industrial case studies and examples unsolvable using existing techniques.
| Benchmark | Ref | Domain | Scal. | L/U | Inv | SW | Prop. | Time | ||||
| Academic | ||||||||||||
| And-Or | [CC05] | Circuit | 4 | 4 | 12 | 0 | Misc. | 3.01 | ||||
| CSMA/CD | [KNSW07] | Protocol | 3 | 3 | 3 | 0 | Unavoid. | ? | ||||
| Fischer-AHV93 | [AHV93] | Protocol | 3 | 2 | 4 | 0 | L/U | Safety | 0.04 | |||
| Fischer-HRSV02:3 | [HRSV02] | Protocol | 3 | 3 | 4 | 1 | L/U | Safety | HS | |||
| Flip-flop:2 | [CC07] | Circuit | 5 | 5 | 2 | 0 | U | Misc. | 0.04 | |||
| Flip-flop:12 | [CC07] | Circuit | 5 | 5 | 12 | 0 | U | Misc. | 23.07 | |||
| idle-time-sched:3 | [LSAF14] | RTS | 8 | 13 | 2 | 3 | U | Safety | 1.49 | |||
| idle-time-sched:5 | [LSAF14] | RTS | 12 | 21 | 2 | 0 | U | Safety | 14.61 | |||
| Jobshop:3-4 | [AM01] | Sched. | 2 | 3 | 12 | 4 | Opt. reach. | 5.58 | ||||
| Jobshop:4-4 | [AM01] | Sched. | 4 | 4 | 16 | 4 | Opt. reach. | T.O. | ||||
| NP-FPS-3tasks:50-0 | [JLR13] | RTS | 4 | 6 | 2 | 0 | Safety | 1.03 | ||||
| NP-FPS-3tasks:100-2 | [JLR13] | RTS | 4 | 6 | 2 | 0 | Safety | 65.23 | ||||
| SSLAF14-1 | [PGGH98, SSL+13] | RTS | 7 | 16 | 2 | 2 | Safety | 0.33 | ||||
| SSLAF14-2 | [WTVL06, SSL+13] | RTS | 6 | 14 | 2 | 4 | Safety | T.O. | ||||
| ProdCons:2-3 | [KP12] | Prod.-cons. | 5 | 5 | 6 | 0 | L/U | Reach. | T.O. | |||
| train-AHV93 | [AHV93] | TGC | 3 | 3 | 6 | 0 | L/U | Safety | 0.01 | |||
| WFAS | [BBLS15] | Protocol | 3 | 4 | 2 | 0 | Safety | T.O. | ||||
| Industrial | ||||||||||||
| accel:1 | [HAF14, AHW18] | PTPM | 2 | 2 | 3 | 0 | PTPM | 1.25 | ||||
| accel:10 | [HAF14, AHW18] | PTPM | 2 | 2 | 3 | 0 | PTPM | 12.67 | ||||
| BRP | [DKRT97] | Protocol | 6 | 7 | 2 | 12 | Safety | 248.35 | ||||
| FMTV:1A1 | [SAL15] | RTS | 3 | 3 | 3 | 5 | Opt. reach. | 6.97 | ||||
| FMTV:1A3 | [SAL15] | RTS | 3 | 3 | 3 | 7 | Opt. reach. | 87.39 | ||||
| FMTV:2 | [SAL15] | RTS | 6 | 9 | 2 | 0 | Opt. reach. | 1.61 | ||||
| gear:1 | [HAF14, AHW18] | PTPM | 2 | 2 | 3 | 0 | PTPM | 0.77 | ||||
| gear:10 | [HAF14, AHW18] | PTPM | 2 | 2 | 3 | 0 | PTPM | 7.42 | ||||
| RCP | [CAS01] | Protocol | 5 | 6 | 5 | 6 | L/U | Reach. | 1.07 | |||
| SIMOP:3 | [ACD+09] | Automation | 5 | 8 | 3 | 0 | Reach. | T.O. | ||||
| SPSMALL:2 | [CEFX09] | Circuit | 11 | 11 | 2 | 0 | Reach. | 0.96 | ||||
| SPSMALL:26 | [CEFX09] | Circuit | 11 | 11 | 26 | 0 | Reach. | T.O. | ||||
| Toy | ||||||||||||
| toy:n | Fig. 1(b) | Toy | 1 | 2 | 1 | 0 | Reach. | HS | ||||
| toy:1/n | Fig. 1(a) | Toy | 1 | 2 | 1 | 0 | U | Reach. | HS | |||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
11institutetext: Université Paris 13, LIPN, CNRS, UMR 7030, F-93430, Villetaneuse, France 22institutetext: JFLI, CNRS, Tokyo, Japan 33institutetext: National Institute of Informatics, Japan
A benchmark library for parametric timed model checking††thanks: This work is partially supported by the ANR national research program PACS (ANR-14-CE28-0002)
and by ERATO HASUO Metamathematics for Systems Design Project (No. JPMJER1603), JST.
Étienne André 112233 0000-0001-8473-9555
Abstract
Verification of real-time systems involving hard timing constraints and concurrency is of utmost importance. Parametric timed model checking allows for formal verification in the presence of unknown timing constants or uncertainty (e. g., imprecision for periods). With the recent development of several techniques and tools to improve the efficiency of parametric timed model checking, there is a growing need for proper benchmarks to test and compare fairly these tools. We present here a benchmark library for parametric timed model checking made of benchmarks accumulated over the years. Our benchmarks include academic benchmarks, industrial case studies and examples unsolvable using existing techniques.
Keywords:
case studies model checking parameter synthesis parametric timed automata
1 Introduction
Verification of real-time systems involving hard timing constraints and concurrency is of utmost importance, and is now recognized in standards such as the DO-178C, that allows formal methods without addressing specific process requirements. Model checking is a popular model-based technique that formally verifies whether a model satisfies a property. Parametric timed model checking significantly enhances model checking by allowing its application earlier in the design phase, when timing constants may not be known yet. In addition, it is possible to verify systems in the presence of uncertainty, e. g., when some periods are known with some limited precision. This is the case of Thales’ FMTV111“Formal Methods for Timing Verification Challenge”, in the WATERS workshop: http://waters2015.inria.fr/ challenge 2014 where the system was characterized with uncertain but constant periods, that rules out the use of non-parametric timed model checking.
Popular formalism for parametric timed model checking include parametric timed automata (PTAs) [AHV93] and parametric time Petri nets [TLR09].
Several tools support parameters, such as HyTech [HHWT95] (parametric hybrid automata), Romeo [LRST09] (parametric time Petri nets), IMITATOR [AFKS12] (parametric timed automata), PSyHCoS [ALS*+*13] (parametric stateful timed CSP), or Symrob (robustness for timed automata) [San15]. In addition, several tools support the larger class of hybrid automata, such as PHAVer [Fre08] or SpaceEx [FLGD*+*11] and, while not explicitly supporting parameters, can encode them.222In a hybrid automaton, a parameter is a variable that can evolve for an arbitrary amount of time at rate 1, and is then “frozen” (rate 0).
Recently, a growing number of analyses and techniques were proposed to analyze parametric timed models (mainly PTAs) such as SMT-based techniques [KP12], integer hull abstractions [JLR15], corner-point abstractions [BBLS15], distributed verification [ACN15], NDFS-based synthesis [NPvdP18], machine learning [AL17, LSGA17], etc. However, despite some case studies informally shared between these works, there is a lack of a common basis to compare new tools and techniques in a fair manner. Without a stable list of benchmarks publicly available, it is difficult to assess the efficiency of a new algorithm.
Contribution
We present here a library of benchmarks containing academic and industrial case studies collected in the past few years from academic papers and industrial collaborations. In addition, a focus is made on (possibly toy) examples known to be unsolvable using current state-of-the-art techniques, with the hope to encourage the development of new techniques to solve them. Benchmarks are available online in the IMITATOR input format, and distributed using the GNU General Public License.
Related libraries
The library most related to ours is that by Chen et al., that proposes a suite of benchmarks for hybrid systems [CSBM*+*15]. However, it aims at analyzing hybrid systems, which are strictly more expressive than PTAs in theory, and incomparable in practice, as most hybrid systems do not feature timing parameters. In addition, that benchmark suite focuses only on reachability properties. Finally and most importantly, it does not focus on parameters, and the benchmarks are non-parametric. In contrast, our library focuses on parametric timed benchmarks, with various types of properties.
Another interesting library is that by Hoxha, Abbas, and Fainekos [HAF14], that offers Matlab/Simulink models of automotive systems. However, it does not aim specifically at parametric timed model checking; two of our benchmarks originally partially come from the aforementioned library [HAF14].
2 IMITATOR parametric timed automata
Parametric timed automata extend finite-state automata with clocks, i. e., real-valued variables evolving at the same rate. Clocks can be reset along transitions, and can be compared to constants or parameters (integer- or rational-valued) along transitions (“guards”) or in locations (“invariants”). IMITATOR parametric timed automata extend PTAs [AHV93] with some useful features such as synchronization between components, stopwatches (i. e., the ability to stop the elapsing of some clocks [CL00]), presence of parametric linear terms in guards, invariants and resets, shared global rational-valued variables, etc.
Example 1
Consider the PTA in Fig. 1(a), containing two locations and , two clocks {\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}x} and {\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}y}, and one parameter {\color[rgb]{1,0.6,0.0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0.6,0.0}p}. The self-loop on can be taken whenever {\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}x}={\color[rgb]{1,0.6,0.0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0.6,0.0}p} holds, and resets {\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}x}, i. e., can be taken every {\color[rgb]{1,0.6,0.0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0.6,0.0}p} time units. In addition, initially, as {\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}x}={\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}y}=0 and clocks evolve at the same rate, the transition guarded by {\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}y}=1\land{\color[rgb]{0.4,0.4,1}\definecolor[named]{pgfstrokecolor}{rgb}{0.4,0.4,1}x}=0 cannot be taken. Observe that, if {\color[rgb]{1,0.6,0.0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0.6,0.0}p}=1, then the transition to can be taken after exactly one loop on . If {\color[rgb]{1,0.6,0.0}\definecolor[named]{pgfstrokecolor}{rgb}{1,0.6,0.0}p}=\frac{1}{2}, then the transition to can be taken after exactly two loops. In fact, the set of valuations for which is reachable is exactly .
L/U-PTAs
Lower-bound/upper-bound parametric timed automata (L/U-PTAs) [HRSV02] restrict the use of parameters: parameters must be partitioned between lower-bound parameters (always compared with clocks as lower bounds, i. e., or ) and upper-bound parameters. L/U-PTAs enjoy monotonicity properties and, while the full class of PTAs is highly undecidable [And18], L/U-PTAs enjoy some decidability results [HRSV02, BL09, ALR18b]. U-PTAs [BL09, ALR18a] are L/U-PTAs with only upper-bound parameters.
3 The benchmark library
3.1 Categories
Our benchmarks are classified into three main categories:
academic benchmarks, studied in a range of papers: a typical example is the Fischer mutual exclusion protocol; 2. 2.
industrial case studies, which correspond to a concrete problem solved (or not) in an industrial environment; 3. 3.
examples famous for being unsolvable using state-of-the-art techniques; for some of them, a solution may be computed by hand, but existing automated techniques are not capable of computing it. This is the case of the PTA in Fig. 1(a), as a human can very easily solve it, while (to the best of our knowledge) no tool is able to compute this result automatically.
Remark 1
Our library contains a fourth category: education benchmarks, that consist of generally simple case studies that can be used for teaching. This category contains toy examples such as coffee machines. We omit this category from this paper as these benchmarks generally have a limited interest performance wise.
The domain of the benchmarks are hardware asynchronous circuits, communication or mutual exclusion protocols, real-time systems (“RTS”) and schedulability problems, parametric timed pattern matching (“PTPM”), train-gate-controllers models (“TGC”), etc.
In addition, we use the following classification criteria:
- •
number of variables: clocks, parameters, locations, automata;
- •
whether the benchmark (in the provided version) is easily scalable, i. e., whether one can generate a large number of instances; for example, protocols often depend on the number of participants, and can therefore be scaled accordingly;
- •
presence of shared rational-valued variables;
- •
presence of stopwatches;
- •
presence of location invariants, as some works (e. g., [AHV93, ALR18a]) exclude them;
- •
whether the benchmark meets the L/U assumption.
3.2 Properties
We consider the three following main properties:
reachability / safety:
synthesize parameter valuations for which a given state of the system (generally a location, but possibly a constraint on variables) must be reachable / avoided (see e. g., [JLR15]).
optimal reachability:
same as reachability, but with an optimization criterion: some parameters (or the time) should be minimized or maximized.
unavoidability:
synthesize parameter valuations for which all runs must always eventually reach a given state (see e. g., [JLR15]).
robustness:
synthesize parameter valuations preserving the discrete behavior (untimed language) w.r.t. to a given valuation (see e. g., [ACEF09, San15]).
In addition, we include some recent case studies of parametric timed pattern matching (“PTPM” hereafter), i. e., being able to decide for which part of a log and for which values of parameters does a parametric property holds on that log [AHW18]. Finally, a few more case studies have ad-hoc properties (liveness, properties expressed using observers [ABBL98, And13], etc.), denoted “Misc.” later on.
3.3 Presentation
The benchmark library comes in the form of a Web page that classifies models and is available at https://www.imitator.fr/library.html.
The library is made of a list of a set of benchmarks. Each benchmark may have different models: for example, Flip-flop comes with three models, one with 2 parameters, one with 5, and one with 12 parameters. Similarly, some Fischer benchmarks come with several models, each of them corresponding to a different number of processes. Finally, each model comes with one or more properties. For example, for Fischer, one can either run safety synthesis, or evaluate the robustness of a given reference parameter valuation.
The first version of the library contains 34 benchmarks with 80 different models and 122 properties.
3.4 Performance
We present a selection of the library in Table 1. Not all benchmarks are given; in addition, most benchmarks come with several models and several properties, omitted here for space concern. We give from left to right the number of automata, of clocks, of parameters, of discrete variables, whether the model is an L/U-PTA, a U-PTA or a regular PTA, whether it features invariants and stopwatches, the kind of property, and a computation time on an Intel i7-7500U CPU @ 2.70GHz with 8 GiB running Linux Mint 18.
“T.O.” denotes time-out (after 300 s). “?” denotes unsolvable, because no such algorithm is implemented in existing tools. “HS” denotes time-out but human-solvable: e. g., for Fischer, one knows the correctness constraint independently of the number of processes, but tools may fail to compute it. This is also the case of the toy PTAs in Figs. 1(a) and 1(b).
Despite time-out, some case studies come with a partial result: either because IMITATOR is running reachability-synthesis (“EFsynth” [JLR15]) which can output a partial result when interrupted before completion, or because some other methods can output some valuations. For example, for ProdCons, IMITATOR is unable to synthesize a constraint; however, in the original work [KP12], some punctual valuations (non-symbolic) are given.
Robustness case studies are not part of Table 1, but are included in the online library.
4 Perspectives
Syntax
So far, all benchmarks use the IMITATOR input format; in addition, only if the benchmark comes from another model checker (e. g., a HyTech or Uppaal model), it also comes with its native syntax. In a near future, we plan to propose a translation to Uppaal timed automata; however, some information will be lost as Uppaal does not allow parameters, and supports stopwatches in a limited manner. A future work will be to propose other syntaxes, or a normalized syntax for parametric timed model checking benchmarks.
Contributions and versioning
The library is aimed at being enriched with future benchmarks. Furthermore, it is collaborative, and is open to any willing contributor. A versioning system will be set up with the addition (or modification) of benchmarks in the future.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[ABBL 98] Luca Aceto, Patricia Bouyer, Augusto Burgueño, and Kim Guldstrand Larsen. The power of reachability testing for timed automata. In Vikraman Arvind and Ramaswamy Ramanujam, editors, Proceedings of the 18th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 1998) , volume 1530 of Lecture Notes in Computer Science , pages 245–256. Springer, 1998.
- 2[ACD + 09] Étienne André, Thomas Chatain, Olivier De Smet, Laurent Fribourg, and Silvain Ruel. Synthèse de contraintes temporisées pour une architecture d’automatisation en réseau. In Didier Lime and Olivier H. Roux, editors, Actes du 7ème colloque sur la modélisation des systèmes réactifs (MSR 2009) , volume 43 of Journal Européen des Systèmes Automatisés , pages 1049–1064. Hermès, November 2009.
- 3[ACEF 09] Étienne André, Thomas Chatain, Emmanuelle Encrenaz, and Laurent Fribourg. An inverse method for parametric timed automata. International Journal of Foundations of Computer Science , 20(5):819–836, October 2009.
- 4[ACN 15] Étienne André, Camille Coti, and Hoang Gia Nguyen. Enhanced distributed behavioral cartography of parametric timed automata. In Michael Butler, Sylvain Conchon, and Fatiha Zaïdi, editors, Proceedings of the 17th International Conference on Formal Engineering Methods (ICFEM 2015) , volume 9407 of Lecture Notes in Computer Science , pages 319–335. Springer, November 2015.
- 5[AFKS 12] Étienne André, Laurent Fribourg, Ulrich Kühne, and Romain Soulat. IMITATOR 2.5: A tool for analyzing robustness in scheduling problems. In Dimitra Giannakopoulou and Dominique Méry, editors, Proceedings of the 18th International Symposium on Formal Methods (FM 2012) , volume 7436 of Lecture Notes in Computer Science , pages 33–36. Springer, August 2012.
- 6[AHV 93] Rajeev Alur, Thomas A. Henzinger, and Moshe Y. Vardi. Parametric real-time reasoning. In S. Rao Kosaraju, David S. Johnson, and Alok Aggarwal, editors, Proceedings of the twenty-fifth annual ACM symposium on Theory of computing (STOC 1993) , pages 592–601, New York, NY, USA, 1993. ACM.
- 7[AHW 18] Étienne André, Ichiro Hasuo, and Masaki Waga. Offline timed pattern matching under uncertainty. In Anthony Widjaja Lin and Jun Sun, editors, Proceedings of the 23rd International Conference on Engineering of Complex Computer Systems (ICECCS 2018) . IEEE, 2018. To appear.
- 8[AL 17] Étienne André and Shang-Wei Lin. Learning-based compositional parameter synthesis for event-recording automata. In Ahmed Bouajjani and Silva Alexandra, editors, Proceedings of the 37th IFIP WG 6.1 International Conference on Formal Techniques for Distributed Objects, Components, and Systems (FORTE 2017) , volume 10321 of Lecture Notes in Computer Science , pages 17–32. Springer, 2017.
