Predicting the Results of LTL Model Checking using Multiple Machine   Learning Algorithms

Weijun Zhu; Mingliang Xu; Jianwei Wang

arXiv:1901.07891·cs.LO·February 22, 2019

Predicting the Results of LTL Model Checking using Multiple Machine Learning Algorithms

Weijun Zhu, Mingliang Xu, Jianwei Wang

PDF

Open Access

TL;DR

This paper explores machine learning techniques like RF, KNN, DT, and LR to accurately predict LTL model checking results, significantly improving efficiency over traditional methods.

Contribution

It introduces a novel approach using multiple ML algorithms to predict LTL model checking outcomes with high accuracy and efficiency.

Findings

01

Random Forest achieved 97.9% accuracy

02

K-Nearest Neighbors achieved 98.2% accuracy

03

ML-based approaches vastly outperformed existing methods in computation efficiency

Abstract

In this paper, we study how to predict the results of LTL model checking using some machine learning algorithms. Some Kripke structures and LTL formulas and their model checking results are made up data set. The approaches based on the Random Forest (RF), K-Nearest Neighbors (KNN), Decision tree (DT), and Logistic Regression (LR) are used to training and prediction. The experiment results show that the predictive accuracy of the RF, KNN, DT and LR-based approaches are 97.9%, 98.2%, 97.1% and 98.2%, respectively, as well as the average computation efficiencies of the RF, KNN, DT and LR-based approaches are 7102500, 598, 4132364 and 5543415 times than that of the existing approach, respectively, if the length of each LTL formula is 500.

Figures10

Click any figure to enlarge with its caption.

Tables4

Table 1. Table 1: ML training and testing results in Graph Lab, where the length of each formula is 25

ML Algorithms	RF	KNN	DT	LR
Training record #	355	361	356	355
Testing record #	50	44	49	50
Running time per record (in second)	0.000031	0.001352	0.000044	0.000046
Prediction Accuracy	0.9800	0.9773	0.9796	0.9800
AUC	0.9991	0.5540	0.9706	0.9854
Seed #	1243	1150	102	691
Fraction	0.88	0.88	0.88	0.88

Table 2. Table 2: The improvement of the efficiency by our new methods, where the length of each formula is 25

ML algorithm	The average time for LTL model checking of one pair of Kripke structure and formula, i.e., $t_{1}$ (in Seconds)	The average time for predicting the result of model checking, i.e., $t_{2}$ (in Seconds)	$t_{2}$ / $t_{1}$	$t_{1}$ / $t_{2}$
RF	0.015	0.000031	0.2%	484
KNN	0.015	0.001352	9%	11
DT	0.015	0.000044	0.3%	341
LR	0.015	0.000046	0.3%	326

Table 3. Table 3: ML training and testing results in Graph Lab, where the length of each formula is 500

ML Algorithms	RF	KNN	DT	LR
Training record #	353	344	365	344
Testing record #	47	56	35	56
Running time per record (in second)	0.000032	0.380051	0.000055	0.000041
Prediction Accuracy	0.9787	0.9821	0.9714	0.9821
AUC	0.9850	0.1387	0.9583	0.9919
Seed #	858	429	732	2306
Fraction	0.88	0.86	0.90	0.86

Table 4. Table 4: The improvement of the efficiency by our new methods, where the length of each formula is 500

ML algorithm	The average time for LTL model checking of one pair of Kripke structure and formula, i.e., $t_{1}$ (in Seconds)	The average time for predicting the result of model checking, i.e., $t_{2}$ (in Seconds)	$t_{2}$ / $t_{1}$	$t_{1}$ / $t_{2}$
RF	227.28	0.000032	0.000014%	7102500
KNN	227.28	0.380051	0.2%	598
DT	227.28	0.000055	0.000024%	4132364
LR	227.28	0.000041	0.000018%	5543415

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAnomaly Detection Techniques and Applications · Software Testing and Debugging Techniques · Machine Learning and Data Classification

Full text

Predicting the Results of LTL Model Checking using Multiple Machine Learning Algorithms

Weijun ZHU, Mingliang XU and Jianwei WANG

School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China

Abstract: In this paper, we study how to predict the results of LTL model checking using some machine learning algorithms. Some Kripke structures and LTL formulas and their model checking results are made up data set. The approaches based on the Random Forest (RF), K-Nearest Neighbors (KNN), Decision tree (DT), and Logistic Regression (LR) are used to training and prediction. The experiment results show that the predictive accuracy of the RF, KNN, DT and LR-based approaches are 97.9%, 98.2%, 97.1% and 98.2%, respectively, as well as the average computation efficiencies of the RF, KNN, DT and LR-based approaches are 7102500, 598, 4132364 and 5543415 times than that of the existing approach, respectively, if the length of each LTL formula is 500.

Keywords: Machine Learning; Model Checking; Linear Temporal Logic; Kripke Structure

1 Introduction

Model checking is a kind of formal verifying technique, which was presented by Turing Award winner Prof. Clarke et al[1]. This technique is widely used in CPU design[2], security protocols [3] and malware detection[4] by some leading IT companies, including INTEL and IBM[5]. And the state explosion problem (In a special case, $10^{120}$ states were verified automatically by the symbolic model checker[6].) has been widely concerned. In our previous work, we used the BT algorithm to predict the results of LTL model checking[7]. Can other machine learning algorithms be used to predict the results of LTL model checking effectively? This is our research issue.

2 The principle of the new method

The principle of the new method can be described as Fig. 1.

3 Simulated experiments

3.1 The simulation platform

(1)CPU: Intel(R) Core(TM) i7-4790 CPU @3.60GHz.

(2)RAM: 8.0G RAM.

(3)OS: Windows 10.

(4)NuSMV[8]: for performing LTL model checking.

(5)Graph Lab[9]: for implementing the RF[10][11], DT[12][13], KNN[14][15] and LR[16][17] algorithm.

3.2 Experimental Procedures

The experimental steps in this paper are consistent with the experimental steps in our previous work, and will not be described repeatedly here. For details, please refer to [7].

3.3 Experimental results

3.3.1 NuSMV experiments

The data sets obtained in this section are raw data for Grap Lab experiments in the next section, where the length of each formula is 25 or 500. The NuSMV experiments in this study is the same as that in our previous work. For details, please refer to [7].

3.3.2 Experiments via Graph Lab

In the NuSMV experiments, we obtained 405 and 400 records of model checking experimental results via NuSMV, where the length of each formula is 25 and 500, respectively. The two data sets are then exported to Graph Lab to conduct the training and prediction using one of the machine learning algorithms based on RF, KNN, DT, and LR, respectively.

The optimal results are summarized in Fig. 2 for the four machine learning algorithms, i.e., RF, DT, KNN and LR, where the length of each formula is 25. Table 1 shows the value of parameters and some performance when these optimal results occur. Furthermore, the results also indicate that the efficiency of the approaches based on the RF, KNN, DT, LR increase 484, 11, 341, 326 times respectively, compared with the traditional NuSMV model checking approach, as shown in table 2.

The optimal results are summarized in Fig. 3 for the four machine learning algorithms, i.e., RF, DT, KNN and LR, where the length of each formula is 500. Table 3 shows the value of parameters and some performance when these optimal results occur. Furthermore, the results also indicate that the efficiency of the approaches based on the RF, KNN, DT, LR increase 7102500, 598, 4132364, 5543415 times respectively, compared with the traditional NuSMV model checking approach, as shown in table 4.

4 Conclusions

In this study, we propose a method based on four machine learning algorithms, for predicting the results of LTL model checking. The results show that the longer the formulas involved in the model checking become, the more obvious the comparative advantage of the new method is.

Acknowledgements

This work has been supported by the National Natural Science Foundation of China (No.U1204608).

Bibliography17

The reference list from the paper itself. Each links out to its DOI / PubMed record.

1[1] Clarke E, et a 1. Model Checking. Massachusetts: MIT Press, 1999.
2[2] Barnat J, Bauch P, Brim L, et a 1. Designing fast LTL model checking algorithms for many-core GP Us. Journal of Paralleland Distributed Computing, 2012, 72(9): 1083-1097.
3[3] Carbone R. LTL model-checking for security protocols. AI Communications, 2011, 24(4): 281-283.
4[4] C Song F, Touili T. Model-Checking for Android Malware Detection, Programming Languages and Systems. Springer International Publishing, 2014:216-235.
5[5] Clarke E. Grand challenge problem: Model check concurrent software. http://www.csl.sri.com/users/shankar/VGC 05/Clarke.ppt, 2018.
6[6] Burch J R, Clarke E M, Long D E, et al. Symbolic Model Checking for Sequential Circuit Verification. IEEE Trans. Comput. -Aided Des. Integrated Circuits & Syst. 1994, 13(4):401-424.
7[7] Zhu W, Ban S, Fan Y. Approximate LTL model checking. ar Xiv preprint ar Xiv:1805.11765, 2018.
8[8] Nu SMV home page. Nu SMV: a new symbolic model checker. Available: http://nusmv.fbk.eu/, 2018.