A Statistical Analysis of Recent Traffic Crashes in Massachusetts
Aaron Zhang, Evan W. Patton, Justin M. Swaney, and Tingying Helen Zeng

TL;DR
This study uses statistical methods and machine learning to analyze Massachusetts traffic crash data, identifying key factors like weather, driver age, and intersection danger that increase injury risk.
Contribution
It introduces a binary classification model using logistic regression with feature selection to identify injury-causing crash circumstances.
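As a rough illustration of this kind of model, the sketch below fits a logistic regression classifier with and without recursive feature elimination, using scikit-learn. It is not the authors' code: the data is synthetic (generated with `make_classification`), and the feature counts, the number of features kept, and the train/test split are assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for encoded crash features (assumption: not MassDOT data).
# y plays the role of the binary label: 1 = injury crash, 0 = no injury.
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=4, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Baseline: logistic regression on all features.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Feature selection: RFE repeatedly refits the model and drops the weakest
# feature until only n_features_to_select remain (5 is an arbitrary choice here).
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
selector.fit(X_train, y_train)

acc_full = baseline.score(X_test, y_test)
acc_rfe = selector.score(X_test, y_test)  # scores the model refit on kept features

print(f"accuracy with all features: {acc_full:.3f}")
print(f"accuracy with RFE features: {acc_rfe:.3f}")
print("kept feature indices:", [i for i, keep in enumerate(selector.support_) if keep])
```

Comparing the two scores shows whether the reduced feature set loses predictive power; the kept indices point to the circumstances the model relies on most.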
Findings
Weather and road conditions significantly increase injury risk.
Senior and teen drivers are more involved in injury crashes.
Dangerous intersections are major injury risk factors.
Abstract
We performed a statistical analysis in Python on the available MassDOT car accident data to determine whether particular sets of traffic circumstances increase the likelihood of injury. We built a binary classifier to separate crashes that resulted in injury from those that did not. To do so, we first cleaned the raw data, then represented categorical variables numerically through one-hot encoding, and finally fit logistic regression models both with and without Recursive Feature Elimination (RFE). Such analysis matters because the modern road network presents many challenges, one of the most critical being how to ensure the safety of all drivers and passengers. Findings from our analysis identify…
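The cleaning and one-hot encoding steps described above can be sketched with pandas as follows. This is a minimal example under stated assumptions: the column names (`weather`, `road_surface`, `driver_age`, `injury`) and the five records are hypothetical and do not reflect the actual MassDOT schema, and dropping incomplete rows is only one possible cleaning strategy.

```python
import pandas as pd

# Hypothetical crash records (illustrative only, not the MassDOT schema).
crashes = pd.DataFrame({
    "weather": ["Clear", "Rain", None, "Snow", "Clear"],
    "road_surface": ["Dry", "Wet", "Wet", "Ice", "Dry"],
    "driver_age": [34, 17, 72, 45, 29],
    "injury": [0, 1, 1, 1, 0],  # 1 = crash resulted in injury
})

# Cleaning (assumed strategy): drop records with any missing field.
clean = crashes.dropna().reset_index(drop=True)

# One-hot encoding: each category becomes its own 0/1 indicator column,
# so a linear model does not treat categories as ordered numbers.
features = pd.get_dummies(
    clean.drop(columns="injury"),
    columns=["weather", "road_surface"],
    dtype=int,
)
target = clean["injury"]

print(features.columns.tolist())
print(features.shape)
```

The resulting `features` and `target` can be passed directly to a classifier such as logistic regression.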
Taxonomy
Topics: Traffic Prediction and Management Techniques
