Crime prediction through urban metrics and statistical learning
Luiz G A Alves, Haroldo V Ribeiro, Francisco A Rodrigues

TL;DR
This paper uses a random forest model to predict crime from urban metrics, reaching up to 97% accuracy and identifying unemployment and illiteracy as the indicators with the strongest influence on homicides in Brazilian cities.
Contribution
It introduces a machine learning approach that robustly ranks urban indicators' influence on crime, overcoming issues of multicollinearity and non-Gaussian distributions in data.
Findings
Random forest achieves up to 97% accuracy in crime prediction.
Unemployment and illiteracy are the most influential urban indicators for homicides.
Indicators are ranked and clustered, showing robustness under data variations.
Abstract
Understanding the causes of crime is a longstanding issue on researchers' agenda. While it is hard to extract causality from data, several linear models have been proposed to predict crime through the existing correlations between crime and urban metrics. However, because of non-Gaussian distributions and multicollinearity in urban indicators, it is common to find controversial conclusions about the influence of some urban indicators on crime. Ensemble-based machine learning algorithms can handle such problems well. Here, we use a random forest regressor to predict crime and quantify the influence of urban indicators on homicides. Our approach can reach up to 97% accuracy on crime prediction, and the importance of urban indicators is ranked and clustered in groups of equal influence, which are robust under slight changes in the data sample analyzed. Our results determine the…
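The workflow the abstract describes (fit a random forest regressor, score its predictions, rank the indicators by importance) can be illustrated with a minimal sketch. This is not the authors' code: the library (scikit-learn), the hyperparameters, the R² score used as a stand-in for "accuracy", and the synthetic data with its column names are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the paper's dataset): 500 hypothetical cities.
rng = np.random.default_rng(0)
n_cities = 500

# Log-normal draws mimic the non-Gaussian urban indicators the abstract mentions.
unemployment = rng.lognormal(mean=0.0, sigma=1.0, size=n_cities)
# Illiteracy is built to correlate with unemployment (multicollinearity).
illiteracy = 0.8 * unemployment + rng.lognormal(mean=0.0, sigma=0.5, size=n_cities)
# An indicator with no relation to the target, used as a sanity check.
noise_indicator = rng.lognormal(mean=0.0, sigma=1.0, size=n_cities)

feature_names = ["unemployment", "illiteracy", "noise_indicator"]
X = np.column_stack([unemployment, illiteracy, noise_indicator])

# Hypothetical target: homicides driven by the first two indicators only.
homicides = 2.0 * unemployment + 1.0 * illiteracy + rng.normal(0.0, 0.1, n_cities)

X_train, X_test, y_train, y_test = train_test_split(
    X, homicides, test_size=0.25, random_state=0
)

# Hyperparameters are illustrative assumptions, not values from the paper.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# R^2 on held-out cities (an assumed stand-in for the paper's accuracy measure).
r2 = model.score(X_test, y_test)

# Rank indicators by impurity-based importance, most influential first.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)

print(f"held-out R^2: {r2:.3f}")
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importances can split credit between correlated indicators, so the relative order of `unemployment` and `illiteracy` in this toy setup should not be over-interpreted. Refitting on resampled subsets of the cities, or using `sklearn.inspection.permutation_importance`, is one way to check how stable a ranking is.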
