Extracting Entities and Topics from News and Connecting Criminal Records
Quang Pham, Marija Stanojevic, Zoran Obradovic

TL;DR
This paper applies data science techniques to extract entities and topics from criminal records and news articles, aiming to analyze crime data, identify patterns, and create dynamic crime graphs that give a better understanding of crime in the U.S.
Contribution
It introduces a methodology combining statistical and natural language processing methods to analyze large datasets of criminal records and news articles, enabling crime clustering and visualization.
Findings
Successful extraction of entities and topics from datasets
Effective clustering of crime violations by type
Creation of a dynamic crime graph over time
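As an illustration of the last finding, a graph that changes over time can be represented as one weighted edge list per time period. The sketch below is not the authors' implementation: the summary does not say how the graph is built, so the yearly snapshots, the co-occurrence edge weights, the `build_dynamic_graph` helper, and the sample records are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical input: (year, entities mentioned together in one record or article).
records = [
    (2015, ["Person A", "Org X", "Philadelphia"]),
    (2015, ["Person A", "Philadelphia"]),
    (2016, ["Person A", "Org X"]),
    (2016, ["Person B", "Org X", "Philadelphia"]),
]


def build_dynamic_graph(records):
    """Return {year: Counter({(entity_u, entity_v): co-occurrence count})}.

    Assumption: two entities are linked when they appear in the same record,
    and the edge weight is the number of such records in that year.
    """
    snapshots = defaultdict(Counter)
    for year, entities in records:
        # Sort and deduplicate so each undirected edge has one canonical key.
        for u, v in combinations(sorted(set(entities)), 2):
            snapshots[year][(u, v)] += 1
    return dict(snapshots)


graph = build_dynamic_graph(records)
for year in sorted(graph):
    # Print each yearly snapshot with its heaviest edges first.
    print(year, graph[year].most_common())
```

Comparing the snapshots year by year shows which links between people, organizations, and places appear, strengthen, or disappear. In practice the entities would come from an entity-extraction step rather than a hand-written list.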
Abstract
The goal of this paper is to summarize methodologies used in extracting entities and topics from a database of criminal records and from a database of newspapers. Statistical models have been used successfully to study the topics of roughly 300,000 New York Times articles and to analyze entities related to people, organizations, and places (D Newman, 2006). In addition, analytical approaches, especially hotspot mapping, have been used in some studies to predict future crime locations and circumstances, and those approaches have been tested quite successfully (S Chainey, 2008). Building on these two lines of work, this research was performed to apply data science techniques to analyzing a large amount of data, selecting valuable intelligence, clustering violations by type of crime,…
Taxonomy
Topics: Digital and Cyber Forensics · Crime Patterns and Interventions · Data Quality and Management
