Machine Learning Classification of Peaceful Countries: A Comparative Analysis and Dataset Optimization
K. Lian (1), L. S. Liebovitch (1), M. Wild (1), H. West (1), P. T. Coleman (1), F. Chen (2), E. Kimani (2), K. Sieck (2) ((1) Columbia University, (2) Toyota Research Institute)

TL;DR
This paper develops a machine learning model using media-derived linguistic features to classify countries as peaceful or non-peaceful, analyzing how dataset size affects accuracy in peace studies.
Contribution
It introduces a novel approach combining vector embeddings and cosine similarity for peace classification and examines dataset size effects on model performance.
Findings
The model effectively classifies countries as peaceful or non-peaceful.
Dataset size significantly impacts classification accuracy.
Large-scale text data presents both challenges and opportunities for peace research.
Abstract
This paper presents a machine learning approach to classify countries as peaceful or non-peaceful using linguistic patterns extracted from global media articles. We employ vector embeddings and cosine similarity to develop a supervised classification model that effectively identifies peaceful countries. Additionally, we explore the impact of dataset size on model performance, investigating how shrinking the dataset influences classification accuracy. Our results highlight the challenges and opportunities associated with using large-scale text data for peace studies.
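The abstract names vector embeddings, cosine similarity, and dataset shrinking but does not spell out the pipeline. The following is a minimal sketch of one way these pieces could fit together, not the authors' implementation. It assumes each article has already been turned into an embedding vector, and it assumes a nearest-centroid rule: a new vector gets the label of the class centroid it is most cosine-similar to. The function names (`fit_centroids`, `classify`, `accuracy_at_fraction`) and the per-class random subsampling are hypothetical choices made for illustration.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def fit_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per label."""
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}


def classify(embedding, centroids):
    """Return the label whose centroid is most cosine-similar to the embedding."""
    return max(centroids, key=lambda label: cosine_similarity(embedding, centroids[label]))


def accuracy_at_fraction(X_train, y_train, X_test, y_test, fraction, seed=0):
    """Train on a random per-class subsample of the data and report test accuracy.

    Assumption: shrinking is done by keeping `fraction` of each class
    (at least one example per class), chosen uniformly at random.
    """
    rng = np.random.default_rng(seed)
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    keep = []
    for label in np.unique(y_train):
        idx = np.flatnonzero(y_train == label)
        n = max(1, int(round(fraction * len(idx))))
        keep.extend(rng.choice(idx, size=n, replace=False))
    keep = np.array(keep)
    centroids = fit_centroids(X_train[keep], y_train[keep])
    preds = np.array([classify(x, centroids) for x in np.asarray(X_test, dtype=float)])
    return float(np.mean(preds == np.asarray(y_test)))
```

Calling `accuracy_at_fraction` over a range of fractions (for example 1.0, 0.5, 0.1) would trace how accuracy changes as the training set shrinks, which is the kind of comparison the abstract describes. A real study would replace the toy vectors with embeddings from an actual text-embedding model and would likely average over several random seeds.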
Taxonomy
Topics: Politics and Conflicts in Afghanistan, Pakistan, and Middle East · Environmental and Biological Research in Conflict Zones · Terrorism, Counterterrorism, and Political Violence
