Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling

Daehoon Gwak; Junwoo Park; Minho Park; Chaehun Park; Hyunchan Lee; Edward Choi; Jaegul Choo

arXiv:2411.14042·cs.CL·November 22, 2024

TL;DR

This paper introduces WORLDREP, a dataset for predicting international events from text such as news articles. Its scoring labels are generated with large language models and validated by domain experts in political science, with the aim of supporting research in geopolitics and policy decision-making.

Contribution

The paper presents WORLDREP, a dataset for text-based event prediction whose LLM-generated scoring labels are validated by political science experts, addressing the quality limitations of existing datasets for international event forecasting.

Findings

01 WORLDREP improves event prediction accuracy

02 The dataset is validated by domain experts

03 Open-source tools facilitate future research

Abstract

Predicting future international events from textual information, such as news articles, has tremendous potential for applications in global policy, strategic decision-making, and geopolitics. However, existing datasets available for this task are often limited in quality, hindering the progress of related research. In this paper, we introduce WORLDREP (WORLD Relationship and Event Prediction), a novel dataset designed to address these limitations by leveraging the advanced reasoning capabilities of large language models (LLMs). Our dataset features high-quality scoring labels generated through advanced prompt modeling and rigorously validated by domain experts in political science. We showcase the quality and utility of WORLDREP for real-world event prediction tasks, demonstrating its effectiveness through extensive experiments and analysis. Furthermore, we publicly release our dataset…

Code & Models

Repositories

eogns282/worldrep
Official

Datasets

Daehoon/WORLDREP
dataset · 25 dl

Taxonomy

Topics: Advanced Text Analysis Techniques