WORLDREP (WORLD Relationship and Event Prediction) is a high-quality dataset designed for predicting future international events from textual information such as news articles. It labels the relationship between the two countries involved in each event with a numerical score ranging from 0.0 (cooperation) to 1.0 (conflict).
This dataset is introduced and detailed in the following paper:
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling
The WORLDREP dataset is publicly available on Hugging Face:
The dataset features the following columns:
| Column | Description |
|---|---|
| EventID | Unique identifier for the event |
| SourceURL | URL of the news article reporting the event |
| DATE | Publication date of the article in YYYYMMDDHHMMSS format |
| CONTENT | Content of the news article |
| Country1 | The first country involved in the event |
| Country2 | The second country involved in the event |
| Score | Numerical value (0.0-1.0) representing the relationship between countries. A score close to 0.0 indicates cooperation, while a score close to 1.0 indicates conflict. |
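As a rough illustration of how the `DATE` and `Score` columns might be handled, here is a minimal sketch using only the Python standard library. The helper names `parse_worldrep_date` and `label_score`, the 0.5 cutoff, and the sample row values are all assumptions made for this example; they are not part of the dataset or its official tooling.

```python
from datetime import datetime


def parse_worldrep_date(raw):
    """Parse a DATE value in YYYYMMDDHHMMSS format into a datetime.

    Accepts a string or an integer, since the column may load as either.
    """
    return datetime.strptime(str(raw), "%Y%m%d%H%M%S")


def label_score(score, threshold=0.5):
    """Map a Score in [0.0, 1.0] to a coarse label.

    The 0.5 threshold is an illustrative assumption, not a cutoff
    defined by the dataset.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"Score must be within [0.0, 1.0], got {score}")
    return "conflict" if score >= threshold else "cooperation"


# Hypothetical row with made-up values, shaped like the columns above.
row = {
    "EventID": "example-0001",
    "SourceURL": "https://example.com/article",
    "DATE": "20240115083000",
    "CONTENT": "Placeholder article text.",
    "Country1": "Country A",
    "Country2": "Country B",
    "Score": 0.1,
}

published = parse_worldrep_date(row["DATE"])
print(published.isoformat(), label_score(row["Score"]))
```

Since the score is continuous, a binary label discards information; for modeling, it is usually safer to keep the raw value and apply any thresholding only at evaluation time.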
The code for data preprocessing, analysis, and usage will be released soon.
If you use WORLDREP in your research, please cite the following paper:
@inproceedings{gwak2024worldrep,
title={Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling},
author={Daehoon Gwak and Junwoo Park and Minho Park and Chaehun Park and Hyunchan Lee and Edward Choi and Jaegul Choo},
booktitle={EMNLP Findings},
year={2024}
}
- Paper on arXiv
- Dataset on Hugging Face
If you encounter any issues or have feature requests, feel free to open an issue or a pull request.
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).