This repo provides the dataset introduced in our EMNLP 2023 paper "NormDial: A Comparable Bilingual Synthetic Dialogue Dataset for Modeling Social Norm Adherence and Violation" (https://aclanthology.org/2023.emnlp-main.974.pdf).

The repo contains three folders:
- Normdial_code: code base for generating dialogues and labels
- NormDial_raw_dataset: the raw dialogue outputs from the NormDial pipeline
- NormDial_clean_dataset: a manually cleaned version of NormDial_raw_dataset (ready for use). Each category folder contains a cleaned_dialogue.csv file with the cleaned dataset
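
A minimal sketch of reading the cleaned data with the Python standard library, assuming the layout described above (one cleaned_dialogue.csv directly under each category folder of NormDial_clean_dataset). The helper name `load_cleaned_dialogues` is hypothetical and not part of this repo, and the UTF-8 encoding is an assumption. Column names are read from each file's header row rather than assumed.

```python
import csv
from pathlib import Path


def load_cleaned_dialogues(root="NormDial_clean_dataset"):
    """Return {category_folder_name: [row_dict, ...]} for each cleaned_dialogue.csv.

    Hypothetical helper: assumes the layout <root>/<category>/cleaned_dialogue.csv.
    """
    tables = {}
    for csv_path in sorted(Path(root).glob("*/cleaned_dialogue.csv")):
        # "utf-8-sig" reads plain UTF-8 and also strips a leading BOM if present
        # (the encoding itself is an assumption about the files).
        with csv_path.open(newline="", encoding="utf-8-sig") as f:
            # DictReader takes the column names from the file's header row.
            tables[csv_path.parent.name] = list(csv.DictReader(f))
    return tables
```

Calling `load_cleaned_dialogues()` from the repo root should give one entry per category folder; inspecting `rows[0].keys()` for any entry shows the columns actually present in that file.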