This repository has been archived by the owner on Sep 3, 2023. It is now read-only.

draft for data statement docs for #64
Andrada Pumnea committed Mar 8, 2020
1 parent d843253 commit f152812
Showing 1 changed file with 30 additions and 0 deletions.
30 changes: 30 additions & 0 deletions data/data_statement.md
@@ -0,0 +1,30 @@
We are currently relying on 3 datasets for our research and modeling efforts:

1. Waseem, Zeerak, and Dirk Hovy. "Hateful symbols or hateful people? Predictive features for hate speech detection on
Twitter." Proceedings of the NAACL Student Research Workshop. 2016. (See the
[Hate speech](#hate-speech) section.)

2. Anzovino, Maria, Elisabetta Fersini, and Paolo Rosso. "Automatic identification and classification of misogynistic
language on Twitter." International Conference on Applications of Natural Language to Information Systems.
Springer, Cham, 2018. (See the [Automatic Misogyny Identification](#automatic-misogyny-identification) section.)

3. A dataset that we collected and labeled ourselves. See the [Our Annotations](#our-annotations) section for a full
description of our process.


These 3 datasets are combined into what we call the **gold dataset**.
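
As a rough illustration of what "combined" could mean in practice, here is a minimal sketch that concatenates labeled
records from several sources and tags each row with its origin. The function name `combine_datasets`, the `text`,
`label`, and `source` fields, the dataset keys, and the toy rows are all assumptions made for this example; they do not
reflect the actual schema or merging code of the project.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]


def combine_datasets(sources: Dict[str, Iterable[Record]]) -> List[Record]:
    """Concatenate labeled records, tagging each row with its source dataset."""
    combined: List[Record] = []
    for source_name, records in sources.items():
        for record in records:
            row = dict(record)  # copy so the input rows are not mutated
            row["source"] = source_name  # hypothetical provenance field
            combined.append(row)
    return combined


# Toy placeholder rows (hypothetical), not real data from any of the datasets.
gold_dataset = combine_datasets({
    "waseem_hovy_2016": [{"text": "placeholder tweet A", "label": "0"}],
    "anzovino_2018": [{"text": "placeholder tweet B", "label": "1"}],
    "our_annotations": [{"text": "placeholder tweet C", "label": "0"}],
})
```

Keeping a provenance field like `source` would make it possible to trace every row of the gold dataset back to the
data statement that describes how it was collected and labeled.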

The next 3 sections describe how each dataset was collected and labeled, in the form of data statements
([Bender, Emily M., and Batya Friedman](https://www.aclweb.org/anthology/Q18-1041/)).

# Hate speech
to-do

# Automatic Misogyny Identification
to-do

# Our Annotations
to-do


