Presidential data & analysis

The scrapers are all located in presidents/scraping/*.py.

See Sources for documentation on where and how the data was collected.
The substance of the analysis exists as Jupyter notebooks in notebooks/.

Installation

These instructions assume you already have Python 3 installed.

The scrapers rely on the requests, BeautifulSoup4, and python-dateutil libraries, among others; the Jupyter notebooks require many more.

You'll need to install a few non-Python dependencies on your system:

On macOS with Homebrew:

brew install cairo py3cairo jpeg jpeg-turbo node

On CentOS 7:

yum -y groupinstall development
yum -y install cairo-devel cairo-tools
yum -y install libjpeg-turbo libjpeg-turbo-devel
yum -y install nodejs nodejs-devel
yum -y install https://downloads.sourceforge.net/project/mscorefonts2/rpms/msttcore-fonts-installer-2.6-1.noarch.rpm
fc-cache /usr/share/fonts/msttcore

Once you have Node.js and npm installed, install Vega and Vega-Lite globally:

npm install -g vega vega-lite
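If the npm step fails, it can help to confirm that Node.js and npm are actually visible on your PATH. The following is a minimal sketch, not part of the repository; `missing_executables` is a hypothetical helper name introduced only for illustration.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "node" and "npm" are the executables the npm step depends on.
    missing = missing_executables(["node", "npm"])
    print("missing executables:", missing or "none")
```

An empty result means both executables were found; anything listed needs to be installed or added to PATH first.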

Python dependencies and environment

It's recommended to sandbox everything in a virtualenv, which makes installation more predictable and reliable:

sudo pip install -U virtualenv
virtualenv ~/presidents-venv
source ~/presidents-venv/bin/activate
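To double-check that the activation took effect, you can ask the interpreter itself. This is a small sketch under the assumption that you run it with the `python` on your current PATH; `in_virtualenv` is a hypothetical helper, not something the repository provides.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # The venv module and newer virtualenv releases make sys.prefix
    # differ from sys.base_prefix instead.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

If the activation worked, the printed prefix should point into `~/presidents-venv`.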

Now that you've activated the virtualenv, get the code:

git clone https://github.com/chbrown/presidents ~/presidents
cd ~/presidents

To install the required Python libraries:

pip install -r requirements.txt

To install the language model for spaCy:

python -m spacy download en_core_web_md
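After the two install steps, a quick import check can catch a partial installation before you open a notebook. The sketch below is an illustration only: `missing_modules` is a hypothetical helper, and the list covers just the libraries named in this README (by their import names), not everything in requirements.txt.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for requests, BeautifulSoup4, python-dateutil and spaCy,
    # plus the package created by the spaCy model download.
    wanted = ["requests", "bs4", "dateutil", "spacy", "en_core_web_md"]
    print("missing modules:", missing_modules(wanted) or "none")
```

`find_spec` only locates the modules without importing them, so the check stays fast even for large packages such as spaCy.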

To prepare the data so that the notebooks can read it:

make data/tapp/all.local-cache.json

Get the LIWC 2007 dictionary (see liwc-python for details) and move it to:

/usr/local/data/liwc_2007.dic
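A missing or empty dictionary file is easy to overlook until a notebook fails, so a small check like this may be useful. It is a sketch only: `dictionary_in_place` is a hypothetical helper, and it verifies that the file exists and is non-empty, not that its contents are a valid LIWC dictionary.

```python
from pathlib import Path

# Location given in this README.
LIWC_PATH = Path("/usr/local/data/liwc_2007.dic")


def dictionary_in_place(path=LIWC_PATH):
    """Return True if path is an existing, non-empty regular file."""
    path = Path(path)
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    print("LIWC dictionary found:", dictionary_in_place())
```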

To start a Jupyter notebook server:

export PYTHONPATH=~/presidents
jupyter notebook notebooks/
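If the notebook server was started without PYTHONPATH set, a similar effect can be had from inside a notebook cell. This is a sketch assuming the repository was cloned to `~/presidents` as described earlier; `add_to_sys_path` is a hypothetical helper. Unlike the environment variable, it only affects the interpreter in which it runs.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Put directory at the front of sys.path unless it is already listed.

    Returns the resolved absolute path as a string.
    """
    resolved = str(Path(directory).expanduser().resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


if __name__ == "__main__":
    # Assumed clone location from the instructions in this README.
    print("added:", add_to_sys_path("~/presidents"))
```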
