This repository contains the Open States data model and scraper backend.
See RELEASE.md
openstates-core data models may occasionally change, requiring changes to the Open States database in production. Here are the steps to create and execute a new migration:
- Follow the steps in the Running a Local Database doc to get a copy of the database running locally with the prior schema. It is important to develop and test your migration locally before executing it on the production DB.
- In most cases, you will auto-generate a migration by first modifying the model file. Make a change to one or more of the files in openstates/data/models.
- Test the migration locally:
  - Identify the DB connection URL that is accurate to your local database. In most cases, it should be: postgis://openstates:openstates@localhost:5405/openstatesorg
  - In the repo's root folder, run the os-dbmakemigrations command to generate a migration file based on your changes: DATABASE_URL=postgis://openstates:openstates@localhost:5405/openstatesorg poetry run os-dbmakemigrations
  - Look at the generated migration file in openstates/data/migrations and ensure that it looks correct.
  - Execute the migration by running the os-initdb command: DATABASE_URL=postgis://openstates:openstates@localhost:5405/openstatesorg poetry run os-initdb
- Once the migration is verified by local testing, you can execute it against the production DB:
  - Identify the DB connection URL that is accurate for the PROD database. This typically should use the same Postgres user that owns the tables you want to change. Contact an admin if you need help.
  - Run the os-initdb command to migrate: DATABASE_URL=postgis://USERNAME_HERE:PASSWORD_HERE@PROD_DB_HOSTNAME_HERE:5432/openstatesorg poetry run os-initdb
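The local part of the workflow above can be wrapped in a small helper so that the connection URL is set once and reused for both commands. This is a minimal sketch, not something shipped with openstates-core: the names migration_commands and run_local_migration are hypothetical, and it assumes poetry is on your PATH and that you run it from the repo's root folder. It defaults to a dry run that only prints the commands.

```python
import os
import subprocess

# Default local connection URL, taken from the steps above.
LOCAL_DATABASE_URL = "postgis://openstates:openstates@localhost:5405/openstatesorg"


def migration_commands():
    """Return the two commands from the steps above, in the order they are run."""
    return [
        ["poetry", "run", "os-dbmakemigrations"],  # generate the migration file
        ["poetry", "run", "os-initdb"],  # apply the migration
    ]


def run_local_migration(database_url=LOCAL_DATABASE_URL, dry_run=True):
    """Hypothetical helper: run both commands with DATABASE_URL set.

    With dry_run=True (the default) nothing is executed; the commands are
    only printed so they can be reviewed first.
    """
    env = dict(os.environ, DATABASE_URL=database_url)
    handled = []
    for cmd in migration_commands():
        print(f"DATABASE_URL={database_url} {' '.join(cmd)}")
        if not dry_run:
            subprocess.run(cmd, env=env, check=True)
        handled.append(cmd)
    return handled


run_local_migration()  # dry run: prints the two commands without executing them
```

In practice you would likely run the two commands separately, so that you can review the generated file in openstates/data/migrations before applying it.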
- Install pyenv and the correct Python version
- Install poetry
- Run poetry install
Example PyCharm config (for relationships CLI command):
- Interpreter: the poetry env that you just set up
- Module: openstates.cli.relationships
- Parameters: --log_level=DEBUG us
- Env vars: DATABASE_URL=postgres://USERNAME:PASSWORD@DB_HOSTNAME:PORT/openstatesorg
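Outside of PyCharm, a "Module" run config like this corresponds roughly to running python -m with the same parameters and environment. The sketch below only builds that invocation. build_invocation is a hypothetical helper name, the credentials are the placeholders from the config above, and it assumes the interpreter running it is the poetry env that has openstates installed.

```python
import os
import sys


def build_invocation(module, params, database_url):
    """Return (argv, env) roughly equivalent to a PyCharm 'Module' run config."""
    argv = [sys.executable, "-m", module, *params]
    env = dict(os.environ, DATABASE_URL=database_url)
    return argv, env


argv, env = build_invocation(
    module="openstates.cli.relationships",
    params=["--log_level=DEBUG", "us"],
    # Placeholder values from the config above; substitute your own.
    database_url="postgres://USERNAME:PASSWORD@DB_HOSTNAME:PORT/openstatesorg",
)
print(" ".join(argv[1:]))  # -m openstates.cli.relationships --log_level=DEBUG us
# To actually launch it: import subprocess; subprocess.run(argv, env=env, check=True)
```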
There are instructions on running a scraper here, but what if you want to debug the command that wraps around the scraper code? This repository's update CLI module is the code that accepts parameters and actually executes the scraper (from openstates-scrapers). Another wrinkle is that the update command needs to be run inside the context of the openstates-scrapers repository: it attempts to load the relevant scraper for the requested jurisdiction, and that import will fail if you try to run the code here within openstates-core.
Here's a recipe using PyCharm to successfully debug the update command:
- You need the gdal library installed on the host system. For me: sudo apt install gdal-bin python3-gdal
- openstates-core checked out at /home/username/repo/openstates/openstates-core/
  - (let's assume you have made some changes in openstates-core that you want to test)
- openstates-scrapers checked out at /home/username/repo/openstates/openstates-scrapers/
- Change directory to /home/username/repo/openstates/openstates-scrapers/
- Install the required Python version using the pyenv utility
- pip install poetry (if that Python version doesn't already have it)
- If you have previously installed the openstates dependency (eg openstates-core), then you need to run poetry remove openstates to clear that remotely-installed (from pypi) dependency. Each time you make a round of changes to openstates-core, you will need to remove and then re-add the dependency.
- poetry add ../openstates-core/ will add the openstates dependency from your local filesystem/local checkout
- poetry install
- In PyCharm, open the openstates-scrapers folder
- In PyCharm, set up a new run config:
  - type: python
  - module: openstates.cli.update
  - parameters: vi bills (or whatever you want to run)
  - working directory: /home/username/repo/openstates/openstates-scrapers/scrapers
- Run the run config in debug mode within PyCharm (eg the one working on openstates-scrapers). You can set breakpoints within both the scraper code AND in the openstates-core code. However, you need to open (in PyCharm) the copy of openstates-core that poetry installed, which probably is in a location like: /home/username/.cache/pypoetry/virtualenvs/openstates-scrapers-93BMrPXy-py3.9/lib/python3.9/site-packages/openstates/cli/update.py
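Rather than guessing where poetry put the installed copy, you can ask Python which file it would load for a given module. This is a minimal sketch using the standard library's importlib.util.find_spec; locate_module is a hypothetical helper name. Note that resolving a dotted name imports its parent packages, so run it with the openstates-scrapers poetry env's interpreter.

```python
import importlib.util


def locate_module(name):
    """Return the file path Python would load for `name`, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is not installed.
        return None
    return spec.origin if spec is not None else None


# Prints the path of the installed update module, or None if openstates
# is not installed in the current environment.
print(locate_module("openstates.cli.update"))
```

The printed path is the file to open in PyCharm when setting breakpoints in the openstates-core code.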
Reminder: when you make further changes to the openstates-core package locally and want to debug them again, you need to remove and re-add the dependency to update the files in openstates-scrapers:
- Change directory to /home/username/repo/openstates/openstates-scrapers/
- poetry remove openstates
- poetry add ../openstates-core/
- (you will need to re-establish breakpoints in any openstates-core files)
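Since this remove/re-add cycle is repeated after every round of changes, it may be convenient to script it. The sketch below is one possible way to do that and is not part of either repository: refresh_commands and refresh_local_core are hypothetical names, the paths are the example paths used above, and it assumes poetry is on your PATH. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Example checkout location from the recipe above; adjust to your machine.
SCRAPERS_DIR = "/home/username/repo/openstates/openstates-scrapers/"


def refresh_commands(core_path="../openstates-core/"):
    """Return the remove/re-add commands from the reminder above."""
    return [
        ["poetry", "remove", "openstates"],
        ["poetry", "add", core_path],
    ]


def refresh_local_core(scrapers_dir=SCRAPERS_DIR, dry_run=True):
    """Hypothetical helper: re-install the local openstates-core checkout."""
    for cmd in refresh_commands():
        print(f"(cd {scrapers_dir} && {' '.join(cmd)})")
        if not dry_run:
            # cwd makes poetry operate on the openstates-scrapers project.
            subprocess.run(cmd, cwd=scrapers_dir, check=True)


refresh_local_core()  # dry run: prints the two commands without executing them
```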