Publish data

If your data is already online

If you have a link to published data, that's great! You can submit it to our database where it will show up in searches. We'll trigger automatic archives, and people who subscribe to data in the area will be notified.

If you need to publish data first

If you have public records or data you want to share more broadly, there are a few ways to go! Make sure to share your published data when you're done.

Options for sharing data

{% hint style="warning" %} If this is overwhelming, don't worry—every case is different, and we can provide advice. Email [email protected] for a free consultation. {% endhint %}

  • GitHub is a free platform intended for storing open-source code. It can be a great choice for sharing a small dataset, especially if it comes with a Python notebook of analysis or the web scraper that generated the data. Here is a demo of an automated web scraper that collects Oakland's Calls for Service data at no cost.
  • JKAN (for smaller agencies) and CKAN (for larger ones) are two open-source data portals which need to be set up by someone with some technical skills.
  • DocumentCloud can be used for publishing PDFs, and has additional tools for recognizing text (OCR) and annotation. You will need to request verification to publish documents.
  • Your town or region may have an open data portal (here's an example from the Pittsburgh area) whose maintainers may be interested in publishing the data you have.
  • A service like Dropbox can be used to share folders of documents in bulk, though it offers fewer tools for keeping them organized or labeled.

Making published data accessible

Sometimes published data is not really accessible: maybe the data is only rendered as a visualization on a dashboard, or you want to calculate a sum or average value from a 250-page HTML table. That's where web scraping comes in.
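As a rough illustration of the HTML table case, here is a minimal sketch that reads a table and computes a sum and average for one column, using only Python's standard library. The sample table, the `TableParser` class, and the `column_total_and_mean` function are all hypothetical names made up for this example. A real scraper would also need to download the pages and handle pagination, and many people would reach for a dedicated scraping library instead.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left open by malformed HTML


def column_total_and_mean(html, column):
    """Return (sum, mean) of the numeric cells in one column (0-indexed)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    values = []
    for row in parser.rows:
        if column >= len(row):
            continue
        try:
            # Strip thousands separators such as "1,200" before converting.
            values.append(float(row[column].replace(",", "")))
        except ValueError:
            continue  # header row or other non-numeric cell
    total = sum(values)
    mean = total / len(values) if values else None
    return total, mean


# Made-up sample data standing in for one page of a published table.
SAMPLE_HTML = """
<table>
  <tr><th>District</th><th>Calls</th></tr>
  <tr><td>North</td><td>1,200</td></tr>
  <tr><td>South</td><td>800</td></tr>
  <tr><td>East</td><td>1,000</td></tr>
</table>
"""

total, mean = column_total_and_mean(SAMPLE_HTML, column=1)
print(total, mean)  # prints "3000.0 1000.0"
```

For a table spread across many pages, the same function could be called once per page and the per-page values combined, assuming every page uses the same column layout.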