- Fill in missing data via imputation
- Train and compare models based on your data
- Save a model to produce daily or batch predictions
- Write daily or batch predictions back to a database or csv file
- Learn which factors drive each prediction
- If you haven't already, install 64-bit Python 3.6 via the Anaconda distribution
- Open the terminal (i.e., CMD or PowerShell, if using Windows)
- Optional: If you intend to work with MSSQL databases, run
conda install pyodbc
- Upgrade to latest scipy
- Run
conda remove scipy
- Run
conda install scipy
- Run
conda install scikit-learn
- Install healthcareai using one, and only one, of the methods below (ordered from easiest to hardest).
- Recommended: Install the latest release with pip. Run
pip install healthcareai
- If you know what you're doing and instead want the bleeding-edge version direct from our GitHub repo, run
pip install https://github.com/HealthCatalyst/healthcareai-py/zipball/master
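If one of the install steps above fails, it can help to confirm which interpreter your terminal is actually using. The following is a minimal sketch using only the standard library; the helper name `python_environment` is hypothetical and is not part of healthcareai.

```python
import struct
import sys


def python_environment():
    """Return the interpreter version and whether it is a 64-bit build."""
    return {
        "version": tuple(sys.version_info[:3]),
        # A pointer ("P") is 8 bytes wide on a 64-bit build of Python.
        "is_64bit": struct.calcsize("P") * 8 == 64,
    }


env = python_environment()
print("Python version:", ".".join(str(part) for part in env["version"]))
print("64-bit build:", env["is_64bit"])
```

If the version or bitness is not what you expect, your terminal is probably picking up a different Python than the one Anaconda installed.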
We recommend using the Anaconda Python distribution when working on Windows. There are a number of reasons:
- When running Anaconda and installing packages using the conda command, you don't need to worry about dependency hell, particularly because packages aren't compiled on your machine; conda installs pre-compiled binaries.
- A great example of the pain that using conda saves you is the Python package scipy, which, by their own admission, "is difficult".
You may need to install the following dependencies:
- Run
sudo apt-get install python-tk
- Optional: If you intend to work with MSSQL databases, run
sudo pip install pyodbc
- Note: you might run into trouble with the pyodbc dependency. You may first need to run
sudo apt-get install unixodbc-dev
then retry
sudo pip install pyodbc
Credit: Stack Overflow.
- Once you have the dependencies satisfied, install healthcareai using one, and only one, of the methods below (ordered from easiest to hardest).
- Recommended: Install the latest release with pip. Run
pip install healthcareai
or
sudo pip install healthcareai
- If you know what you're doing and instead want the bleeding-edge version direct from our GitHub repo, run
pip install https://github.com/HealthCatalyst/healthcareai-py/zipball/master
- Install healthcareai using one, and only one, of the methods below (ordered from easiest to hardest).
- Recommended: Install the latest release with pip. Run
pip install healthcareai
or
sudo pip install healthcareai
- If you know what you're doing and instead want the bleeding-edge version direct from our GitHub repo, run
pip install https://github.com/HealthCatalyst/healthcareai-py/zipball/master
After running healthcareai, you may see an error about Python not being installed as a framework. The cause is that the default rendering backend is Cocoa. To fix it, change matplotlib's rendering engine as follows (credit: GitHub user):
I assume you have installed the pip matplotlib, there is a directory in your root called ~/.matplotlib. Create a file ~/.matplotlib/matplotlibrc there and add the following code:
backend: TkAgg
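If you would rather script that change than edit the file by hand, the sketch below writes the `backend` line for you. It assumes you pass in the matplotlib configuration directory described above; the helper name `set_matplotlib_backend` is hypothetical, not part of matplotlib or healthcareai. The demonstration runs against a temporary directory so that no real configuration is touched.

```python
import tempfile
from pathlib import Path


def set_matplotlib_backend(config_dir, backend="TkAgg"):
    """Write a `backend:` line to matplotlibrc inside config_dir.

    Any existing settings other than `backend` are preserved.
    """
    config_dir = Path(config_dir).expanduser()
    config_dir.mkdir(parents=True, exist_ok=True)
    rc_path = config_dir / "matplotlibrc"

    kept = []
    if rc_path.exists():
        for line in rc_path.read_text().splitlines():
            # Drop only an existing, uncommented `backend:` setting.
            key = line.split(":", 1)[0].strip()
            if key != "backend":
                kept.append(line)

    kept.append(f"backend: {backend}")
    rc_path.write_text("\n".join(kept) + "\n")
    return rc_path


# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    rc_file = set_matplotlib_backend(tmp)
    print(rc_file.read_text())
```

To apply it for real, call the function with your own matplotlib configuration directory instead of the temporary one.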
- Install Docker
- Clone this repo (look for the green button on the repo main page)
- cd into the cloned directory
- Run
docker build -t healthcareai .
- Run the Docker instance with
docker run -p 8888:8888 healthcareai
- You should then have a Jupyter notebook available at http://localhost:8888.
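If the notebook page does not load, a quick first check is whether anything is listening on the published port at all. This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and port 8888 is taken from the `docker run` command above.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if port_is_open("localhost", 8888):
    print("Something is listening on http://localhost:8888")
else:
    print("Nothing is listening on port 8888 yet")
```

A successful connection only shows that the port is open, not that the service behind it is Jupyter.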
To verify that healthcareai installed correctly:
- Open a terminal and run
python
or
ipython
Either of these opens an interactive Python console (also known as a REPL).
- Then enter this command:
from healthcareai import SupervisedModelTrainer
and hit enter. If no error is thrown, you are ready to rock.
If you did get an error or ran into other installation issues, please let us know or, better yet, post on Stack Overflow (with the healthcare-ai tag) so we can help others along this process.
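When narrowing down an installation problem, it can also be useful to check which packages Python can locate without importing them. A minimal sketch, assuming only the standard library; `missing_modules` is a hypothetical helper name.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# pyodbc is only needed if you plan to work with MSSQL databases.
missing = missing_modules(["healthcareai", "pyodbc"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All modules found")
```

Finding a module does not guarantee that importing it succeeds, so the import test in the console remains the real check.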
- Read through the docs on this site.
- Start with either
example_regression_1.py
or
example_classification_1.py
using our sample diabetes dataset.
- Modify the queries and parameters to match your data.
- Decide on what kind of prediction output you want.
- Set up your database tables to match the output schema. See the prediction types document for details.
- If you are working in a Health Catalyst EDW ecosystem (primarily MSSQL), please see the Catalyst EDW Instructions for SAM setup.
- Please see the databases docs for details about writing to different outputs (MSSQL, MySQL, SQLite, CSV).
- Double check that the code follows the examples in these documents.
- If you're still seeing an error, post a question on Stack Overflow using the healthcare-ai tag. Please provide:
- Details on your environment (OS, database type, R vs. Python)
- Goals (i.e., what you are trying to accomplish)
- Crystal clear steps to reproduce the error
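To gather the environment details for such a post, something like the following sketch may save some typing. It uses only the standard library, and `environment_report` is a hypothetical helper; the database type and your goals still need to be added by hand.

```python
import platform
import sys


def environment_report():
    """Collect basic OS and Python details for a bug report."""
    return {
        "os": platform.platform(),
        "python_version": platform.python_version(),
        "python_executable": sys.executable,
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```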