- Download either the movies dataset or the ml-100k dataset and extract it. The movies dataset is larger and slower to process, but it contains more movies and ratings, including newer titles.
- Create and activate a virtual environment, then install the dependencies
python3 -m venv .venv
source .venv/bin/activate # Linux/macOS; on Windows run .venv\Scripts\activate instead
pip install -r requirements.txt
- Run the data cleanse script
python movie_rec_api/data-cleanse.py /path/to/the-extracted-dataset out
- Start the server
uvicorn movie_rec_api.main:app
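
Once the server is started, it can be handy to confirm that it is actually accepting connections. The following is a minimal sketch, not part of this project: the helper names `server_is_up` and `wait_for_server` are hypothetical, and the URL assumes uvicorn's default host and port (`127.0.0.1:8000`), which applies only if no `--host` or `--port` option was passed. Any HTTP response counts as "up", since the app may not define a route at `/`.

```python
import time
import urllib.error
import urllib.request

# Assumption: uvicorn was started without --host/--port, so it listens here.
DEFAULT_URL = "http://127.0.0.1:8000/"


def server_is_up(url: str = DEFAULT_URL, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status (e.g. 404 when "/" has no route).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, etc.: nothing is listening (yet).
        return False


def wait_for_server(url: str = DEFAULT_URL, attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if server_is_up(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_server()` from a second terminal (with the virtual environment active) returns `True` as soon as the server responds, or `False` if it never does within the allotted attempts.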