Pigeon is HackMIT's RAG email assistant. It helps automate the help email workflow.
Pigeon uses Flask for its backend framework and React for its frontend framework. It is built on an in-memory Redis database that stores embedding data for documents and emails, along with Postgres for longer-term database management. The emailing service is automated with AWS; see below for details.
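As a rough illustration of what retrieval over stored embeddings involves, here is a minimal sketch using only the standard library. It is not Pigeon's implementation: the in-memory `store` dict stands in for Redis, and the document ids, vectors, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, store, k=3):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from an embedding model.
store = {
    "faq:parking": [0.9, 0.1, 0.0],
    "faq:travel": [0.1, 0.9, 0.0],
    "faq:teams": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], store, k=2)
```

The retrieved documents would then be passed to the response generator as context for drafting a reply.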
pigeon/
├── .devcontainer # Dev container configuration
├── README.md
├── requirements.txt # Python dependencies list for backend
├── .env # Stores secrets that are not in VCS
├── wsgi.py # Flask entry point
├── scripts
│   ├── deploy.sh # Deploy to production server
│   └── devcontainer_setup.sh # Run once when the dev container initializes
├── client
│   ├── public
│   │   └── pigeon.png
│   ├── src
│   │   ├── App.tsx
│   │   ├── main.css
│   │   ├── main.tsx
│   │   ├── routes
│   │   │   ├── documents.module.css
│   │   │   ├── documents.tsx
│   │   │   ├── inbox.module.css
│   │   │   ├── inbox.tsx
│   │   │   ├── index.module.css
│   │   │   └── index.tsx
│   │   ├── shell.tsx
│   │   └── vite-env.d.ts
└── server
    ├── config.py
    ├── controllers
    │   ├── admin.py
    │   ├── emails.py
    │   └── faq.py
    ├── email_template
    │   └── template.html
    ├── models
    │   ├── document.py
    │   ├── email.py
    │   ├── response.py
    │   └── thread.py
    └── nlp
        ├── corpus_harvard.json
        ├── corpus_mit.json
        ├── embeddings.json
        ├── embeddings.py
        └── responses.py
Docker is required. See the setup instructions below.
Copy .env.sample to .env and fill in the necessary values. You should be able to find them on Slack.
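To catch values that were missed while filling in .env, a small check like the following can help. This is a sketch, not part of the repository: it assumes both files use plain `KEY=value` lines, and the function names are hypothetical.

```python
from pathlib import Path


def parse_env(text):
    """Parse KEY=value lines into a dict, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def missing_keys(sample_text, env_text):
    """Return keys listed in the sample that are absent or empty in the env file."""
    sample = parse_env(sample_text)
    env = parse_env(env_text)
    return sorted(key for key in sample if not env.get(key))


if __name__ == "__main__":
    sample_path, env_path = Path(".env.sample"), Path(".env")
    if sample_path.exists() and env_path.exists():
        for key in missing_keys(sample_path.read_text(), env_path.read_text()):
            print(f"missing or empty: {key}")
```

Run it from the repository root; it prints nothing when every key in .env.sample has a value in .env.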
Using Dev Containers is strongly recommended when doing development work on Pigeon. Containers are a way of provisioning identical environments from computer to computer, and they alleviate a lot of headache when it comes to installing dependencies, setting up Postgres, etc.
To use Dev Containers you must first install Visual Studio Code and Docker. Then you must install the Remote Containers extension for Visual Studio Code.
To use Docker, install it here. To check if your installation is working, try running
docker run hello-world
If you get a message on your screen that starts with "Hello from Docker!", your installation is working.
After that, follow this tutorial to get your environment set up. Make sure you open this repository as a dev container in VSCode.
Note: It can take a few minutes to provision the container if it is your first time starting it up.
To start the server, run
python3 wsgi.py
To start the client, in a different terminal, run
cd client
npm run dev
The Postgres and Redis services run on the same network as the dev container, but containers on that network can only reach one another via the designated service names, which are database and redis respectively. If you want to view these services from inside the dev container, you can use the following commands:
# postgres
PGPASSWORD='password' psql -h database -U postgres
# redis
redis-cli -h redis
Once inside redis-cli, check the Redis keys with
keys *
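If psql or redis-cli cannot connect, it can help to first confirm that the service names resolve and accept connections. The sketch below uses only the standard library. The host names come from the text above; the ports are the Postgres and Redis defaults and are an assumption about this setup, and the function names are hypothetical.

```python
import socket

# Service names from the dev container setup; default ports are assumed.
SERVICES = {"database": 5432, "redis": 6379}


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_services(services):
    """Map each host name to whether its port accepts connections."""
    return {host: is_reachable(host, port) for host, port in services.items()}
```

Calling `check_services(SERVICES)` from inside the dev container should report both services as reachable; from the host machine the names will typically not resolve.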
Alternatively, you can access these services from your local machine, i.e., outside of the dev container, by connecting directly to the docker containers. To do this, run
docker container ls
and retrieve the container id of your desired instance (e.g., pigeon_devcontainer-database-1). Then, run
docker exec -it <container_id> /bin/bash
to enter the container, from which you should be able to run psql or redis-cli directly.
Put all direct dependencies (i.e., packages we directly import) in requirements.in. pigar can be used to automate part of this process. Then, run
pip-compile
to generate requirements.txt, which contains pinned versions of subdependencies as well.
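One way to sanity-check the result is to confirm that every package named in requirements.in also appears in the compiled requirements.txt. The sketch below is an illustration only, with hypothetical function names; it handles simple requirement lines and ignores options and comments.

```python
import re


def package_names(text):
    """Return normalized package names from a requirements-style file."""
    names = set()
    for line in text.splitlines():
        # Drop comments (including pip-compile's "# via ..." annotations).
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and option lines such as -r or --hash.
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Normalize so that Flask_Login and flask-login compare equal.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def unpinned(requirements_in, requirements_txt):
    """Return direct dependencies that are missing from the compiled file."""
    return sorted(package_names(requirements_in) - package_names(requirements_txt))
```

An empty list from `unpinned(...)` means every direct dependency made it into requirements.txt; anything else suggests pip-compile needs to be rerun.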
All emails are forwarded to Pigeon through AWS. More specifically, emails are received with a receipt rule and forwarded to an S3 bucket; a Lambda then processes them and forwards them to the API. Both receiving and sending are handled by SES.
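The processing step in that pipeline can be sketched roughly as below. This is not the deployed Lambda: it only shows the two pure pieces, reading object locations out of an S3 event notification and turning a raw email into a JSON-ready dict. The payload field names are hypothetical, and fetching the object from S3 (for example with boto3's `get_object`) and posting to the API are left out.

```python
from email import message_from_bytes, policy
from urllib.parse import unquote_plus


def s3_objects(event):
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def build_payload(raw_email):
    """Parse raw email bytes into a dict (field names are illustrative)."""
    msg = message_from_bytes(raw_email, policy=policy.default)
    body = msg.get_body(preferencelist=("plain", "html"))
    return {
        "from": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
        "message_id": str(msg.get("Message-ID", "")),
        "body": body.get_content() if body is not None else "",
    }
```

A handler would loop over `s3_objects(event)`, download each object's bytes, and send `build_payload(...)` to the API.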
For instructions on setting up locally, see go/pigeon-aws.