9 add notebook demo (#10)
* Update API

* Update index

* Fix repo_url

* Update getting-started

* Updates with model loader

* Update step-by-step guide

* Update defaults

* Update contributions

* reorder

* Lint

* Fix import

* fix test_workflow

* Update README.md

* Add demo/notebook

* Add heading

* Remove outdated text

* Update Python version

* Update name
daavoo authored Jan 17, 2025
1 parent 1868b90 commit 28ae1d9
Showing 8 changed files with 262 additions and 27 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -1,29 +1,29 @@
<p align="center"><img src="./images/Blueprints-logo.png" width="35%" alt="Project logo"/></p>

# Structured-Q&A: a Blueprint by Mozilla.ai for answering questions about structured documents.
# Structured-QA: a Blueprint by Mozilla.ai for answering questions about structured documents.


[![](https://dcbadge.limes.pink/api/server/YuMNeuKStr?style=flat)](https://discord.gg/YuMNeuKStr)
[![Docs](https://github.com/mozilla-ai/structured-q-a/actions/workflows/docs.yaml/badge.svg)](https://github.com/mozilla-ai/structured-q-a/actions/workflows/docs.yaml/)
[![Tests](https://github.com/mozilla-ai/structured-q-a/actions/workflows/tests.yaml/badge.svg)](https://github.com/mozilla-ai/structured-q-a/actions/workflows/tests.yaml/)
[![Ruff](https://github.com/mozilla-ai/structured-q-a/actions/workflows/lint.yaml/badge.svg?label=Ruff)](https://github.com/mozilla-ai/structured-q-a/actions/workflows/lint.yaml/)
[![Docs](https://github.com/mozilla-ai/structured-qa/actions/workflows/docs.yaml/badge.svg)](https://github.com/mozilla-ai/structured-qa/actions/workflows/docs.yaml/)
[![Tests](https://github.com/mozilla-ai/structured-qa/actions/workflows/tests.yaml/badge.svg)](https://github.com/mozilla-ai/structured-qa/actions/workflows/tests.yaml/)
[![Ruff](https://github.com/mozilla-ai/structured-qa/actions/workflows/lint.yaml/badge.svg?label=Ruff)](https://github.com/mozilla-ai/structured-qa/actions/workflows/lint.yaml/)


This Blueprint demonstrates how to use open-source models and a simple LLM workflow to answer questions based on structured documents.

It is designed to showcase a simpler alternative to more complex and/or resource demanding alternatives, such as RAG systems that rely on vectorDBs and/or long-context models with large token windows.


### 👉 📖 For more detailed guidance on using this project, please visit our [Docs here](https://mozilla-ai.github.io/structured-q-a/).
### 👉 📖 For more detailed guidance on using this project, please visit our [Docs here](https://mozilla-ai.github.io/structured-qa/).


## Quick-start

Get started with structured-q-a using one of the options below:
Get started with structured-qa using one of the options below:

| Google Colab | HuggingFace Spaces | GitHub Codespaces |
| -------------| ------------------- | ----------------- |
| [![Try on Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mozilla-ai/structured-q-a/blob/main/demo/notebook.ipynb) | [![Try on Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Try%20on-Spaces-blue)](https://huggingface.co/spaces/mozilla-ai/structured-q-a) | [![Try on Codespaces](https://github.com/codespaces/badge.svg)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=888426876&skip_quickstart=true&machine=standardLinux32gb) |
| [![Try on Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mozilla-ai/structured-qa/blob/main/demo/notebook.ipynb) | [![Try on Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Try%20on-Spaces-blue)](https://huggingface.co/spaces/mozilla-ai/structured-qa) | [![Try on Codespaces](https://github.com/codespaces/badge.svg)](https://github.com/codespaces/new?hide_repo_select=true&ref=main&repo=888426876&skip_quickstart=true&machine=standardLinux32gb) |

Alternatively, you can install it from pypi:

2 changes: 1 addition & 1 deletion demo/app.py
@@ -24,7 +24,7 @@ def convert_to_sections(uploaded_file, output_dir):
)


st.title("Structured Q&A")
st.title("Structured QA")

st.header("Uploading Data")

239 changes: 239 additions & 0 deletions demo/notebook.ipynb
@@ -0,0 +1,239 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Structured Q&A"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Source code: https://github.com/mozilla-ai/structured-qa"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Docs: https://mozilla-ai.github.io/structured-qa"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## GPU Check"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, you'll need to enable GPUs for the notebook:\n",
"\n",
"- Navigate to `Edit`→`Notebook Settings`\n",
"- Select T4 GPU from the Hardware Accelerator section\n",
"- Click `Save` and accept.\n",
"\n",
"Next, we'll confirm that we can connect to the GPU:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"\n",
"if not torch.cuda.is_available():\n",
" raise RuntimeError(\"GPU not available\")\n",
"else:\n",
" print(\"GPU is available!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --quiet https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.4-cu122/llama_cpp_python-0.3.4-cp311-cp311-linux_x86_64.whl\n",
"%pip install --quiet structured-qa"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Uploading data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import files\n",
"\n",
"uploaded = files.upload()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Converting document to a directory of sections"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pathlib import Path\n",
"from structured_qa.preprocessing import document_to_sections_dir\n",
"\n",
"input_file = list(uploaded.keys())[0]\n",
"sections_dir = f\"output/{Path(input_file).stem}\"\n",
"section_names = document_to_sections_dir(input_file, sections_dir)\n",
"section_names"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from structured_qa.model_loaders import load_llama_cpp_model\n",
"\n",
"model = load_llama_cpp_model(\n",
" \"bartowski/Qwen2.5-3B-Instruct-GGUF/Qwen2.5-3B-Instruct-f16.gguf\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Find, Retrieve, and Answer"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"FIND_PROMPT = \"\"\"\n",
"You are given two pieces of information:\n",
"1. A user question.\n",
"2. A list of valid section names.\n",
"\n",
"Your task is to:\n",
"- Identify exactly one `section_name` from the provided list that seems related to the user question.\n",
"- Return the `section_name` exactly as it appears in the list.\n",
"- Do NOT return any additional text, explanation, or formatting.\n",
"- Do NOT combine multiple section names into a single response.\n",
"\n",
"Here is the list of valid `section_names`:\n",
"\n",
"```\n",
"{SECTIONS}\n",
"```\n",
"\n",
"Now, based on the input question, return the single most relevant `section_name` from the list.\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ANSWER_PROMPT = \"\"\"\n",
"You are a rigorous assistant answering questions.\n",
"You only answer based on the current information available.\n",
"\n",
"The current information available is:\n",
"\n",
"```\n",
"{CURRENT_INFO}\n",
"```\n",
"\n",
"If the current information available is not enough to answer the question,\n",
"you must return the following message and nothing else:\n",
"\n",
"```\n",
"I need more info.\n",
"```\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"QUESTION = \"What optimizer was used to train the model?\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from structured_qa.workflow import find_retrieve_answer\n",
"\n",
"find_retrieve_answer(\n",
" question=QUESTION,\n",
" model=model,\n",
" sections_dir=sections_dir,\n",
" find_prompt=FIND_PROMPT,\n",
" answer_prompt=ANSWER_PROMPT,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
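
The notebook above passes `FIND_PROMPT` and `ANSWER_PROMPT` to `find_retrieve_answer`, but the diff does not show what happens inside that function. The following is a minimal sketch of one way a find, retrieve, answer loop could work. It is not the `structured_qa` implementation. The function `sketch_find_retrieve_answer`, the `ask` callable, the `toy_ask` stand-in, the shortened prompts, the `.txt` section layout, and the use of `str.format` to fill the `{SECTIONS}` and `{CURRENT_INFO}` placeholders are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

NEED_MORE = "I need more info."

# Shortened stand-ins for the notebook's FIND_PROMPT and ANSWER_PROMPT.
FIND_PROMPT = "Valid section names:\n{SECTIONS}\nReturn the single most relevant one."
ANSWER_PROMPT = "Current information:\n{CURRENT_INFO}\nIf it is not enough, reply: " + NEED_MORE


def sketch_find_retrieve_answer(question, ask, sections_dir, find_prompt, answer_prompt):
    """Hypothetical loop: find a section, retrieve its text, try to answer, repeat."""
    sections_dir = Path(sections_dir)
    # Assumption: one plain-text file per section, named after the section.
    remaining = sorted(p.stem for p in sections_dir.glob("*.txt"))
    while remaining:
        # Find: ask for one section name among those not tried yet.
        name = ask(find_prompt.format(SECTIONS="\n".join(remaining)), question).strip()
        if name not in remaining:
            break  # the reply was not a valid section name
        # Retrieve: load the chosen section's text.
        info = (sections_dir / f"{name}.txt").read_text()
        # Answer: answer from that text only, or signal that more is needed.
        reply = ask(answer_prompt.format(CURRENT_INFO=info), question).strip()
        if reply != NEED_MORE:
            return reply, name
        remaining.remove(name)
    return None, None


def toy_ask(system_prompt, question):
    """Deterministic stand-in for an LLM call, only to exercise the loop."""
    if system_prompt.startswith("Valid section names:"):
        return system_prompt.splitlines()[1]  # always pick the first listed name
    return "Adam" if "optimizer" in system_prompt else NEED_MORE


# Toy document with two sections, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "introduction.txt").write_text("This paper describes a model.")
    Path(tmp, "training.txt").write_text("The model was trained with the Adam optimizer.")
    answer, section = sketch_find_retrieve_answer(
        "What optimizer was used to train the model?",
        toy_ask, tmp, FIND_PROMPT, ANSWER_PROMPT,
    )

print(answer, section)
```

In this toy run the first section tried does not contain the answer, so the loop drops it from the candidate list and asks again. That retry step is the reason the answer prompt asks for the fixed `I need more info.` message rather than a free-form refusal.
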
8 changes: 4 additions & 4 deletions docs/future-features-contributions.md
@@ -7,18 +7,18 @@ This Blueprint is an evolving project designed to grow with the help of the open
## 🌟 **How You Can Contribute**

### 🛠️ **Enhance the Blueprint**
- Check the [Issues](https://github.com/mozilla-ai/structured-q-a/issues) page to see if there are feature requests you'd like to implement
- Refer to our [Contribution Guide](https://github.com/mozilla-ai/structured-q-a/blob/main/CONTRIBUTING.md) for more details on contributions
- Check the [Issues](https://github.com/mozilla-ai/structured-qa/issues) page to see if there are feature requests you'd like to implement
- Refer to our [Contribution Guide](https://github.com/mozilla-ai/structured-qa/blob/main/CONTRIBUTING.md) for more details on contributions

### 🎨 **Extensibility Ideas**

This Blueprint is designed to be a foundation you can build upon. By extending its capabilities, you can open the door to new applications, improve user experience, and adapt the Blueprint to address other use cases. Here are a few ideas for how you can expand its potential:


We’d love to see how you can enhance this Blueprint! If you create improvements or extend its capabilities, consider contributing them back to the project so others in the community can benefit from your work. Check out our [Contributions Guide](https://github.com/mozilla-ai/structured-q-a/blob/main/CONTRIBUTING.md) to get started!
We’d love to see how you can enhance this Blueprint! If you create improvements or extend its capabilities, consider contributing them back to the project so others in the community can benefit from your work. Check out our [Contributions Guide](https://github.com/mozilla-ai/structured-qa/blob/main/CONTRIBUTING.md) to get started!

### 💡 **Share Your Ideas**
Got an idea for how this Blueprint could be improved? You can share your suggestions through [GitHub Issues](https://github.com/mozilla-ai/structured-q-a/issues).
Got an idea for how this Blueprint could be improved? You can share your suggestions through [GitHub Issues](https://github.com/mozilla-ai/structured-qa/issues).

### 🌍 **Build New Blueprints**
This project is part of a larger initiative to create a collection of reusable starter code solutions that use open-source AI tools. If you’re inspired to create your own Blueprint, you can use the [Blueprint-template](https://github.com/new?template_name=Blueprint-template&template_owner=mozilla-ai) to get started.
8 changes: 4 additions & 4 deletions docs/getting-started.md
@@ -1,4 +1,4 @@
Get started with Structured-Q-A using one of the options below:
Get started with Structured-QA using one of the options below:

---

@@ -29,7 +29,7 @@ Get started with Structured-Q-A using one of the options below:
You can install the project from Pypi:

```bash
pip install structured-q-a
pip install structured-qa
```

Check the [Command Line Interface](./cli.md) guide.
@@ -41,8 +41,8 @@ Get started with Structured-Q-A using one of the options below:
1. **Clone the Repository**

```bash
git clone https://github.com/mozilla-ai/structured-q-a.git
cd structured-q-a
git clone https://github.com/mozilla-ai/structured-qa.git
cd structured-qa
```

2. **Install the project and its Dependencies**
6 changes: 3 additions & 3 deletions docs/index.md
@@ -1,12 +1,12 @@
# **Structured-Q-A Blueprint**
# **Structured-QA Blueprint**

<div style="text-align: center;">
<img src="images/document-to-podcast-diagram.png" alt="Project Logo" style="width: 100%; margin-bottom: 1px; margin-top: 1px;">
</div>

Blueprints empower developers to easily integrate AI capabilities into their projects using open-source models and tools.

These docs are your companion to mastering the **Structured-Q-A Blueprint** a local-first approach for answering questions about your structured documents.
These docs are your companion to mastering the **Structured-QA Blueprint** a local-first approach for answering questions about your structured documents.

### Built with
- Python 3.10+
@@ -15,7 +15,7 @@ These docs are your companion to mastering the **Structured-Q-A Blueprint** a lo
---

### 🚀 **Get Started Quickly**
#### _Start building your own Structured-Q-A pipeline in minutes:_
#### _Start building your own Structured-QA pipeline in minutes:_
- **[Getting Started](getting-started.md):** Quick setup and installation instructions.

### 🔍 **Understand the System**
6 changes: 1 addition & 5 deletions docs/step-by-step-guide.md
@@ -1,8 +1,4 @@
# **Step-by-Step Guide: How the Structured-Q-A Blueprint Works**

Transforming static documents into engaging podcast episodes involves an integration of pre-processing, LLM-powered transcript generation, and text-to-speech generation. Here's how it all works under the hood:

---
# **Step-by-Step Guide: How the Structured-QA Blueprint Works**

## **Overview**

6 changes: 3 additions & 3 deletions mkdocs.yml
@@ -1,6 +1,6 @@
site_name: Structured Q&A
repo_url: https://github.com/mozilla-ai/structured-q-a
repo_name: structured-q-a
site_name: Structured QA
repo_url: https://github.com/mozilla-ai/structured-qa
repo_name: structured-qa

nav:
- Home: index.md
