Tutorial on using LlamaIndex-ObjectIndex-Ollama
Tested on LlamaIndex 0.10.55
Uses the LlamaIndex `ObjectIndex` class to create an index of all the tables in an existing database, then persists the index to the local disk to avoid rebuilding the embeddings every time the program runs.
Reference:
https://docs.llamaindex.ai/en/stable/examples/objects/object_index/
The `LlamaIndexManager` class interfaces with a SQL database and a local Ollama LLM server, using the LlamaIndex library to create and manage an object index for efficient querying. It also supports persisting the object index to disk so it can be reloaded without being rebuilt from scratch each time.
- `DatabaseManager`: An external class responsible for managing database connections and operations.
- `OllamaEmbedding`: Embedding model served by the Ollama local server.
- `Ollama`: The LLM used for processing queries.
- `Settings`: Global settings for the LlamaIndex library.
- `SQLDatabase`: Handles the SQL database operations.
- `ObjectIndex`: Manages the index of database tables for efficient retrieval and querying.
- `SQLTableRetrieverQueryEngine`: A query engine for retrieving data from SQL tables.
The `LlamaIndexManager` is initialized with the following parameters:

- `db_manager`: An instance of `DatabaseManager`.
- `ollama_embedding_model`: The name of the Ollama embedding model.
- `ollama_base_url`: The base URL for the Ollama server.
- `ollama_llm_model`: The name of the Ollama LLM model.
- `persist_dir`: Directory path for persisting the index data.
```python
def __init__(self, db_manager: DatabaseManager,
             ollama_embedding_model, ollama_base_url, ollama_llm_model,
             persist_dir="./svs_storage"):
```
The `_create_object_index` method checks whether the necessary files exist in the persistence directory:

- If the files exist, it loads the existing index from disk.
- If they do not, it builds a new index and saves it to disk.
```python
def _create_object_index(self):
    # Create the storage directory if it does not exist
    if not os.path.exists(self.persist_dir):
        os.makedirs(self.persist_dir)

    # Check if all necessary files exist
    files_exist = all(
        os.path.exists(os.path.join(self.persist_dir, f))
        for f in ['index_store.json']
    )

    if files_exist:
        # Load the existing index from disk; note that this requires
        # both the object node mapping and the persist directory
        table_node_mapping = SQLTableNodeMapping(self.sql_database)
        obj_index = ObjectIndex.from_persist_dir(
            persist_dir=self.persist_dir,
            object_node_mapping=table_node_mapping,
        )
    else:
        # Build a new index and save it to disk
        table_node_mapping = SQLTableNodeMapping(self.sql_database)
        table_schema_objs = self._get_table_schema_objs()
        obj_index = ObjectIndex.from_objects(
            table_schema_objs,
            table_node_mapping,
            VectorStoreIndex,
        )
        obj_index.persist(persist_dir=self.persist_dir)

    return obj_index
```
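The load-or-build caching pattern used above is generic and can be sketched with the standard library alone. The `load_or_build` helper and the `build` callable below are hypothetical illustrations, not part of the tutorial's code; `index_store.json` mirrors the file the tutorial checks for.

```python
import json
import os


def load_or_build(persist_dir, build, filename="index_store.json"):
    """Load a cached object from persist_dir, or build and save it.

    `build` is any zero-argument callable returning a JSON-serializable
    object. On a cache hit the build step is skipped entirely.
    """
    os.makedirs(persist_dir, exist_ok=True)
    path = os.path.join(persist_dir, filename)
    if os.path.exists(path):
        # Cache hit: load the previously persisted object
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    # Cache miss: build the object and persist it for next time
    obj = build()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(obj, f)
    return obj
```

A second call with the same `persist_dir` skips the build step, which is exactly what saves the embedding cost across runs of the program.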
The `_create_query_engine` method sets up a query engine using `SQLTableRetrieverQueryEngine`, which enables efficient querying of the SQL database through the object index.
```python
def _create_query_engine(self):
    query_engine = SQLTableRetrieverQueryEngine(
        self.sql_database,
        self.obj_index.as_retriever(similarity_top_k=1),
        llm=self.llm_model,
        dialect="mssql",
    )
    return query_engine
```
To use the `LlamaIndexManager`, initialize it with the required parameters and then call the `get_query_engine` method to obtain the query engine for executing queries.
```python
# Initialize LlamaIndexManager
index_manager = LlamaIndexManager(
    db_manager=your_db_manager_instance,
    ollama_embedding_model='the_name_of_your_embedding_model',
    ollama_base_url='http://servername:port',
    ollama_llm_model='the_name_of_your_llm_model'
)

# Get the query engine
query_engine = index_manager.get_query_engine()
```
The `ObjectIndex` can be persisted to disk to avoid the overhead of rebuilding the index. This is managed through the `persist_dir` parameter, which determines where the index is saved to and loaded from.
```python
# Persist the object index to disk
obj_index.persist(persist_dir=self.persist_dir)
```
Beware: the `ObjectIndex` class requires that the table node mapping be passed in when reloading the index from disk:

```python
table_node_mapping = SQLTableNodeMapping(self.sql_database)
obj_index = ObjectIndex.from_persist_dir(
    persist_dir=self.persist_dir,
    object_node_mapping=table_node_mapping,
)
```
1. Install Ollama:
   https://ollama.com/download
   This code requires Ollama running on your local PC; if you wish to use another LLM API, you need to change the code.

2. Pull these two models; run these commands in the command line:

   ```
   ollama pull llama3
   ollama pull nomic-embed-text
   ```

3. Start Ollama; run this in the command line:

   ```
   ollama serve
   ```

4. Create a new directory and create a Python venv:

   ```
   mkdir myLLMProject
   cd myLLMProject
   python -m venv env
   .\env\scripts\activate.bat
   ```

5. Use pip to install all the Python libraries:

   ```
   pip install -r requirements.txt
   ```

6. Run the main program:

   ```
   python main_svs.py
   ```
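The contents of `requirements.txt` are not shown in this tutorial. As an assumption only, for LlamaIndex 0.10.x with Ollama a plausible starting point might look like the following; verify the exact package names and pins against your own project:

```
llama-index==0.10.55
llama-index-llms-ollama
llama-index-embeddings-ollama
sqlalchemy
```

The `llama-index-llms-ollama` and `llama-index-embeddings-ollama` integration packages provide the `Ollama` and `OllamaEmbedding` classes used above.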
Tested on:

- Windows 11 Pro
- Python 3.11

Used the Chinook database for testing:
https://github.com/lerocha/chinook-database
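As a quick sanity check before wiring up LlamaIndex, you can list the tables in a SQLite copy of Chinook with the standard library alone; these are the same table names the `ObjectIndex` ends up indexing. The helper name and the database path below are illustrative, not part of the tutorial's code.

```python
import sqlite3


def list_tables(db_path):
    """Return the user table names in a SQLite database, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example (path is a placeholder for your local copy of Chinook):
# print(list_tables("Chinook_Sqlite.sqlite"))
```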