
Commit ede5a7b

Include llama option in LLMs and add the section to tutorials

1 parent e777608 commit ede5a7b

File tree

4 files changed: +163 -126 lines changed


astro.config.mjs

Lines changed: 5 additions & 0 deletions

````diff
@@ -24,7 +24,12 @@ export default defineConfig({
           label: "Guide",
           autogenerate: { directory: "guide" },
         },
+        {
+          label: "Tutorial",
+          autogenerate: { directory: "tutorial" },
+        },
       ],
+
       editLink: {
         baseUrl: config.repoUrl,
       },
````

src/content/docs/index.mdx

Lines changed: 16 additions & 11 deletions

````diff
@@ -4,15 +4,20 @@ description: KG Guidelines for KGI4NFDI.
 template: splash
 editUrl: false
 hero:
-  tagline: Building Efficient Knowledge Graphs with the tool best suited to your research requirements.
-  image:
-    file: ../../../project/KGI4NFDI.png
-  actions:
-    - text: Read the Docs
-      link: ./guide/
-      icon: right-arrow
-      variant: primary
-    - text: About KGI4NFDI
-      link: https://base4nfdi.de/projects/kgi4nfdi
-      icon: external
+  tagline: Building Efficient Knowledge Graphs with the tool best suited to your research requirements.
+  image:
+    file: ../../../project/KGI4NFDI.png
+  actions:
+    - text: Read the Docs
+      link: ./guide/
+      icon: right-arrow
+      variant: primary
+    - text: Try the Tutorial
+      link: ./tutorial/
+      icon: right-arrow
+      variant: secondary
+
+    - text: About KGI4NFDI
+      link: https://base4nfdi.de/projects/kgi4nfdi
+      icon: external
 ---
````
Lines changed: 86 additions & 111 deletions

````diff
@@ -5,57 +5,63 @@ sidebar:
   order: 3
 ---
 
-Knowledge Graphs (KGs) combined with Large Language Models (LLMs) offer powerful solutions for data-driven applications. This guide showcases practical examples of how to integrate LLMs with Knowledge Graphs using tools like Virtuoso and Apache Jena.
+Knowledge Graphs (KGs) combined with Large Language Models (LLMs) offer powerful solutions for data-driven applications.
+This guide showcases practical examples of how to integrate LLMs with Knowledge Graphs using tools like **Virtuoso and Apache Jena**.
+
+The examples use **Llama** (an open-source LLM), but you can also use OpenAI models if you have an API key.
 
 ---
 
 ## Example 1: Querying Knowledge Graphs with LLMs
 
 <details>
 
-### Overview
+### **Overview**
+
+In this example, we demonstrate how to **query a Virtuoso Knowledge Graph using a Large Language Model (LLM)** to retrieve meaningful insights from structured data.
 
-In this example, we demonstrate how to query a Virtuoso Knowledge Graph using a Large Language Model (LLM) to retrieve meaningful insights from structured data. The core idea is to bridge the gap between natural language queries and structured data stored in RDF format within Virtuoso. The integration leverages `llama_index`, an interface that connects LLMs to structured data sources like SPARQL endpoints.
+The key idea is to **bridge the gap between natural language queries and structured data stored in RDF format within Virtuoso**.
+The integration leverages `llama_index`, an interface that connects LLMs to structured data sources like **SPARQL endpoints**.
 
 ---
 
-## Prerequisites
+## **Prerequisites**
 
-### System Requirements:
+### **System Requirements**
 
-- **Python 3.x** installed.
-- **Virtuoso Server** running with SPARQL authentication enabled.
+- **Python 3.x** installed
+- **Virtuoso Server** running with SPARQL authentication enabled
 
-### Required Installations:
+### **Required Installations**
 
-1. **Uninstall existing LlamaIndex (if any):**
+#### **1️⃣ Install LlamaIndex and Dependencies**
 
-   ```bash
-   pip uninstall llama_index -y
-   ```
+```bash
+pip uninstall llama_index -y  # Remove old versions
+pip install git+https://github.com/OpenLinkSoftware/llama_index
+```
 
-2. **Install OpenLink's fork of LlamaIndex:**
+#### **2️⃣ Install Llama for Local Use**
 
-   ```bash
-   pip install git+https://github.com/OpenLinkSoftware/llama_index
-   ```
+```bash
+pip install llama-cpp-python
+```
 
-3. **Set OpenAI API Key:**
+#### **3️⃣ Set Up Llama Model**
 
-   ```bash
-   export OPENAI_API_KEY=your_openai_api_key_here
-   ```
+Download a Llama model (e.g., `llama-2-7b.Q4_K_M.gguf`) and place it in your working directory.
 
-4. **Create a directory for graph data storage:**
-   ```bash
-   mkdir llama_storage_graph
-   ```
+#### **4️⃣ Create a Directory for Graph Data Storage**
+
+```bash
+mkdir llama_storage_graph
+```
 
 ---
 
-## Configuration
+## **Configuration**
 
-### SPARQL Endpoint Details:
+### **SPARQL Endpoint Details**
 
 Update the following connection details in your Python script:
 
````
````diff
@@ -67,34 +73,21 @@ USER = 'dba'
 PASSWORD = 'dba'
 ```
 
-### OpenAI API Configuration:
-
-```python
-import os
-import openai
-
-openai.api_key = os.environ["OPENAI_API_KEY"]
-```
-
 ---
 
-## Full Python Code (`llama_test.py`)
+## **Full Python Code (`llama_test.py`)**
 
 ```python
-from llama_index import download_loader
 import os
 from llama_index import KnowledgeGraphIndex, ServiceContext
 from llama_index.storage.storage_context import StorageContext
 from llama_index.graph_stores import SparqlGraphStore
-from llama_index.llms import OpenAI
-from llama_index import load_index_from_storage
-import openai
+from llama_cpp import Llama
 
-# OpenAI API Key
-openai.api_key = os.environ["OPENAI_API_KEY"]
+# Load the Llama model
+llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", temperature=0.5)
 
-# Initialize LLM
-llm = OpenAI(temperature=0, model="text-davinci-002")
+# Initialize ServiceContext with Llama
 service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)
 
 # Virtuoso SPARQL Endpoint Configuration
````
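Independently of `llama_index`, you can sanity-check that the Virtuoso SPARQL endpoint is reachable over plain HTTP. A stdlib-only sketch that builds (but does not send) a request; the `localhost:8890` URL assumes Virtuoso's default port, and authentication with the tutorial's `USER`/`PASSWORD` is left to the graph store itself:

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:8890/sparql"  # assumed Virtuoso default; match your config

def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a GET request for a SPARQL query, asking for JSON results."""
    params = urllib.parse.urlencode({
        "query": query,
        "format": "application/sparql-results+json",
    })
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )

req = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(req.full_url)
```

Opening the printed URL (e.g. with `urllib.request.urlopen(req)`) should return a JSON result set if the endpoint is up.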
````diff
@@ -116,7 +109,7 @@ graph_store = SparqlGraphStore(
 
 # Load Index from Storage
 storage_context = StorageContext.from_defaults(persist_dir='./llama_storage_graph', graph_store=graph_store)
-kg_index = load_index_from_storage(
+kg_index = KnowledgeGraphIndex(
     storage_context=storage_context,
     service_context=service_context,
     max_triplets_per_chunk=10,
````
````diff
@@ -143,112 +136,79 @@ print(str(response_graph_rag))
 
 ---
 
-## Running the Code
+## **Running the Code**
 
-Execute the script in your terminal:
+Execute the script:
 
 ```bash
 python llama_test.py
 ```
 
 ---
 
-## Expected Output
-
-When the code is executed, we expect the output to provide an insightful answer extracted from the Knowledge Graph:
+## **Expected Output**
 
 ```bash
 Ken thinks about his identity, purpose, and the meaning of life, reflecting on his role beyond just being a supporting character.
 ```
 
-This response is generated based on the RDF triples extracted from the Virtuoso knowledge graph.
+This response is generated based on RDF triples extracted from the Virtuoso knowledge graph.
 
 ---
 
-## Key Concepts
+## **Key Concepts**
 
 - **Virtuoso Integration:** The example connects to a Virtuoso SPARQL endpoint for querying RDF data.
-- **LLM Query Processing:** LLM enhances the query with natural language understanding, making it user-friendly.
+- **LLM Query Processing:** The LLM enhances the query with natural language understanding, making it user-friendly.
 - **Knowledge Graph Indexing:** The Knowledge Graph Index improves retrieval efficiency by organizing data into meaningful chunks.
 
----
-
-## Troubleshooting Tips
-
-- **Connection Errors:** Ensure Virtuoso is running and accessible via the specified SPARQL endpoint.
-- **Authentication Issues:** Verify that the provided `USER` and `PASSWORD` have the necessary SPARQL access rights.
-- **API Key Errors:** Confirm that the OpenAI API key is correctly set in the environment variables.
+</details>
 
 ---
 
-## Expanding the Dataset
-
-You can modify the SPARQL queries to explore more data points. For example:
-
-```python
-response_graph_rag = kg_rag_query_engine.query("Who is Barbie?")
-print(str(response_graph_rag))
-```
-
-### Expected Output:
-
-```bash
-Barbie is a character who thinks about becoming human and living in the real world. She also contemplates what it means to be human.
-```
-
-</details>
-
-## Example 2: Extracting Triples from Text Using LLMs
+## **Example 2: Extracting Triples from Text Using LLMs**
 
 <details>
 
-### Overview
+### **Overview**
+
+This example demonstrates how to **automatically extract structured knowledge from unstructured text**.
+Using **Llama**, we transform plain text into **triples (subject, predicate, object)** suitable for **building Knowledge Graphs**.
 
-In this example, we showcase how to automatically extract structured knowledge (in the form of triples: subject, relation, object) from unstructured text using a Large Language Model (LLM). The goal is to transform plain text into a format suitable for building knowledge graphs, which can later be queried using SPARQL or integrated with systems like Virtuoso or Apache Jena.
+---
 
-### Full Python Code (`kg_generator.py`)
+## **Full Python Code (`kg_generator.py`)**
 
 ```python
-from openai import OpenAI
+import os
 import csv
+from llama_cpp import Llama
+
+# Load Llama model
+llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", temperature=0.5)
 
-# Set up OpenAI API key
-client = OpenAI(api_key="")
+def query_llm(prompt):
+    response = llm(prompt, max_tokens=200)
+    return response["choices"][0]["text"].strip()
 
 # Sample text for extracting entities and relationships
 text = """
 Barack Obama was born in Hawaii. He was the 44th President of the United States.
 Michelle Obama is his wife. They have two daughters, Malia and Sasha.
 """
 
-# Function to extract entities and relationships
-def extract_entities_relations(text):
-    prompt = f"""
-    Extract entities and relationships from the following text in the form of triples (subject, relation, object):
-
-    Text: {text}
+# Generate extraction prompt
+prompt = f"""
+Extract entities and relationships from the following text in the form of triples (subject, relation, object):
 
-    Format:
-    (Subject, Relation, Object)
-    """
+Text: {text}
 
-    # Correct ChatCompletion API call using the instantiated client
-    response = client.chat.completions.create(
-        model="gpt-4",
-        messages=[
-            {"role": "system", "content": "You are an assistant that extracts entities and relationships from text."},
-            {"role": "user", "content": prompt}
-        ],
-        temperature=0.5,
-        max_tokens=200
-    )
-
-    return response.choices[0].message.content.strip()
+Format:
+(Subject, Relation, Object)
+"""
 
-# Extract entities and relationships from the sample text
-extracted_triples = extract_entities_relations(text)
+extracted_triples = query_llm(prompt)
 
-# Display the extracted triples
 print("Extracted Triples:")
 print(extracted_triples)
 
````
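The extraction script prints raw LLM text; turning lines in the requested `(Subject, Relation, Object)` format into rows for the CSV step can be done with a small parser. A sketch assuming the model follows the format; `parse_triples` and `triples_to_csv` are illustrative helpers, not part of `kg_generator.py`:

```python
import csv
import io
import re

def parse_triples(raw: str) -> list:
    """Parse lines like '(Barack Obama, born in, Hawaii)' into 3-tuples."""
    triples = []
    for match in re.finditer(r"\(([^()]*)\)", raw):
        parts = [p.strip() for p in match.group(1).split(",")]
        if len(parts) == 3:  # ignore malformed lines
            triples.append(tuple(parts))
    return triples

def triples_to_csv(triples) -> str:
    """Render triples as CSV text with the header used in the tutorial."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Subject", "Predicate", "Object"])
    writer.writerows(triples)
    return buf.getvalue()

sample = """(Barack Obama, born in, Hawaii)
(Michelle Obama, wife of, Barack Obama)"""
print(triples_to_csv(parse_triples(sample)))
```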

````diff
@@ -265,7 +225,7 @@ print(f"\nTriples successfully saved to {csv_filename}")
 
 ---
 
-### Running the Code
+## **Running the Code**
 
 Execute the script:
 
````
````diff
@@ -275,7 +235,7 @@ python kg_generator.py
 
 ---
 
-### Expected Output
+## **Expected Output**
 
 ```bash
 Extracted Triples:
````
````diff
@@ -290,7 +250,7 @@ Triples successfully saved to extracted_triples.csv
 
 ---
 
-### CSV Output
+## **CSV Output**
 
 ```csv
 Subject,Predicate,Object
````
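Virtuoso and Apache Jena both load RDF serializations directly, so the CSV rows can also be emitted as simple N-Triples. A minimal sketch; the `http://example.org/` namespace is a hypothetical placeholder, subjects and predicates become URIs, and objects are kept as plain literals without escaping:

```python
import urllib.parse

BASE = "http://example.org/"  # hypothetical namespace for minted URIs

def to_ntriples(rows) -> str:
    """Serialize (subject, predicate, object) rows as one N-Triples line each."""
    def uri(label: str) -> str:
        # Mint a URI from a label: spaces become underscores, then percent-encode.
        return f"<{BASE}{urllib.parse.quote(label.replace(' ', '_'))}>"
    return "\n".join(f'{uri(s)} {uri(p)} "{o}" .' for s, p, o in rows)

rows = [("Barack Obama", "born in", "Hawaii")]
print(to_ntriples(rows))
# → <http://example.org/Barack_Obama> <http://example.org/born_in> "Hawaii" .
```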
````diff
@@ -303,6 +263,21 @@ Triples successfully saved to extracted_triples.csv
 
 This file can now be imported into Virtuoso or Apache Jena as part of a knowledge graph.
 
+</details>
+
+---
+
+## **Alternative: Using OpenAI**
+
+If you prefer a cloud-based solution, you can replace **Llama** with **OpenAI** by installing `openai` and setting up an API key:
+
+```bash
+pip install openai
+export OPENAI_API_KEY=your_openai_api_key_here
+```
+
+Modify the code to replace **Llama** with OpenAI’s `gpt-4` model.
+
 ---
 
 </details>
````
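Because both backends reduce to "prompt in, text out", the Llama-to-OpenAI swap suggested above can be isolated behind a single callable, so the rest of the script never changes. A sketch using a stub backend; the commented Llama and OpenAI wirings follow the tutorial's code but are assumptions and are not exercised here:

```python
from typing import Callable

def make_query_llm(complete: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap any text-completion backend (Llama, OpenAI, ...) behind one interface."""
    def query_llm(prompt: str) -> str:
        return complete(prompt).strip()
    return query_llm

# Llama backend (as in the tutorial, assumed):
#   llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", temperature=0.5)
#   complete = lambda p: llm(p, max_tokens=200)["choices"][0]["text"]
# OpenAI backend (assumed):
#   client = OpenAI()
#   complete = lambda p: client.chat.completions.create(
#       model="gpt-4", messages=[{"role": "user", "content": p}]
#   ).choices[0].message.content

query_llm = make_query_llm(lambda p: "  (A, rel, B)  ")  # stub backend for illustration
print(query_llm("extract triples"))  # → (A, rel, B)
```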
