
Commit ede5a7b

Include llama option in LLMs and add the section to tutorials

1 parent e777608 commit ede5a7b

File tree

4 files changed: +163 -126 lines changed


astro.config.mjs

Lines changed: 5 additions & 0 deletions

````diff
@@ -24,7 +24,12 @@ export default defineConfig({
           label: "Guide",
           autogenerate: { directory: "guide" },
         },
+        {
+          label: "Tutorial",
+          autogenerate: { directory: "tutorial" },
+        },
       ],
+
       editLink: {
         baseUrl: config.repoUrl,
       },
````

src/content/docs/index.mdx

Lines changed: 16 additions & 11 deletions

````diff
@@ -4,15 +4,20 @@ description: KG Guidelines for KGI4NFDI.
 template: splash
 editUrl: false
 hero:
-  tagline: Building Efficient Knowledge Graphs with the tool best suited to your research requirements.
-  image:
-    file: ../../../project/KGI4NFDI.png
-  actions:
-    - text: Read the Docs
-      link: ./guide/
-      icon: right-arrow
-      variant: primary
-    - text: About KGI4NFDI
-      link: https://base4nfdi.de/projects/kgi4nfdi
-      icon: external
+  tagline: Building Efficient Knowledge Graphs with the tool best suited to your research requirements.
+  image:
+    file: ../../../project/KGI4NFDI.png
+  actions:
+    - text: Read the Docs
+      link: ./guide/
+      icon: right-arrow
+      variant: primary
+    - text: Try the Tutorial
+      link: ./tutorial/
+      icon: right-arrow
+      variant: secondary
+
+    - text: About KGI4NFDI
+      link: https://base4nfdi.de/projects/kgi4nfdi
+      icon: external
 ---
````
Lines changed: 86 additions & 111 deletions

````diff
@@ -5,57 +5,63 @@ sidebar:
   order: 3
 ---
 
-Knowledge Graphs (KGs) combined with Large Language Models (LLMs) offer powerful solutions for data-driven applications. This guide showcases practical examples of how to integrate LLMs with Knowledge Graphs using tools like Virtuoso and Apache Jena.
+Knowledge Graphs (KGs) combined with Large Language Models (LLMs) offer powerful solutions for data-driven applications.
+This guide showcases practical examples of how to integrate LLMs with Knowledge Graphs using tools like **Virtuoso and Apache Jena**.
+
+The examples use **Llama** (an open-source LLM), but you can also use OpenAI models if you have an API key.
 
 ---
 
 ## Example 1: Querying Knowledge Graphs with LLMs
 
 <details>
 
-### Overview
+### **Overview**
+
+In this example, we demonstrate how to **query a Virtuoso Knowledge Graph using a Large Language Model (LLM)** to retrieve meaningful insights from structured data.
 
-In this example, we demonstrate how to query a Virtuoso Knowledge Graph using a Large Language Model (LLM) to retrieve meaningful insights from structured data. The core idea is to bridge the gap between natural language queries and structured data stored in RDF format within Virtuoso. The integration leverages `llama_index`, an interface that connects LLMs to structured data sources like SPARQL endpoints.
+The key idea is to **bridge the gap between natural language queries and structured data stored in RDF format within Virtuoso**.
+The integration leverages `llama_index`, an interface that connects LLMs to structured data sources like **SPARQL endpoints**.
 
 ---
 
-## Prerequisites
+## **Prerequisites**
 
-### System Requirements:
+### **System Requirements**
 
-- **Python 3.x** installed.
-- **Virtuoso Server** running with SPARQL authentication enabled.
+- **Python 3.x** installed
+- **Virtuoso Server** running with SPARQL authentication enabled
 
-### Required Installations:
+### **Required Installations**
 
-1. **Uninstall existing LlamaIndex (if any):**
+#### **1️⃣ Install LlamaIndex and Dependencies**
 
-   ```bash
-   pip uninstall llama_index -y
-   ```
+```bash
+pip uninstall llama_index -y  # Remove old versions
+pip install git+https://github.com/OpenLinkSoftware/llama_index
+```
 
-2. **Install OpenLink's fork of LlamaIndex:**
+#### **2️⃣ Install Llama for Local Use**
 
-   ```bash
-   pip install git+https://github.com/OpenLinkSoftware/llama_index
-   ```
+```bash
+pip install llama-cpp-python
+```
 
-3. **Set OpenAI API Key:**
+#### **3️⃣ Set Up Llama Model**
 
-   ```bash
-   export OPENAI_API_KEY=your_openai_api_key_here
-   ```
+Download a Llama model (e.g., `llama-2-7b.Q4_K_M.gguf`) and place it in your working directory.
 
-4. **Create a directory for graph data storage:**
-   ```bash
-   mkdir llama_storage_graph
-   ```
+#### **4️⃣ Create a Directory for Graph Data Storage**
+
+```bash
+mkdir llama_storage_graph
+```
 
 ---
 
-## Configuration
+## **Configuration**
 
-### SPARQL Endpoint Details:
+### **SPARQL Endpoint Details**
 
 Update the following connection details in your Python script:
 
````
````diff
@@ -67,34 +73,21 @@ USER = 'dba'
 PASSWORD = 'dba'
 ```
 
-### OpenAI API Configuration:
-
-```python
-import os
-import openai
-
-openai.api_key = os.environ["OPENAI_API_KEY"]
-```
-
 ---
 
-## Full Python Code (`llama_test.py`)
+## **Full Python Code (`llama_test.py`)**
 
 ```python
-from llama_index import download_loader
 import os
 from llama_index import KnowledgeGraphIndex, ServiceContext
 from llama_index.storage.storage_context import StorageContext
 from llama_index.graph_stores import SparqlGraphStore
-from llama_index.llms import OpenAI
-from llama_index import load_index_from_storage
-import openai
+from llama_cpp import Llama
 
-# OpenAI API Key
-openai.api_key = os.environ["OPENAI_API_KEY"]
+# Load the Llama model
+llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", temperature=0.5)
 
-# Initialize LLM
-llm = OpenAI(temperature=0, model="text-davinci-002")
+# Initialize ServiceContext with Llama
 service_context = ServiceContext.from_defaults(llm=llm, chunk_size=512)
 
 # Virtuoso SPARQL Endpoint Configuration
````
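Independently of `llama_index`, you can sanity-check that the Virtuoso SPARQL endpoint is reachable over plain HTTP. A stdlib-only sketch that builds (but does not send) a request; the `localhost:8890` URL assumes Virtuoso's default port, and authentication with the tutorial's `USER`/`PASSWORD` is left to the graph store itself:

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:8890/sparql"  # assumed Virtuoso default; match your config

def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a GET request for a SPARQL query, asking for JSON results."""
    params = urllib.parse.urlencode({
        "query": query,
        "format": "application/sparql-results+json",
    })
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )

req = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(req.full_url)
```

Opening the printed URL (e.g. with `urllib.request.urlopen(req)`) should return a JSON result set if the endpoint is up.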
````diff
@@ -116,7 +109,7 @@ graph_store = SparqlGraphStore(
 
 # Load Index from Storage
 storage_context = StorageContext.from_defaults(persist_dir='./llama_storage_graph', graph_store=graph_store)
-kg_index = load_index_from_storage(
+kg_index = KnowledgeGraphIndex(
     storage_context=storage_context,
     service_context=service_context,
     max_triplets_per_chunk=10,
````
````diff
@@ -143,112 +136,79 @@ print(str(response_graph_rag))
 
 ---
 
-## Running the Code
+## **Running the Code**
 
-Execute the script in your terminal:
+Execute the script:
 
 ```bash
 python llama_test.py
 ```
 
 ---
 
-## Expected Output
-
-When the code is executed, we expect the output to provide an insightful answer extracted from the Knowledge Graph:
+## **Expected Output**
 
 ```bash
 Ken thinks about his identity, purpose, and the meaning of life, reflecting on his role beyond just being a supporting character.
 ```
 
-This response is generated based on the RDF triples extracted from the Virtuoso knowledge graph.
+This response is generated based on RDF triples extracted from the Virtuoso knowledge graph.
 
 ---
 
-## Key Concepts
+## **Key Concepts**
 
 - **Virtuoso Integration:** The example connects to a Virtuoso SPARQL endpoint for querying RDF data.
-- **LLM Query Processing:** LLM enhances the query with natural language understanding, making it user-friendly.
+- **LLM Query Processing:** The LLM enhances the query with natural language understanding, making it user-friendly.
 - **Knowledge Graph Indexing:** The Knowledge Graph Index improves retrieval efficiency by organizing data into meaningful chunks.
 
----
-
-## Troubleshooting Tips
-
-- **Connection Errors:** Ensure Virtuoso is running and accessible via the specified SPARQL endpoint.
-- **Authentication Issues:** Verify that the provided `USER` and `PASSWORD` have the necessary SPARQL access rights.
-- **API Key Errors:** Confirm that the OpenAI API key is correctly set in the environment variables.
+</details>
 
 ---
 
-## Expanding the Dataset
-
-You can modify the SPARQL queries to explore more data points. For example:
-
-```python
-response_graph_rag = kg_rag_query_engine.query("Who is Barbie?")
-print(str(response_graph_rag))
-```
-
-### Expected Output:
-
-```bash
-Barbie is a character who thinks about becoming human and living in the real world. She also contemplates what it means to be human.
-```
-
-</details>
-
-## Example 2: Extracting Triples from Text Using LLMs
+## **Example 2: Extracting Triples from Text Using LLMs**
 
 <details>
 
-### Overview
+### **Overview**
+
+This example demonstrates how to **automatically extract structured knowledge from unstructured text**.
+Using **Llama**, we transform plain text into **triples (subject, predicate, object)** suitable for **building Knowledge Graphs**.
 
-In this example, we showcase how to automatically extract structured knowledge (in the form of triples: subject, relation, object) from unstructured text using a Large Language Model (LLM). The goal is to transform plain text into a format suitable for building knowledge graphs, which can later be queried using SPARQL or integrated with systems like Virtuoso or Apache Jena.
+---
 
-### Full Python Code (`kg_generator.py`)
+## **Full Python Code (`kg_generator.py`)**
 
 ```python
-from openai import OpenAI
+import os
 import csv
+from llama_cpp import Llama
+
+# Load Llama model
+llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", temperature=0.5)
 
-# Set up OpenAI API key
-client = OpenAI(api_key="")
+def query_llm(prompt):
+    response = llm(prompt, max_tokens=200)
+    return response["choices"][0]["text"].strip()
 
 # Sample text for extracting entities and relationships
 text = """
 Barack Obama was born in Hawaii. He was the 44th President of the United States.
 Michelle Obama is his wife. They have two daughters, Malia and Sasha.
 """
 
-# Function to extract entities and relationships
-def extract_entities_relations(text):
-    prompt = f"""
-    Extract entities and relationships from the following text in the form of triples (subject, relation, object):
-
-    Text: {text}
+# Generate extraction prompt
+prompt = f"""
+Extract entities and relationships from the following text in the form of triples (subject, relation, object):
 
-    Format:
-    (Subject, Relation, Object)
-    """
+Text: {text}
 
-    # Correct ChatCompletion API call using the instantiated client
-    response = client.chat.completions.create(
-        model="gpt-4",
-        messages=[
-            {"role": "system", "content": "You are an assistant that extracts entities and relationships from text."},
-            {"role": "user", "content": prompt}
-        ],
-        temperature=0.5,
-        max_tokens=200
-    )
-
-    return response.choices[0].message.content.strip()
+Format:
+(Subject, Relation, Object)
+"""
 
-# Extract entities and relationships from the sample text
-extracted_triples = extract_entities_relations(text)
+extracted_triples = query_llm(prompt)
 
-# Display the extracted triples
 print("Extracted Triples:")
 print(extracted_triples)
 
````
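The extraction script prints raw LLM text; turning lines in the requested `(Subject, Relation, Object)` format into rows for the CSV step can be done with a small parser. A sketch assuming the model follows the format; `parse_triples` and `triples_to_csv` are illustrative helpers, not part of `kg_generator.py`:

```python
import csv
import io
import re

def parse_triples(raw: str) -> list:
    """Parse lines like '(Barack Obama, born in, Hawaii)' into 3-tuples."""
    triples = []
    for match in re.finditer(r"\(([^()]*)\)", raw):
        parts = [p.strip() for p in match.group(1).split(",")]
        if len(parts) == 3:  # ignore malformed lines
            triples.append(tuple(parts))
    return triples

def triples_to_csv(triples) -> str:
    """Render triples as CSV text with the header used in the tutorial."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Subject", "Predicate", "Object"])
    writer.writerows(triples)
    return buf.getvalue()

sample = """(Barack Obama, born in, Hawaii)
(Michelle Obama, wife of, Barack Obama)"""
print(triples_to_csv(parse_triples(sample)))
```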

````diff
@@ -265,7 +225,7 @@ print(f"\nTriples successfully saved to {csv_filename}")
 
 ---
 
-### Running the Code
+## **Running the Code**
 
 Execute the script:
 
````
````diff
@@ -275,7 +235,7 @@ python kg_generator.py
 
 ---
 
-### Expected Output
+## **Expected Output**
 
 ```bash
 Extracted Triples:
````
````diff
@@ -290,7 +250,7 @@ Triples successfully saved to extracted_triples.csv
 
 ---
 
-### CSV Output
+## **CSV Output**
 
 ```csv
 Subject,Predicate,Object
````
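Virtuoso and Apache Jena both load RDF serializations directly, so the CSV rows can also be emitted as simple N-Triples. A minimal sketch; the `http://example.org/` namespace is a hypothetical placeholder, subjects and predicates become URIs, and objects are kept as plain literals without escaping:

```python
import urllib.parse

BASE = "http://example.org/"  # hypothetical namespace for minted URIs

def to_ntriples(rows) -> str:
    """Serialize (subject, predicate, object) rows as one N-Triples line each."""
    def uri(label: str) -> str:
        # Mint a URI from a label: spaces become underscores, then percent-encode.
        return f"<{BASE}{urllib.parse.quote(label.replace(' ', '_'))}>"
    return "\n".join(f'{uri(s)} {uri(p)} "{o}" .' for s, p, o in rows)

rows = [("Barack Obama", "born in", "Hawaii")]
print(to_ntriples(rows))
# → <http://example.org/Barack_Obama> <http://example.org/born_in> "Hawaii" .
```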
````diff
@@ -303,6 +263,21 @@ Triples successfully saved to extracted_triples.csv
 
 This file can now be imported into Virtuoso or Apache Jena as part of a knowledge graph.
 
+</details>
+
+---
+
+## **Alternative: Using OpenAI**
+
+If you prefer a cloud-based solution, you can replace **Llama** with **OpenAI** by installing `openai` and setting up an API key:
+
+```bash
+pip install openai
+export OPENAI_API_KEY=your_openai_api_key_here
+```
+
+Modify the code to replace **Llama** with OpenAI’s `gpt-4` model.
+
 ---
 
 </details>
````
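Because both backends reduce to "prompt in, text out", the Llama-to-OpenAI swap suggested above can be isolated behind a single callable, so the rest of the script never changes. A sketch using a stub backend; the commented Llama and OpenAI wirings follow the tutorial's code but are assumptions and are not exercised here:

```python
from typing import Callable

def make_query_llm(complete: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap any text-completion backend (Llama, OpenAI, ...) behind one interface."""
    def query_llm(prompt: str) -> str:
        return complete(prompt).strip()
    return query_llm

# Llama backend (as in the tutorial, assumed):
#   llm = Llama(model_path="llama-2-7b.Q4_K_M.gguf", temperature=0.5)
#   complete = lambda p: llm(p, max_tokens=200)["choices"][0]["text"]
# OpenAI backend (assumed):
#   client = OpenAI()
#   complete = lambda p: client.chat.completions.create(
#       model="gpt-4", messages=[{"role": "user", "content": p}]
#   ).choices[0].message.content

query_llm = make_query_llm(lambda p: "  (A, rel, B)  ")  # stub backend for illustration
print(query_llm("extract triples"))  # → (A, rel, B)
```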
