Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
223 changes: 223 additions & 0 deletions hindsight-docs/docs/sdks/integrations/llamaindex.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,223 @@
---
sidebar_position: 8
---

# LlamaIndex

Persistent long-term memory for [LlamaIndex](https://docs.llamaindex.ai/) agents via Hindsight. Uses LlamaIndex's native `BaseToolSpec` pattern to expose retain, recall, and reflect as tools that any LlamaIndex agent can use.

## Features

- **Native BaseToolSpec** — Implements `BaseToolSpec` so tools work with any LlamaIndex agent (ReAct, FunctionCalling, etc.)
- **Three Memory Operations** — retain (store), recall (search), and reflect (synthesize) as individual tools
- **Selective Tools** — Use `to_tool_list(spec_functions=...)` or `include_retain/recall/reflect` flags to expose only the tools you need
- **Global + Per-Call Config** — Set defaults via `configure()`, override per-call
- **Full Hindsight Feature Set** — Tags, metadata, document grouping, fact type filtering, entity extraction, reflect schemas

## Installation

```bash
pip install hindsight-llamaindex
```

## Quick Start: Tool Spec

Use `HindsightToolSpec` directly for full control.

```python
import asyncio
from hindsight_client import Hindsight
from hindsight_llamaindex import HindsightToolSpec
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent

async def main():
client = Hindsight(base_url="http://localhost:8888")

# Create the memory bank first (one-time setup)
await client.acreate_bank("user-123", name="User 123 Memory")

spec = HindsightToolSpec(client=client, bank_id="user-123")
tools = spec.to_tool_list()

agent = ReActAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
response = await agent.run("Remember that I prefer dark mode")
print(response)

asyncio.run(main())
```

:::tip Jupyter Notebooks
In notebooks, use top-level `await` directly — no `asyncio.run()` needed:
```python
await client.acreate_bank("user-123", name="User 123 Memory")
response = await agent.run("Remember that I prefer dark mode")
```
:::

## Quick Start: Factory Function

Use `create_hindsight_tools()` for a simpler API.

```python
import asyncio
from hindsight_client import Hindsight
from hindsight_llamaindex import create_hindsight_tools
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent

async def main():
client = Hindsight(base_url="http://localhost:8888")
tools = create_hindsight_tools(client=client, bank_id="user-123")

agent = ReActAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
response = await agent.run("What do you remember about me?")
print(response)

asyncio.run(main())
```

## Selecting Tools

### Via `to_tool_list()`

```python
spec = HindsightToolSpec(client=client, bank_id="user-123")

# Only recall and reflect — no retain
tools = spec.to_tool_list(spec_functions=["recall_memory", "reflect_on_memory"])
```

### Via factory flags

```python
tools = create_hindsight_tools(
client=client,
bank_id="user-123",
include_retain=True,
include_recall=True,
include_reflect=False, # exclude reflect
)
```

## Configuration

### Global config

Set connection and default parameters once. All subsequent tool creation will use these unless overridden.

```python
from hindsight_llamaindex import configure

configure(
hindsight_api_url="http://localhost:8888",
api_key="your-api-key", # or set HINDSIGHT_API_KEY env var
budget="mid",
tags=["source:llamaindex"],
)

# Now you can create tools without passing client/url
tools = create_hindsight_tools(bank_id="user-123")
```

### Per-call overrides

Pass parameters directly to `HindsightToolSpec()` or `create_hindsight_tools()` to override global config.

## API Reference

### `HindsightToolSpec()`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `bank_id` | `str` | *required* | Hindsight memory bank to operate on |
| `client` | `Hindsight` | `None` | Pre-configured Hindsight client |
| `hindsight_api_url` | `str` | `None` | API URL (used if no client provided) |
| `api_key` | `str` | `None` | API key (used if no client provided) |
| `budget` | `str` | `None` → `"mid"` | Recall/reflect budget: `low`, `mid`, `high` |
| `max_tokens` | `int` | `None` → `4096` | Max tokens for recall results |
| `tags` | `list[str]` | `None` | Tags applied when storing memories |
| `recall_tags` | `list[str]` | `None` | Tags to filter recall results |
| `recall_tags_match` | `str` | `None` → `"any"` | Tag matching: `any`, `all`, `any_strict`, `all_strict` |
| `retain_metadata` | `dict[str, str]` | `None` | Default metadata for retain operations |
| `retain_document_id` | `str` | `None` | Document ID for retain (groups/upserts memories) |
| `recall_types` | `list[str]` | `None` | Fact types: `world`, `experience`, `opinion`, `observation` |
| `recall_include_entities` | `bool` | `False` | Include entity info in recall results |
| `reflect_context` | `str` | `None` | Additional context for reflect |
| `reflect_max_tokens` | `int` | `None` | Max tokens for reflect (defaults to `max_tokens`) |
| `reflect_response_schema` | `dict` | `None` | JSON schema to constrain reflect output |
| `reflect_tags` | `list[str]` | `None` | Tags for reflect (defaults to `recall_tags`) |
| `reflect_tags_match` | `str` | `None` | Tag matching for reflect (defaults to `recall_tags_match`) |

### `create_hindsight_tools()`

Accepts all `HindsightToolSpec` parameters plus:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `include_retain` | `bool` | `True` | Include the retain (store) tool |
| `include_recall` | `bool` | `True` | Include the recall (search) tool |
| `include_reflect` | `bool` | `True` | Include the reflect (synthesize) tool |

### `configure()`

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `hindsight_api_url` | `str` | Production URL | Hindsight API URL |
| `api_key` | `str` | `None` | API key (falls back to `HINDSIGHT_API_KEY` env var) |
| `budget` | `str` | `"mid"` | Default recall budget |
| `max_tokens` | `int` | `4096` | Default max tokens |
| `tags` | `list[str]` | `None` | Default retain tags |
| `recall_tags` | `list[str]` | `None` | Default recall filter tags |
| `recall_tags_match` | `str` | `"any"` | Default tag matching mode |
| `verbose` | `bool` | `False` | Enable verbose logging |

## Production Patterns

### Memory Scoping with Tags

Use tags to organize memories by source, conversation, or topic:

```python
spec = HindsightToolSpec(
client=client,
bank_id="user-123",
tags=["source:chat", "session:abc"], # applied to all retains
recall_tags=["source:chat"], # filter recalls to chat memories
recall_tags_match="any", # match any tag (default)
)
```

For multi-tenant applications, use one bank per user and tags per context (e.g., `project:X`, `channel:support`).

### Error Handling

All tool methods raise `HindsightError` on failure. Wrap agent execution to handle memory errors gracefully:

```python
from hindsight_llamaindex import HindsightError

try:
response = await agent.run("What do you know about me?")
except HindsightError as e:
# Memory unavailable — agent can still function without memory
logger.warning(f"Memory error: {e}")
```

### Bank Lifecycle

Banks must be created before use and should be created once per user/entity:

```python
# One-time setup (e.g., during user onboarding)
await client.acreate_bank(f"user-{user_id}", name=f"{user_name}'s Memory")

# Subsequent agent creation — bank already exists
spec = HindsightToolSpec(client=client, bank_id=f"user-{user_id}")
```

## Requirements

- Python 3.10+
- `llama-index-core >= 0.11.0`
- `hindsight-client >= 0.4.0`
10 changes: 10 additions & 0 deletions hindsight-docs/src/data/integrations.json
Original file line number Diff line number Diff line change
Expand Up @@ -120,6 +120,16 @@
"link": "/sdks/integrations/langgraph",
"icon": "/img/icons/langgraph.png"
},
{
"id": "llamaindex",
"name": "LlamaIndex",
"description": "Add persistent memory to LlamaIndex agents via the native BaseToolSpec pattern. Retain, recall, and reflect as standard tools.",
"type": "official",
"by": "hindsight",
"category": "framework",
"link": "/sdks/integrations/llamaindex",
"icon": "/img/icons/llamaindex.png"
},
{
"id": "nemoclaw",
"name": "NemoClaw",
Expand Down
121 changes: 121 additions & 0 deletions hindsight-integrations/llamaindex/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
# hindsight-llamaindex

LlamaIndex integration for [Hindsight](https://github.com/vectorize-io/hindsight) — persistent long-term memory for AI agents.

Provides Hindsight memory as a native LlamaIndex `BaseToolSpec`, giving agents retain/recall/reflect capabilities through LlamaIndex's standard tool interface.

## Prerequisites

- A running Hindsight instance ([self-hosted via Docker](https://github.com/vectorize-io/hindsight#quick-start) or [Hindsight Cloud](https://ui.hindsight.vectorize.io/signup))
- Python 3.10+

## Installation

```bash
pip install hindsight-llamaindex
```

## Quick Start: Tool Spec

Use `HindsightToolSpec` directly for full control over tool creation.

```python
import asyncio
from hindsight_client import Hindsight
from hindsight_llamaindex import HindsightToolSpec
from llama_index.llms.openai import OpenAI
from llama_index.core.agent import ReActAgent

async def main():
client = Hindsight(base_url="http://localhost:8888")

# Create the memory bank first (one-time setup)
await client.acreate_bank("user-123", name="User 123 Memory")

spec = HindsightToolSpec(client=client, bank_id="user-123")
tools = spec.to_tool_list()

agent = ReActAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
response = await agent.run("Remember that I prefer dark mode")
print(response)

asyncio.run(main())
```

### Selective Tools

Use `to_tool_list(spec_functions=...)` to include only the tools you need:

```python
# Only recall and reflect — no retain
tools = spec.to_tool_list(spec_functions=["recall_memory", "reflect_on_memory"])
```

## Quick Start: Factory Function

Use `create_hindsight_tools()` for a simpler API with include/exclude flags.

```python
from hindsight_llamaindex import create_hindsight_tools

tools = create_hindsight_tools(
client=client,
bank_id="user-123",
include_reflect=False, # only retain + recall
)

agent = ReActAgent(tools=tools, llm=llm)
```

## Configuration

### Global config

```python
from hindsight_llamaindex import configure

configure(
hindsight_api_url="http://localhost:8888",
api_key="your-api-key", # or set HINDSIGHT_API_KEY env var
budget="mid",
tags=["source:llamaindex"],
)
```

### Per-call overrides

All factory functions accept `client`, `hindsight_api_url`, and `api_key` to override the global config.

| Parameter | Description | Default |
|-----------|-------------|---------|
| `hindsight_api_url` | Hindsight API URL | `https://api.hindsight.vectorize.io` |
| `api_key` | API key (or `HINDSIGHT_API_KEY` env var) | `None` |
| `budget` | Recall budget: `low`, `mid`, `high` | `mid` |
| `max_tokens` | Max tokens for recall results | `4096` |
| `tags` | Tags applied to retain operations | `None` |
| `recall_tags` | Tags to filter recall results | `None` |
| `recall_tags_match` | Tag matching: `any`, `all`, `any_strict`, `all_strict` | `any` |

## Memory Scoping

Use one bank per user/entity and tags to organize memories by context:

```python
spec = HindsightToolSpec(
client=client,
bank_id=f"user-{user_id}", # one bank per user
tags=["source:chat", "project:X"], # scope retains by context
recall_tags=["source:chat"], # filter recalls to chat memories
)
```

## Requirements

- Python 3.10+
- `llama-index-core >= 0.11.0`
- `hindsight-client >= 0.4.0`

## Documentation

- [Integration docs](https://docs.hindsight.vectorize.io/docs/sdks/integrations/llamaindex)
- [Hindsight API docs](https://docs.hindsight.vectorize.io)
Loading
Loading