Add hindsight-llamaindex package providing persistent memory tools for LlamaIndex agents via the native BaseToolSpec pattern. Includes retain, recall, and reflect tools, a convenience factory, global config, a full test suite, a docs page, a blog post, and an integrations.json entry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix ReActAgent API: `from_tools()` → constructor, `chat()` → `await run()`
- Add create_bank step to all quickstart examples
- Add production patterns section to docs (tags, error handling, bank lifecycle)
- Add memory scoping recommendation to README
- Add when-not-to-use section to blog post
- Add LlamaIndex compatibility tests (agent acceptance, `FunctionTool.call`)
- Fix self-hosted auth wording in cookbook notebook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use `await client.acreate_bank()` instead of sync `create_bank()` to avoid "event loop already running" errors in notebooks and async contexts
- Wrap plain Python examples in `async def main()` + `asyncio.run(main())` so they are copy-paste runnable as scripts
- Add Jupyter notebook tip to docs showing the top-level await pattern
- Bank lifecycle example in docs now uses async `acreate_bank`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
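The `async def main()` + `asyncio.run(main())` pattern the commit describes can be sketched as follows. `FakeHindsightClient` and its `acreate_bank` method are stand-ins for the real SDK client, not its actual API:

```python
import asyncio


class FakeHindsightClient:
    # Stand-in for the real Hindsight client (hypothetical name).
    # acreate_bank mirrors the async bank-creation call described above.
    async def acreate_bank(self, name: str) -> dict:
        # The real call would hit the Hindsight API; here we just echo.
        return {"bank": name, "created": True}


async def main() -> None:
    client = FakeHindsightClient()
    bank = await client.acreate_bank("demo")
    print(bank)


if __name__ == "__main__":
    # Runnable as a plain script; in a Jupyter notebook, use top-level
    # `await main()` instead, since the kernel already runs an event loop.
    asyncio.run(main())
```

Calling `asyncio.run()` from inside a running loop raises `RuntimeError`, which is why the notebook tip recommends top-level `await`.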
HindsightToolSpec now provides both sync and async tool implementations
using LlamaIndex's (sync_fn, async_fn) tuple pattern in spec_functions.
Async agents (ReActAgent, etc.) use aretain/arecall/areflect natively,
avoiding the "Timeout context manager should be used inside a task"
error that occurred when sync _run_async() was called from within an
active event loop.
- Add aretain_memory, arecall_memory, areflect_on_memory async methods
- Extract shared kwargs builders (_retain_kwargs, _recall_kwargs, etc.)
- spec_functions now uses tuples: [("retain_memory", "aretain_memory"), ...]
- Tests verify tools have both sync fn and async fn set
- Notebook verified end-to-end with nbclient against local Hindsight
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
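The `(sync_fn, async_fn)` tuple pattern can be illustrated with a pure-Python sketch. This deliberately omits the `BaseToolSpec` base class and the real client calls, so the method bodies and `_retain_kwargs`-style helpers here are placeholders:

```python
import asyncio


class HindsightToolSpecSketch:
    # Sketch only: the real class subclasses LlamaIndex's BaseToolSpec.
    # Each entry pairs a sync tool name with its async counterpart, so
    # async agents invoke the a*() coroutines natively instead of
    # bridging event loops through a sync wrapper.
    spec_functions = [
        ("retain_memory", "aretain_memory"),
        ("recall_memory", "arecall_memory"),
    ]

    async def aretain_memory(self, content: str) -> str:
        # Real version would await the Hindsight client here.
        return f"retained: {content}"

    def retain_memory(self, content: str) -> str:
        # Sync wrapper; only safe when no event loop is already running.
        return asyncio.run(self.aretain_memory(content))

    async def arecall_memory(self, query: str) -> list[str]:
        return [f"memory matching {query!r}"]

    def recall_memory(self, query: str) -> list[str]:
        return asyncio.run(self.arecall_memory(query))
```

With both variants registered, an async agent never calls the sync wrapper, which is what avoids the "Timeout context manager" error described above.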
The blog post will be pulled in separately from its own PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nicoloboschi
left a comment
Review
Overall the PR is well-structured — clean separation of concerns (tools.py, config.py, _client.py, errors.py), comprehensive tests (34 unit tests), and good docs. A few things worth discussing:
Package structure doesn't follow LlamaIndex conventions
The PR uses hindsight_llamaindex/ as a standalone package. The standard LlamaIndex community integration uses namespace packages:
```
# PR has:
hindsight-integrations/llamaindex/hindsight_llamaindex/

# LlamaIndex convention:
llama-index-tools-hindsight/llama_index/tools/hindsight/base.py
```
With LlamaIndex, `llama_index/` and `llama_index/tools/` must NOT have `__init__.py` (PEP 420 implicit namespace packages). The main class goes in `base.py`, not `tools.py`.
If the package is ever submitted to LlamaHub, or users expect `from llama_index.tools.hindsight import HindsightToolSpec`, it won't work. Fine as a standalone package, but it breaks the LlamaIndex ecosystem path.
Suggestion: Either restructure now, or document explicitly that this is standalone (not LlamaHub) and plan migration later.
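If restructuring, the packaging config would need to ship the namespace directories without `__init__.py` files. A hypothetical sketch assuming Poetry (the tool LlamaIndex integrations commonly use — verify against current LlamaHub templates before copying):

```toml
[tool.poetry]
name = "llama-index-tools-hindsight"
version = "0.1.0"
description = "Hindsight memory tools for LlamaIndex agents"
# Include the llama_index/ tree as-is; with no __init__.py at the
# llama_index/ or llama_index/tools/ levels, PEP 420 namespace
# resolution merges it with the core llama-index package.
packages = [{ include = "llama_index/" }]

[tool.poetry.dependencies]
python = ">=3.9"
llama-index-core = ">=0.10.0"
```

The key point is `packages = [{ include = "llama_index/" }]` combined with the absence of `__init__.py` in the namespace directories.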
Missing BaseMemory implementation
The PR only implements BaseToolSpec (agent-driven retain/recall/reflect). LlamaIndex also has BaseMemory which provides automatic memory — get() enriches prompts with recalled memories transparently, put() auto-retains messages.
The Mem0 integration (llama-index-memory-mem0) does both: BaseMemory for automatic recall/retain + tools for explicit agent control. This would align better with how Claude Code 0.3.x works — recall injected automatically into UserPromptSubmit hooks, retain on Stop events.
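The automatic-memory behavior being requested can be sketched without the real `BaseMemory` interface (whose exact method signatures vary by LlamaIndex version, so this is a pure-Python analogy, not the actual API):

```python
from typing import Callable


class AutoMemorySketch:
    # Sketch of automatic memory: get() transparently enriches the
    # prompt with recalled facts, put() auto-retains each message.
    # The real implementation would subclass llama_index's BaseMemory.
    def __init__(
        self,
        recall_fn: Callable[[str], list[str]],  # e.g. wraps Hindsight recall
        retain_fn: Callable[[str], None],       # e.g. wraps Hindsight retain
    ) -> None:
        self._recall = recall_fn
        self._retain = retain_fn

    def get(self, query: str) -> list[str]:
        # Called before the LLM turn: inject recalled memories
        # without the agent having to invoke a tool.
        return self._recall(query)

    def put(self, message: str) -> None:
        # Called after each message: persist it automatically.
        self._retain(message)
```

This mirrors the Claude Code 0.3.x split the reviewer describes: automatic recall/retain via hooks, plus tools for explicit agent control.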
Consistency gaps with other Hindsight integrations
| Feature | Claude Code 0.3.x | CrewAI | This PR |
|---|---|---|---|
| Auto-recall (inject into prompt) | ✅ (hook) | ✅ (Storage.search) | ❌ (tool only) |
| Auto-retain (on conversation end) | ✅ (hook) | ✅ (Storage.save) | ❌ (tool only) |
| Bank mission setup | ✅ | ✅ | ❌ |
| `document_id` generation | ✅ (session+timestamp) | N/A | Static only |
| `async: true` retain | ✅ | ✅ | ❌ |
| `context` source label | ✅ ("claude-code") | ✅ | ❌ |
| Error handling | Graceful (logs, continues) | Graceful | Raises HindsightError |
Specific items:
- No `document_id` auto-generation — Claude Code generates `{session_id}-{timestamp}` for upsert/grouping. This PR only supports a static `retain_document_id`.
- No bank mission management — other integrations call `set_bank_mission()` on first use so the memory engine has context for fact extraction.
- No `context` param on retain — Claude Code passes `"claude-code"` as the source label.
- Retain doesn't use `async: true` — other integrations use async retain for non-blocking storage. Valid for tool-based usage (the agent expects confirmation) but worth noting.
Minor items
- `config.py` — default URL is production (https://api.hindsight.vectorize.io) while CrewAI defaults to `http://localhost:8888`. Should be consistent across integrations.
- `_client.py` — hardcoded `timeout: 30.0`. Claude Code uses different timeouts per operation (retain: 15s, recall: 10s, health: 5s).
- Blog post referenced in PR description but not in the diff.
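One way to address the timeout item is a small per-operation config object instead of a single constant; this is a hypothetical sketch, not code from any of the integrations:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationTimeouts:
    # Hypothetical config mirroring Claude Code's per-operation values,
    # replacing _client.py's single hardcoded 30.0s timeout.
    retain: float = 15.0
    recall: float = 10.0
    health: float = 5.0
```

The client would then pass `timeouts.retain`, `timeouts.recall`, etc. to each HTTP call rather than one shared value.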
Summary
- `hindsight-llamaindex` package: `BaseToolSpec` subclass + `create_hindsight_tools()` factory giving LlamaIndex agents persistent memory via retain/recall/reflect
- `hindsight-docs/docs/sdks/integrations/llamaindex.md`
- `hindsight-docs/blog/2026-03-23-llamaindex-memory.md`
- `integrations.json` entry

Test plan
- `uv run pytest tests/ -v` — 34 passed, 3 skipped (manual)
- `uv run ruff check . && uv run ruff format --check .` — clean
- `test_manual.py`

🤖 Generated with Claude Code