fix: prioritize mentioned_at over occurred_start in temporal scoring by dcbouius · Pull Request #177 · vectorize-io/hindsight

dcbouius · 2026-01-19T15:55:13Z

Problem

When users pass a timestamp to retain(), they expect that timestamp to be used for temporal queries. However, the current implementation prioritizes occurred_start (LLM-extracted) over mentioned_at (user-provided), causing unintuitive behavior.

Current behavior:

# User passes explicit timestamp
await client.retain(bank_id, "Professor rating is 5 stars", {
    timestamp: new Date('2025-01-15')  # User's intent: this happened on Jan 15
})

# But temporal scoring prioritizes:
# 1. occurred_start (LLM-extracted, often null or wrong)
# 2. mentioned_at (user-provided, always correct) ← ignored if occurred_start exists

The issues:

mentioned_at is always reliably set from the user's timestamp parameter
occurred_start is non-deterministic - depends on LLM extraction, often null
Users expect their explicit timestamp to work for temporal queries

Evidence from testing:

Running the same retain() call 3 times with identical timestamps:

Memory	Run 1	Run 2	Run 3	`mentioned_at`
A	`occurred_start` ✓	`occurred_start` ✓	`occurred_start` ✗	Always ✓
B	`occurred_start` ✗	`occurred_start` ✗	`occurred_start` ✗	Always ✓
C	`occurred_start` ✗	`occurred_start` ✓	`occurred_start` ✓	Always ✓

mentioned_at is 100% reliable. occurred_start is ~44% reliable.

Solution

Reverse the priority order so user-provided timestamps take precedence:

Before (current):

best_date = None
if ep["occurred_start"] is not None and ep["occurred_end"] is not None:
    best_date = ep["occurred_start"] + (ep["occurred_end"] - ep["occurred_start"]) / 2
elif ep["occurred_start"] is not None:
    best_date = ep["occurred_start"]
elif ep["occurred_end"] is not None:
    best_date = ep["occurred_end"]
elif ep["mentioned_at"] is not None:
    best_date = ep["mentioned_at"]  # ← Only used if everything else is null

After (proposed):

# Priority: mentioned_at (user-provided) > occurred_start/end (LLM-extracted)
if ep["mentioned_at"] is not None:
    best_date = ep["mentioned_at"]  # ← User's explicit timestamp, always reliable
elif ep["occurred_start"] is not None and ep["occurred_end"] is not None:
    best_date = ep["occurred_start"] + (ep["occurred_end"] - ep["occurred_start"]) / 2
elif ep["occurred_start"] is not None:
    best_date = ep["occurred_start"]
elif ep["occurred_end"] is not None:
    best_date = ep["occurred_end"]

Rationale

User intent should be respected - When a user explicitly passes a timestamp, that's the authoritative source
Reliability over inference - A 100% reliable user-provided value should trump a ~44% reliable LLM-extracted value
Backward compatible - If user doesn't provide timestamp, mentioned_at will be null/default and LLM-extracted dates still work
Intuitive API - The timestamp parameter will now "just work" for temporal queries

Files Changed

hindsight-api/hindsight_api/engine/search/retrieval.py
- retrieve_temporal() - 2 occurrences
- retrieve_temporal_combined() - 2 occurrences
- _get_temporal_entry_points() - 1 occurrence
hindsight-api/tests/test_temporal_scoring_priority.py (NEW)
- Unit tests for best_date priority logic
- Integration tests for temporal retrieval
- Documents old vs new behavior

Testing

After this change:

Temporal queries using query_timestamp will correctly filter by mentioned_at
Facts with explicit timestamps will score correctly in temporal relevance
LLM-extracted dates still work as fallback for facts without explicit timestamps

Migration

No migration needed. This is a behavior change that makes the API work as users expect.

The current temporal scoring logic prioritizes LLM-extracted occurred_start/end over the user-provided mentioned_at timestamp. This causes issues because: 1. mentioned_at is ALWAYS reliably set from the timestamp parameter passed to retain() 2. occurred_start/end is LLM-extracted and non-deterministic (often null) 3. Users expect their explicit timestamps to be used for temporal queries This change reverses the priority so that: 1. mentioned_at (user-provided, reliable) is checked first 2. occurred_start/end (LLM-extracted) is used as fallback when mentioned_at is null Modified 5 locations in retrieval.py: - retrieve_temporal_combined() entry points and neighbors - retrieve_temporal() entry points and neighbors - _get_temporal_entry_points() Added test_temporal_scoring_priority.py with unit tests for the new behavior.

nicoloboschi

the assumption of starting from occurred is that users will ask "when X happened" rather than "when I said about X"
since mentioned_at is always NOT NULL (we set now() if not passed) this change will defeat almost entirely the purpose of occurred time extraction

dcbouius requested a review from nicoloboschi January 19, 2026 15:59

nicoloboschi requested changes Jan 19, 2026

View reviewed changes

nicoloboschi closed this Jan 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: prioritize mentioned_at over occurred_start in temporal scoring#177

fix: prioritize mentioned_at over occurred_start in temporal scoring#177
dcbouius wants to merge 1 commit intomainfrom
fix/temporal-scoring-priority

dcbouius commented Jan 19, 2026

Uh oh!

nicoloboschi left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dcbouius commented Jan 19, 2026

Problem

Current behavior:

The issues:

Evidence from testing:

Solution

Before (current):

After (proposed):

Rationale

Files Changed

Testing

Migration

Uh oh!

nicoloboschi left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants