fix: resolve 25 test regressions from streaming retain pipeline #836
Merged
nicoloboschi merged 1 commit into `main` on Apr 1, 2026
The 3-phase retain pipeline (914ba79) introduced several regressions:

1. **Per-content tags lost**: the streaming pipeline used `contents[0].tags` for ALL chunks, breaking tag-based visibility. Fixed by tracking a chunk-to-content mapping so each chunk uses its source content's tags.
2. **Multi-document batches broken**: batches with per-content `document_id` values were merged into a single document. Fixed by grouping by `document_id` and processing each group independently.
3. **Migration ID collision**: `d6e7f8a9b0c1` was used by both `drop_documents_metadata` and `case_insensitive_entities_trgm_index`. Renamed the trgm migration to `e8f9a0b1c2d3`, fixed the chain, and added the missing schema prefix on DROP INDEX.
4. **Graph entity inheritance**: `get_graph_data` queried entities for observation IDs only, but observations inherit entities from source memories. Fixed by querying `all_relevant_ids`.
5. **Docstring false positives**: `link_utils.py` docstrings triggered the SQL schema safety test's unqualified table reference check.
6. **Config test count**: `retain_chunk_batch_size` was added to `_CONFIGURABLE_FIELDS` without updating the test assertion.
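To make fixes 1 and 2 concrete, here is a minimal Python sketch of the idea, not the project's actual code. `Content`, `split_into_chunks`, `group_by_document_id`, and `chunk_with_mapping` are hypothetical names invented for illustration; only `tags`, `document_id`, and the notion of a chunk-to-content mapping come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Content:
    """Hypothetical stand-in for one item in a retain batch."""
    text: str
    tags: List[str] = field(default_factory=list)
    document_id: Optional[str] = None


def split_into_chunks(text: str, size: int) -> List[str]:
    """Naive fixed-width chunker, for illustration only."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def group_by_document_id(contents: List[Content]) -> Dict[Optional[str], List[Content]]:
    """Group contents by document_id, keeping first-seen order of IDs."""
    groups: Dict[Optional[str], List[Content]] = defaultdict(list)
    for content in contents:
        groups[content.document_id].append(content)
    return dict(groups)


def chunk_with_mapping(contents: List[Content], size: int = 20) -> Tuple[List[str], List[int]]:
    """Chunk every content and record which content each chunk came from."""
    chunks: List[str] = []
    chunk_to_content: List[int] = []
    for index, content in enumerate(contents):
        for piece in split_into_chunks(content.text, size):
            chunks.append(piece)
            chunk_to_content.append(index)
    return chunks, chunk_to_content


# Usage: two documents in one batch, with different tags per content.
batch = [
    Content("a" * 25, tags=["alice"], document_id="doc-1"),
    Content("b" * 10, tags=["bob"], document_id="doc-2"),
    Content("c" * 5, tags=["carol"], document_id="doc-1"),
]

for document_id, group in group_by_document_id(batch).items():
    chunks, chunk_to_content = chunk_with_mapping(group)
    for chunk, source in zip(chunks, chunk_to_content):
        # Look up tags through the mapping instead of using group[0].tags.
        print(document_id, repr(chunk), group[source].tags)
```

The point of the mapping is that a chunk never has to guess its metadata: the index stored alongside it leads back to the content it was cut from, so tags stay attached even when several contents are chunked together.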
Summary
Fixes 25 test regressions introduced by the 3-phase streaming retain pipeline (#722):
1. **Per-content tags lost**: the streaming pipeline applied `contents[0].tags` to every chunk, breaking tag-based memory visibility/isolation. Fixed by adding a `chunk_to_content` mapping so each chunk preserves its source content's tags, context, event_date, etc.
2. **Multi-document batches broken**: batches with per-content `document_id` values were merged into a single document with a random UUID. Fixed by grouping contents by `document_id` and processing each group independently.
3. **Migration ID collision**: two migrations shared the ID `d6e7f8a9b0c1`. Renamed the trgm index migration to `e8f9a0b1c2d3`, fixed the dependency chain, and added the missing schema prefix on DROP INDEX for multi-tenant correctness.
4. **Graph entity inheritance**: `get_graph_data` queried `unit_entities` for observation IDs only, but observations inherit entities from source memories via `source_memory_ids`. Fixed by querying `all_relevant_ids`.
5. **Docstring false positives**: `link_utils.py` docstrings contained SQL-like patterns that triggered the unqualified table reference safety check.
6. **Config test count**: `retain_chunk_batch_size` was added to `_CONFIGURABLE_FIELDS` without updating the hierarchical config test.

Test plan