Add dual context percentage fields to working memory endpoints #38

Merged
abrookins merged 6 commits into main from claude/issue-37-20250722-2011
Jul 26, 2025

Conversation

abrookins
Collaborator

@abrookins abrookins commented Jul 22, 2025

Summary

Add dual context percentage fields to working memory endpoints to provide comprehensive visibility into context window usage and auto-summarization triggers.

Changes

New Fields Added

  • context_percentage_total_used: Shows actual percentage of total context window currently used (0-100%)
  • context_percentage_until_summarization: Shows how much of the auto-summarization threshold has been consumed (0-100%; reaches 100% when summarization triggers)

Implementation Details

  • Updated API calculation function _calculate_context_usage_percentages() to return both values as a tuple
  • Modified both GET /v1/working-memory/{session_id} and PUT /v1/working-memory/{session_id} endpoints
  • Updated server models (WorkingMemoryResponse) with new fields
  • Updated SDK client models to match server changes
  • Added comprehensive test coverage for both fields
  • Maintains configurable summarization threshold (default 70% via SUMMARIZATION_THRESHOLD)
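
A minimal sketch of how the two values could be derived, based on the formula given in the commit notes for this PR. The function name, parameter names, and the clamping to 100 are assumptions for illustration; the actual _calculate_context_usage_percentages() in agent_memory_server/api.py may differ.

```python
from typing import Optional, Tuple


def calculate_context_usage_percentages(
    current_tokens: int,
    context_window_max: Optional[int],
    summarization_threshold: float = 0.7,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (total_used, until_summarization) as percentages.

    Hypothetical helper: returns (None, None) when the context window
    size is unknown, mirroring the "None when no model info" behavior.
    """
    if not context_window_max or context_window_max <= 0:
        return None, None
    if not 0.0 < summarization_threshold <= 1.0:
        raise ValueError("summarization_threshold must be in (0, 1]")

    # Share of the whole context window that is currently occupied.
    total_used = (current_tokens / context_window_max) * 100.0

    # Share of the summarization budget consumed; 100 marks the trigger point.
    token_threshold = context_window_max * summarization_threshold
    until_summarization = (current_tokens / token_threshold) * 100.0

    # Clamping to the documented 0-100 range is an assumption of this sketch.
    return min(total_used, 100.0), min(until_summarization, 100.0)
```

With a 128,000-token window and the default threshold, 44,800 tokens would give roughly 35% total usage and 50% of the way to summarization.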

Backward Compatibility

  • Replaces the single context_usage_percentage field added earlier in this PR; that field is removed
  • All other existing functionality is preserved, with enhanced context visibility

Benefits

Users now receive complete context information:

  1. Total Usage: See exactly how much of the context window is currently used
  2. Summarization Proximity: Know how close they are to the automatic summarization trigger
  3. Better Planning: Make informed decisions about message length and conversation flow

Testing

  • ✅ All unit tests passing (Python 3.10, 3.11, 3.12)
  • ✅ All integration tests passing (Redis 8.0.3, latest, redis-stack)
  • ✅ Comprehensive SDK client test coverage (35/35 tests)
  • ✅ Linting and formatting checks passed

Example Response

{
  "session_id": "example-session",
  "messages": [...],
  "context_percentage_total_used": 45.5,
  "context_percentage_until_summarization": 65.0,
  ...
}
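
As a rough illustration of how a client might act on these fields, the sketch below computes the headroom left before auto-summarization from a response shaped like the example. The helper name is hypothetical, and treating a missing or null field as "unknown" is an assumption carried over from the earlier single-field behavior.

```python
from typing import Any, Mapping, Optional


def summarization_headroom(response: Mapping[str, Any]) -> Optional[float]:
    """Percentage points left before auto-summarization, or None if unknown."""
    until = response.get("context_percentage_until_summarization")
    if until is None:
        # Assumed: the server omits or nulls the field without model info.
        return None
    return max(0.0, 100.0 - float(until))


# Values taken from the example response; other keys omitted for brevity.
example = {
    "session_id": "example-session",
    "context_percentage_total_used": 45.5,
    "context_percentage_until_summarization": 65.0,
}

print(summarization_headroom(example))  # 35.0
```

The two example values are consistent with the default 70% threshold: 45.5 / 0.7 = 65.0.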

Resolves #37

🤖 Generated with Claude Code

- Add context_usage_percentage field to WorkingMemoryResponse model
- Add _calculate_context_usage_percentage() helper function
- Update GET /v1/working-memory/{session_id} to return percentage
- Update PUT /v1/working-memory/{session_id} to return percentage based on final state (after potential summarization)
- Percentage calculated as (current_tokens / token_threshold) * 100 where token_threshold = context_window * 0.7
- Returns None when no model info provided, otherwise 0-100% value

Resolves #37

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Andrew Brookins <[email protected]>
@Copilot Copilot AI review requested due to automatic review settings July 22, 2025 21:09
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR adds context usage percentage tracking to working memory endpoints to help monitor how much of the context window is being utilized before auto-summarization is triggered. The change provides visibility into memory usage patterns and helps understand when summarization occurs.

Key Changes:

  • Added context_usage_percentage field to WorkingMemoryResponse model
  • Implemented calculation logic to determine percentage of context window used
  • Updated GET and PUT working memory endpoints to include the percentage in responses

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File | Description
agent_memory_server/models.py | Added context_usage_percentage field to WorkingMemoryResponse
agent_memory_server/api.py | Implemented context usage calculation and updated endpoints to return percentage
tests/test_full_integration.py | Code formatting improvements for assert statements

abrookins and others added 5 commits July 24, 2025 16:33
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Resolve TypeError by properly handling the context_usage_percentage field
in WorkingMemoryResponse creation to avoid duplicate keyword arguments.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add context_usage_percentage field to WorkingMemoryResponse model
- Add comprehensive test suite for the new field covering:
  - Field creation and default values
  - Serialization behavior
  - Validation of different percentage values
  - Dictionary-to-model conversion

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Address review comments by making the 0.7 threshold configurable instead
of hardcoded. Added summarization_threshold setting that can be configured
via environment variable or config file.

- Added summarization_threshold to Settings (default: 0.7)
- Updated both _calculate_context_usage_percentage and _summarize_working_memory
  to use settings.summarization_threshold
- Improved maintainability and consistency between functions
- Allows users to customize when summarization is triggered
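
A stdlib-only sketch of reading the threshold from the environment, assuming the variable is named SUMMARIZATION_THRESHOLD as the PR description states. The Settings class and load_settings() here are hypothetical stand-ins; the project's real Settings object and its validation rules are not shown in this PR text.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Fraction of the context window at which summarization triggers.
    summarization_threshold: float = 0.7


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    raw = env.get("SUMMARIZATION_THRESHOLD")
    if raw is None:
        return Settings()

    value = float(raw)
    # Range check is an assumption: a fraction outside (0, 1] is meaningless here.
    if not 0.0 < value <= 1.0:
        raise ValueError("SUMMARIZATION_THRESHOLD must be in (0, 1]")
    return Settings(summarization_threshold=value)
```

Passing a mapping instead of relying on os.environ keeps the function easy to test, for example load_settings({"SUMMARIZATION_THRESHOLD": "0.5"}).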

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Add context_percentage_total_used field showing actual context window usage (0-100%)
- Add context_percentage_until_summarization field showing percentage until auto-summarization triggers (0-100%)
- Update API calculation function to return both values as tuple
- Update server and SDK models with new fields
- Update comprehensive test coverage for both fields
- Remove old single context_usage_percentage field
- Maintain configurable summarization threshold (default 70%)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@abrookins abrookins changed the title from "Add context usage percentage to working memory endpoints" to "Add dual context percentage fields to working memory endpoints" Jul 25, 2025
@abrookins abrookins merged commit abb0fff into main Jul 26, 2025
10 checks passed
@abrookins abrookins deleted the claude/issue-37-20250722-2011 branch July 26, 2025 00:52
Development

Successfully merging this pull request may close these issues.

Return % of context until summarization with working memory