
Conversation

@csmith49
Collaborator

@csmith49 csmith49 commented Jan 5, 2026

Summary

Extends the rolling condenser API to support hard context resets, and adds one such strategy for the LLM summarizing condenser. This strategy triggers when a hard condensation request comes through but a regular condensation can't be found.

The quality of the summary and the agent's behavior afterwards are not likely to be good. This is intended solely as a last-ditch fallback option.

Changes

  • Extends the RollingCondenser base class to expose hard_context_reset, an optional strategy for last-ditch context management (see the sketch below)
  • Implements a hard context reset strategy for LLMSummarizingCondenser
  • Updates the c02 condensation integration test to check for the hard context reset instead of the expected error
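
A minimal sketch of the shape this API might take (all class names, fields, and signatures below are illustrative assumptions, not the actual openhands-sdk code):

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class View:
    # Stand-in for the SDK's view of the conversation history.
    events: list = field(default_factory=list)


@dataclass
class Condensation:
    # Stand-in for the SDK's condensation result.
    summary: str
    forgotten_event_ids: list = field(default_factory=list)


class RollingCondenser:
    def hard_context_reset(self, view: View) -> Optional[Condensation]:
        # Default: no hard-reset strategy; callers fall back to raising the
        # original condensation error.
        return None


class LLMSummarizingCondenser(RollingCondenser):
    def summarize(self, events: list) -> str:
        # Stand-in for an LLM call that compresses events into prose.
        return f"Summary of {len(events)} events."

    def hard_context_reset(self, view: View) -> Optional[Condensation]:
        # Last-ditch fallback: summarize everything and drop the full history.
        return Condensation(
            summary=self.summarize(view.events),
            forgotten_event_ids=[id(e) for e in view.events],
        )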

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the GitHub CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant    Architectures    Base Image                                    Docs / Tags
java       amd64, arm64     eclipse-temurin:17-jdk                        Link
python     amd64, arm64     nikolaik/python-nodejs:python3.12-nodejs22    Link
golang     amd64, arm64     golang:1.21-bookworm                          Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:de3797c-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-de3797c-python \
  ghcr.io/openhands/agent-server:de3797c-python

All tags pushed for this build

ghcr.io/openhands/agent-server:de3797c-golang-amd64
ghcr.io/openhands/agent-server:de3797c-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:de3797c-golang-arm64
ghcr.io/openhands/agent-server:de3797c-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:de3797c-java-amd64
ghcr.io/openhands/agent-server:de3797c-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:de3797c-java-arm64
ghcr.io/openhands/agent-server:de3797c-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:de3797c-python-amd64
ghcr.io/openhands/agent-server:de3797c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:de3797c-python-arm64
ghcr.io/openhands/agent-server:de3797c-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:de3797c-golang
ghcr.io/openhands/agent-server:de3797c-java
ghcr.io/openhands/agent-server:de3797c-python

About Multi-Architecture Support

  • Each variant tag (e.g., de3797c-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., de3797c-python-amd64) are also available if needed

@github-actions
Contributor

github-actions bot commented Jan 5, 2026

Coverage

Coverage Report

File                                Stmts    Miss    Cover    Missing
openhands-sdk/openhands/sdk/context/condenser
   base.py                             45       3      93%    62, 176–177
   llm_summarizing_condenser.py       105       6      94%    130–132, 261–262, 264
TOTAL                               15131    4442      70%

@csmith49 csmith49 added the condenser-test label (Triggers a run of all condenser integration tests) on Jan 12, 2026
@github-actions
Contributor

Hi! I started running the condenser tests on your PR. You will receive a comment with the results shortly.

Note: These are non-blocking tests that validate condenser functionality across different LLMs.

@github-actions
Contributor

Condenser Test Results (Non-Blocking)

These tests validate condenser functionality and do not block PR merges.

🧪 Condenser Tests Results

Overall Success Rate: 100.0%
Total Cost: $0.82
Models Tested: 2
Timestamp: 2026-01-12 22:29:50 UTC

📊 Summary

Model                                               Overall    Tests Passed    Skipped    Total    Cost     Tokens
litellm_proxy_gpt_5.1_codex_max                     100.0%     2/2             3          5        $0.02    24,860
litellm_proxy_anthropic_claude_opus_4_5_20251101    100.0%     5/5             0          5        $0.80    394,617

📋 Detailed Results

litellm_proxy_gpt_5.1_codex_max

  • Success Rate: 100.0% (2/2)
  • Total Cost: $0.02
  • Token Usage: prompt: 23,788, completion: 1,072, cache_read: 14,464, reasoning: 576
  • Run Suffix: litellm_proxy_gpt_5.1_codex_max_585f429_gpt51_condenser_run_N5_20260112_222615
  • Skipped Tests: 3

Skipped Tests:

  • c01_thinking_block_condenser: Model litellm_proxy/gpt-5.1-codex-max does not support extended thinking or reasoning effort
  • c05_size_condenser: This test stresses long repetitive tool loops to trigger size-based condensation. GPT-5.1 Codex Max often declines such requests for efficiency/safety reasons.
  • c04_token_condenser: This test stresses long repetitive tool loops to trigger token-based condensation. GPT-5.1 Codex Max often declines such requests for efficiency/safety reasons.

litellm_proxy_anthropic_claude_opus_4_5_20251101

  • Success Rate: 100.0% (5/5)
  • Total Cost: $0.80
  • Token Usage: prompt: 380,120, completion: 14,497, cache_read: 334,767, cache_write: 30,897, reasoning: 2,013
  • Run Suffix: litellm_proxy_anthropic_claude_opus_4_5_20251101_585f429_opus_condenser_run_N5_20260112_222607

@csmith49 csmith49 marked this pull request as ready for review January 12, 2026 22:45
Collaborator

@all-hands-bot all-hands-bot left a comment


Overall this is a solid implementation of a fallback strategy for handling unrecoverable context situations. The approach is reasonable, but there are a few important issues around error handling and observability that should be addressed. See inline comments for details.

if hard_reset_condensation is not None:
    return hard_reset_condensation

# In all other situations re-raise the exception.
Collaborator


🟢 Nit: Comment could be more specific about which situations trigger re-raise.

For clarity, consider:

Suggested change

- # In all other situations re-raise the exception.
+ # Re-raise if: (1) request is HARD but hard_context_reset returned None,
+ # or (2) request is neither SOFT nor HARD
  raise e
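
For context, a rough self-contained sketch of the control flow this suggestion targets (the exception type, request enum, and helper names are assumptions for illustration, not the actual condenser code):

from enum import Enum, auto


class CondensationRequest(Enum):
    # Assumed request kinds; the real SDK may model soft/hard requests differently.
    SOFT = auto()
    HARD = auto()


class ContextWindowExceededError(Exception):
    pass


def condense(condenser, view, request):
    try:
        return condenser.get_condensation(view)
    except ContextWindowExceededError as e:
        if request is CondensationRequest.SOFT:
            # Assumed soft path: a regular condensation attempt.
            return condenser.soft_condensation(view)
        if request is CondensationRequest.HARD:
            hard_reset_condensation = condenser.hard_context_reset(view)
            if hard_reset_condensation is not None:
                return hard_reset_condensation
        # Re-raise if: (1) request is HARD but hard_context_reset returned None,
        # or (2) request is neither SOFT nor HARD
        raise e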

@csmith49
Collaborator Author

This PR looks like it works -- until you try to continue a conversation after a hard reset. The View objects can only support a single summary, so things start to break after the first non-hard-reset condensation.

I'm working on an extension to address this issue in a separate PR, and will proceed with this one when that feature is merged.
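
To make the single-summary limitation above concrete, a toy sketch (the View shape and helper below are assumptions for illustration, not the SDK's actual model):

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class View:
    events: list = field(default_factory=list)
    summary: Optional[str] = None  # only one slot for a summary


def apply_condensation(view: View, summary: str, kept_events: list) -> View:
    if view.summary is not None:
        # A summary from an earlier hard reset is already present, so a later
        # regular condensation has nowhere to put its own summary without
        # overwriting or dropping the first one.
        raise ValueError("View already holds a summary")
    return View(events=kept_events, summary=summary)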

@openhands-ai

openhands-ai bot commented Jan 13, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Agent Server
    • Pre-commit checks

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #1596 at branch `feat/hard-context-reset`

Feel free to include any additional details that might help me get this PR into a better state.


