-
Notifications
You must be signed in to change notification settings - Fork 107
fix(condenser): View properties for capturing API assumptions #1649
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
|
Hi! I started running the condenser tests on your PR. You will receive a comment with the results shortly. Note: These are non-blocking tests that validate condenser functionality across different LLMs. |
Condenser Test Results (Non-Blocking)
🧪 Condenser Tests ResultsOverall Success Rate: 57.1% 📊 Summary
📋 Detailed Resultslitellm_proxy_gpt_5.1_codex_max
Skipped Tests:
Failed Tests:
litellm_proxy_anthropic_claude_opus_4_5_20251101
Failed Tests:
|
Summary
This PR is a WIP.
The goal is to refactor the
Viewobject so that all of the consistency checks -- batch structure, tool loops + thinking blocks, etc -- can be easily represented in parallel.Why
The
Viewinherits some behavior from v0'sConversationMemoryclass, which was responsible for making sure that certain properties held in the event stream before it was sent up to the LLM. Example properties:We expect the LLM API to preserve these features. They'll only be violated when we start to mess with the event stream...like the condenser does.
In recent PRs the
View/Condenserinteraction has been changed to rely on manipulation indices, where the condenser now tries to maintain the properties as it drops events. We've kept the "enforcement" side of things as a safety plan, but ideally it never matters.However, both of those processes are tightly-coupled (wrt the properties enforced) loops. To make it easier to manage these properties this PR works to separate the properties into classes that can be managed in parallel.
Design Decisions and Assumptions
View properties are simple classes that implement a two-function API:
enforce, which checks that the property holds, andmanipulation_indices, which produces a set of indices that the condenser can use while preserving the propertyIn theory only the manipulation indices are necessary, but experience has taught us that sometimes weird things happen. By keeping the enforcement function around and logging the output with big scary warnings we can recover from these weird things, with the option to disable enforcement or simplify the codebase in the future.
Manipulation indices
One reason this works is because the manipulation index calculations can be done independently and combined simply (just taking the intersection of the sets).
It's possible that future properties won't be representable within this framework. For example, maybe we're allowed to drop at most one thinking block from the previous message. There's no way to capture that with the intersection of manipulation indices.
That may never end up being a problem. But if so all we need to do is update the manipulation index representation and how the condensers use them. That's a big implementation lift but it still fits in this conceptual framework.
Checklist
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:7ae4e94-pythonRun
All tags pushed for this build
About Multi-Architecture Support
7ae4e94-python) is a multi-arch manifest supporting both amd64 and arm647ae4e94-python-amd64) are also available if needed