
Conversation

@enyst
Collaborator

@enyst enyst commented Dec 29, 2025

Summary

This PR removes all reconciliation methods (resolve_diff_from_deserialized) and uses the provided Agent directly when restoring conversations. This is an alternative approach to issue #1451.

What Was Happening on Main

On main, when restoring a conversation, resolve_diff_from_deserialized would:

Override from runtime:

  • agent_context (skills, system_message_suffix, user_message_suffix, secrets)
  • llm secrets (api_key, aws credentials, litellm_extra_body)
  • condenser.llm secrets

Restore from persistence (and require exact match with runtime):

  • tools
  • mcp_config
  • filter_tools_regex
  • system_prompt_filename
  • security_policy_filename
  • system_prompt_kwargs
  • condenser (except its llm secrets)
  • llm config (model, temperature, etc.)

The final equality check meant users effectively couldn't change most Agent configuration between sessions.

What This PR Does

Removes reconciliation. The provided Agent is used directly, subject only to limitations that would otherwise not work at all: it must be the same Agent class and carry the same tools.

Users are now free to change Agent configuration between sessions (illustrated in the sketch after the limitations list below):

  • llm (model, api_key, all settings)
  • mcp_config
  • filter_tools_regex
  • agent_context
  • system_prompt_filename
  • security_policy_filename
  • system_prompt_kwargs
  • condenser

Limitations (these must still match between sessions):

  • tools
  • the Agent's class/type
  • non-Agent state attributes, such as the confirmation policy
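
For illustration, a hypothetical resume sketch; the import path, constructor arguments, and tool specs below are assumptions rather than the exact SDK API:

# Hypothetical sketch: resuming a persisted conversation with a different LLM.
# Import path and all constructor arguments below are assumptions.
from openhands.sdk import LLM, Agent, Conversation

agent = Agent(
    llm=LLM(model="litellm_proxy/claude-sonnet-4-5"),  # changed freely between sessions
    tools=["bash", "file_editor"],  # illustrative; must match the persisted conversation's tools
)
conversation = Conversation(
    agent=agent,
    persistence_dir="./conversations",  # argument name assumed
    conversation_id="existing-id",      # argument name assumed
)
conversation.send_message("Continue where we left off.")  # method name assumed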

Execution Flow

New Conversation:

  1. Create ConversationState with the provided Agent (Pydantic validation happens here)
  2. Initialize EventLog for event storage
  3. Save initial base state to persistence
  4. Return the new state

Restored Conversation:

  1. Load persisted base_state.json (to get conversation metadata)
  2. Verify conversation ID matches
  3. Create ConversationState with the provided Agent (Pydantic validation happens here)
  4. Restore persisted conversation metadata (execution_status, confirmation_policy, etc.)
  5. Attach EventLog to load persisted events
  6. Save updated base state (with the provided Agent)
  7. Return the resumed state
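
A toy, self-contained sketch of that restore flow (class and field names are stand-ins for the SDK's ConversationState; step 5, attaching the EventLog, is omitted):

from pydantic import BaseModel

class State(BaseModel):   # toy stand-in for ConversationState
    id: str
    agent_model: str      # stand-in for the provided Agent
    execution_status: str = "idle"

def restore(agent_model: str, conversation_id: str, store: dict) -> State:
    persisted = State.model_validate_json(store["base_state.json"])  # step 1: load base state
    if persisted.id != conversation_id:                              # step 2: verify the ID
        raise ValueError("conversation ID mismatch")
    state = State(id=conversation_id, agent_model=agent_model)       # step 3: Pydantic validates here
    state.execution_status = persisted.execution_status              # step 4: restore metadata
    store["base_state.json"] = state.model_dump_json()               # step 6: save updated base state
    return state                                                     # step 7: resumed state

store = {"base_state.json": State(id="c1", agent_model="old-model").model_dump_json()}
resumed = restore("new-model", "c1", store)  # agent config changed freely on resume
assert resumed.agent_model == "new-model"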

Validation

Pydantic validation happens when creating instances (LLM, Agent, ConversationState) via the constructor.
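
For example, with a toy model standing in for one of those classes (field names are illustrative), an invalid value fails at construction time:

from pydantic import BaseModel, ValidationError

class LLMConfig(BaseModel):  # toy stand-in, not the SDK's actual LLM class
    model: str
    temperature: float = 0.0

try:
    LLMConfig(model="gpt-4o", temperature="not-a-number")  # fails at construction
except ValidationError as err:
    print(err)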

Note on Tools

Per issue #1533, tools already used in the conversation history remain available, because the tools restriction (see AgentBase.verify() in the commits below) requires the provided Agent to carry the same tools.

Scope: LocalConversation Only

This PR only affects LocalConversation.

For RemoteConversation, the server always creates the Agent from the persisted meta.json - the client's Agent is ignored when restoring. Making RemoteConversation support Agent changes would require:

  1. Client sends new Agent config when attaching to existing conversation
  2. Server accepts and uses the new Agent config instead of persisted one

This is out of scope for this PR but could be a follow-up.

Closes #1451

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works?
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works?
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name?
  • Is the github CI passing?

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.12-nodejs22 | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:521d39f-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-521d39f-python \
  ghcr.io/openhands/agent-server:521d39f-python

All tags pushed for this build

ghcr.io/openhands/agent-server:521d39f-golang-amd64
ghcr.io/openhands/agent-server:521d39f-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:521d39f-golang-arm64
ghcr.io/openhands/agent-server:521d39f-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:521d39f-java-amd64
ghcr.io/openhands/agent-server:521d39f-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:521d39f-java-arm64
ghcr.io/openhands/agent-server:521d39f-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:521d39f-python-amd64
ghcr.io/openhands/agent-server:521d39f-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:521d39f-python-arm64
ghcr.io/openhands/agent-server:521d39f-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:521d39f-golang
ghcr.io/openhands/agent-server:521d39f-java
ghcr.io/openhands/agent-server:521d39f-python

About Multi-Architecture Support

  • Each variant tag (e.g., 521d39f-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 521d39f-python-amd64) are also available if needed

@github-actions
Contributor

github-actions bot commented Dec 29, 2025

Coverage

Coverage Report

File | Stmts | Miss | Cover | Missing
openhands-sdk/openhands/sdk/agent/base.py | 169 | 24 | 85% | 164, 170, 238–239, 250–252, 265, 273–274, 368–369, 371–375, 377–378, 389, 426–427, 437–438
openhands-sdk/openhands/sdk/conversation/state.py | 159 | 22 | 86% | 225, 255, 300–302, 318–319, 325, 331–334, 338, 344–347, 374, 392, 401, 416, 422
openhands-sdk/openhands/sdk/llm/llm.py | 402 | 153 | 61% | 344, 349, 353, 357–358, 361, 365–366, 377–378, 380–381, 385, 402, 420–423, 500–502, 523, 527, 542, 548–549, 573–574, 584, 609–614, 635–636, 639, 643, 655, 660–663, 672, 680–687, 691–694, 696, 709, 713–714, 716–717, 722–723, 725, 732, 735–740, 797–802, 859–860, 863–866, 908, 925, 979, 982, 985–993, 997–999, 1002, 1005–1007, 1014–1015, 1024, 1031–1033, 1037, 1039–1044, 1046–1063, 1066–1070, 1072–1073, 1079–1088
TOTAL | 14550 | 6912 | 52% |

@enyst enyst force-pushed the openhands/remove-reconciliation-methods branch from 76b5add to 5b3198d on December 29, 2025 19:19
Remove all reconciliation methods (resolve_diff_from_deserialized) and
use the runtime agent directly when restoring conversations.

Key changes:
- LLM: Remove resolve_diff_from_deserialized method entirely
- AgentBase: Remove resolve_diff_from_deserialized method entirely
- ConversationState.create(): Use runtime agent directly, no compatibility
  checking. User is free to change LLM, tools, condenser, agent_context,
  etc. between sessions.

Execution flow for new conversation:
1. Create ConversationState with runtime agent
   (Pydantic validation happens here)
2. Initialize EventLog for event storage
3. Save initial base state to persistence
4. Return the new state

Execution flow for restored conversation:
1. Load persisted base_state.json (only to get conversation metadata)
2. Verify conversation ID matches
3. Create ConversationState with the runtime agent
   (Pydantic validation happens here - runtime agent is always used)
4. Restore persisted conversation metadata (execution_status, etc.)
5. Attach EventLog to load persisted events
6. Save updated base state (with runtime agent)
7. Return the resumed state

NOTE: There's a case for checking that tools already used in the
conversation history are still available - see issue #1533.

Closes #1451

Co-authored-by: openhands <[email protected]>
Reintroduce tools restriction from the original reconcile method:
- Add AgentBase.load(persisted) method that validates tools match
- Tools must match between runtime and persisted agents (they may have
  been used in conversation history)
- All other config (LLM, agent_context, condenser, etc.) can change freely

Update ConversationState.create() to use agent.load() on restore path.

Co-authored-by: openhands <[email protected]>
Collaborator

@xingyaoww xingyaoww left a comment


LGTM! Just need to fix a few minor things!

enyst and others added 4 commits December 31, 2025 17:48
Address review comments:
- Rename AgentBase.load() to AgentBase.verify() since it's a verification
  method, not a load method
- Update docstring to say 'Verify that we can resume...'
- Capture return value: verified_agent = agent.verify(persisted_state.agent)
- Update tests to use verify() instead of load()

Co-authored-by: openhands <[email protected]>
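
As a concrete illustration of the verify() contract described above, here is a minimal self-contained sketch using a toy Pydantic model (everything except the verify name and the tools-must-match rule is illustrative):

from pydantic import BaseModel

class ToyAgent(BaseModel):  # toy stand-in for AgentBase
    llm_model: str
    tools: list[str]

    def verify(self, persisted: "ToyAgent") -> "ToyAgent":
        # Tools may already appear in the conversation history, so they must
        # match; everything else (LLM, agent_context, condenser) may change.
        if sorted(self.tools) != sorted(persisted.tools):
            raise ValueError("tools changed between sessions; cannot resume")
        return self

runtime = ToyAgent(llm_model="new-model", tools=["bash", "editor"])
persisted = ToyAgent(llm_model="old-model", tools=["bash", "editor"])
verified_agent = runtime.verify(persisted)  # ok: same tools, different LLM
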
Address xingyaoww's review comment: instead of creating state from scratch,
load persisted state and update specific fields. This is more future-proof -
new fields will automatically be preserved.

Co-authored-by: openhands <[email protected]>
Load persisted state but override with runtime-provided values:
- agent (verified against persisted)
- workspace
- max_iterations
- stuck_detection

Keep from persisted state:
- id, persistence_dir, execution_status, confirmation_policy
- activated_knowledge_skills, blocked_actions, blocked_messages
- secret_registry

This gives the best of both approaches: future-proof for new fields
while respecting user-provided runtime configuration.

Co-authored-by: openhands <[email protected]>
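
A small self-contained Pydantic example of the load-then-override pattern described in this commit (field names are toy stand-ins, not the SDK's):

from pydantic import BaseModel

class State(BaseModel):  # toy stand-in for ConversationState
    id: str
    execution_status: str = "idle"
    max_iterations: int = 100
    workspace: str = "/tmp"

persisted = State(id="c1", execution_status="paused", max_iterations=50)
# Start from the persisted state (future fields are preserved automatically),
# then override only the runtime-provided values.
resumed = persisted.model_copy(update={"workspace": "/work", "max_iterations": 200})
assert resumed.execution_status == "paused"  # kept from persistence
assert resumed.max_iterations == 200         # runtime override
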
@OpenHands OpenHands deleted a comment from openhands-ai bot Dec 31, 2025
Test coverage for:
- Runtime values used on resume: workspace, max_iterations
- Persisted values preserved: execution_status, stuck_detection,
  blocked_actions, blocked_messages
- Stats reset on resume (fresh session)
- Conversation ID mismatch raises error

Co-authored-by: openhands <[email protected]>
@enyst enyst marked this pull request as ready for review December 31, 2025 18:47
@enyst
Collaborator Author

enyst commented Dec 31, 2025

@OpenHands I looked at the diff of this PR and I don't see everything I expected to see. For example, in llm.py we have a class var named OVERRIDE... something, which is, I think, only used in the reconciliation method we removed from LLM. Please track it down and confirm what I said, and if it's unused let's clean it up.

Verify whether the other files in this PR, e.g. state or agentbase, have something similar, and do the same.

Review the code and clean it up from such redundancies.

@openhands-ai

openhands-ai bot commented Dec 31, 2025

I'm on it! enyst can track my progress at all-hands.dev

@openhands-ai

openhands-ai bot commented Dec 31, 2025

Final summary (new since last summary)

Double-check: request coverage

  • Tracked down LLM.OVERRIDE_ON_SERIALIZE and confirmed it was only used by the removed reconciliation method (LLM.resolve_diff_from_deserialized).
  • Confirmed it was unused after the refactor (no other references in the repo).
  • Searched for similar reconciliation/override leftovers in the other PR-touched files mentioned (agent/base.py, conversation/state.py) and found no comparable unused constants/fields to remove.
  • Removed the redundancy (the now-unused class var + related comments) from llm.py.
  • Ran pre-commit on the edited file and it passed.

Conciseness check

  • Changes are minimal and directly address the redundancy: 13 lines deleted in one file.
  • No unrelated formatting or refactors.

Git / delivery

  • Commit created with the required co-author line.
  • Changes pushed to the existing branch (openhands/remove-reconciliation-methods) to update the PR.

Diff recap (this update only):

  • openhands-sdk/openhands/sdk/llm/llm.py: removed unused OVERRIDE_ON_SERIALIZE class var block (13 deletions).


@enyst enyst added the integration-test label (Runs the integration tests and comments the results) Jan 6, 2026
@github-actions
Contributor

github-actions bot commented Jan 6, 2026

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

@github-actions
Contributor

github-actions bot commented Jan 6, 2026

🧪 Integration Tests Results

Overall Success Rate: 87.5%
Total Cost: $2.07
Models Tested: 6
Timestamp: 2026-01-06 13:00:50 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

📊 Summary

Model | Overall | Integration (Required) | Behavior (Optional) | Passed | Skipped | Total | Cost | Tokens
litellm_proxy_gpt_5.1_codex_max | 88.9% | 88.9% | N/A | 8/9 | 1 | 10 | $0.15 | 275,647
litellm_proxy_vertex_ai_gemini_3_pro_preview | 90.0% | 90.0% | N/A | 9/10 | 0 | 10 | $0.55 | 329,797
litellm_proxy_mistral_devstral_2512 | 77.8% | 77.8% | N/A | 7/9 | 1 | 10 | $0.22 | 537,515
litellm_proxy_moonshot_kimi_k2_thinking | 88.9% | 88.9% | N/A | 8/9 | 1 | 10 | $0.43 | 650,161
litellm_proxy_deepseek_deepseek_chat | 88.9% | 88.9% | N/A | 8/9 | 1 | 10 | $0.06 | 613,346
litellm_proxy_claude_sonnet_4_5_20250929 | 90.0% | 90.0% | N/A | 9/10 | 0 | 10 | $0.65 | 552,140

📋 Detailed Results

litellm_proxy_gpt_5.1_codex_max

  • Overall Success Rate: 88.9% (8/9)
  • Integration Tests (Required): 88.9% (8/10)
  • Total Cost: $0.15
  • Token Usage: prompt: 272,005, completion: 3,642, cache_read: 199,808, reasoning: 1,536
  • Run Suffix: litellm_proxy_gpt_5.1_codex_max_8e12188_gpt51_codex_run_N10_20260106_124718
  • Skipped Tests: 1

Skipped Tests:

  • t09_token_condenser: This test stresses long repetitive tool loops to trigger token-based condensation. GPT-5.1 Codex Max often declines such requests for efficiency/safety reasons.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.0016)

litellm_proxy_vertex_ai_gemini_3_pro_preview

  • Overall Success Rate: 90.0% (9/10)
  • Integration Tests (Required): 90.0% (9/10)
  • Total Cost: $0.55
  • Token Usage: prompt: 311,194, completion: 18,603, cache_read: 165,958, reasoning: 14,712
  • Run Suffix: litellm_proxy_vertex_ai_gemini_3_pro_preview_8e12188_gemini_3_pro_run_N10_20260106_124719

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.02)

litellm_proxy_mistral_devstral_2512

  • Overall Success Rate: 77.8% (7/9)
  • Integration Tests (Required): 77.8% (7/10)
  • Total Cost: $0.22
  • Token Usage: prompt: 532,261, completion: 5,254
  • Run Suffix: litellm_proxy_mistral_devstral_2512_8e12188_devstral_2512_run_N10_20260106_124719
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.02)
  • t02_add_bash_hello ⚠️ REQUIRED: Shell script is not executable (Cost: $0.01)

litellm_proxy_moonshot_kimi_k2_thinking

  • Overall Success Rate: 88.9% (8/9)
  • Integration Tests (Required): 88.9% (8/10)
  • Total Cost: $0.43
  • Token Usage: prompt: 628,665, completion: 21,496, cache_read: 548,352
  • Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_8e12188_kimi_k2_run_N10_20260106_124720
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.09)

litellm_proxy_deepseek_deepseek_chat

  • Overall Success Rate: 88.9% (8/9)
  • Integration Tests (Required): 88.9% (8/10)
  • Total Cost: $0.06
  • Token Usage: prompt: 600,543, completion: 12,803, cache_read: 563,584
  • Run Suffix: litellm_proxy_deepseek_deepseek_chat_8e12188_deepseek_run_N10_20260106_124712
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.0059)

litellm_proxy_claude_sonnet_4_5_20250929

  • Overall Success Rate: 90.0% (9/10)
  • Integration Tests (Required): 90.0% (9/10)
  • Total Cost: $0.65
  • Token Usage: prompt: 539,005, completion: 13,135, cache_read: 453,159, cache_write: 84,927, reasoning: 3,951
  • Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_8e12188_sonnet_run_N10_20260106_124720

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.0057)

@enyst enyst added and removed the integration-test label (Runs the integration tests and comments the results) Jan 6, 2026
@github-actions
Contributor

github-actions bot commented Jan 6, 2026

Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

@github-actions
Contributor

github-actions bot commented Jan 6, 2026

🧪 Integration Tests Results

Overall Success Rate: 85.7%
Total Cost: $3.53
Models Tested: 6
Timestamp: 2026-01-06 13:40:32 UTC

📁 Detailed Logs & Artifacts

Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.

📊 Summary

Model | Overall | Integration (Required) | Behavior (Optional) | Passed | Skipped | Total | Cost | Tokens
litellm_proxy_vertex_ai_gemini_3_pro_preview | 90.0% | 90.0% | N/A | 9/10 | 0 | 10 | $0.58 | 310,459
litellm_proxy_claude_sonnet_4_5_20250929 | 90.0% | 90.0% | N/A | 9/10 | 0 | 10 | $0.68 | 594,933
litellm_proxy_deepseek_deepseek_chat | 88.9% | 88.9% | N/A | 8/9 | 1 | 10 | $0.07 | 709,920
litellm_proxy_gpt_5.1_codex_max | 88.9% | 88.9% | N/A | 8/9 | 1 | 10 | $1.50 | 6,134,922
litellm_proxy_moonshot_kimi_k2_thinking | 88.9% | 88.9% | N/A | 8/9 | 1 | 10 | $0.55 | 858,035
litellm_proxy_mistral_devstral_2512 | 66.7% | 66.7% | N/A | 6/9 | 1 | 10 | $0.14 | 327,177

📋 Detailed Results

litellm_proxy_vertex_ai_gemini_3_pro_preview

  • Overall Success Rate: 90.0% (9/10)
  • Integration Tests (Required): 90.0% (9/10)
  • Total Cost: $0.58
  • Token Usage: prompt: 287,015, completion: 23,444, cache_read: 150,647, reasoning: 18,043
  • Run Suffix: litellm_proxy_vertex_ai_gemini_3_pro_preview_a2aa1b9_gemini_3_pro_run_N10_20260106_131802

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.02)

litellm_proxy_claude_sonnet_4_5_20250929

  • Overall Success Rate: 90.0% (9/10)
  • Integration Tests (Required): 90.0% (9/10)
  • Total Cost: $0.68
  • Token Usage: prompt: 582,592, completion: 12,341, cache_read: 489,125, cache_write: 92,523, reasoning: 3,644
  • Run Suffix: litellm_proxy_claude_sonnet_4_5_20250929_a2aa1b9_sonnet_run_N10_20260106_131803

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.02)

litellm_proxy_deepseek_deepseek_chat

  • Overall Success Rate: 88.9% (8/9)
  • Integration Tests (Required): 88.9% (8/10)
  • Total Cost: $0.07
  • Token Usage: prompt: 694,574, completion: 15,346, cache_read: 655,296
  • Run Suffix: litellm_proxy_deepseek_deepseek_chat_a2aa1b9_deepseek_run_N10_20260106_131802
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.01)

litellm_proxy_gpt_5.1_codex_max

  • Overall Success Rate: 88.9% (8/9)
  • Integration Tests (Required): 88.9% (8/10)
  • Total Cost: $1.50
  • Token Usage: prompt: 6,091,116, completion: 43,806, cache_read: 5,821,824, reasoning: 34,176
  • Run Suffix: litellm_proxy_gpt_5.1_codex_max_a2aa1b9_gpt51_codex_run_N10_20260106_131801
  • Skipped Tests: 1

Skipped Tests:

  • t09_token_condenser: This test stresses long repetitive tool loops to trigger token-based condensation. GPT-5.1 Codex Max often declines such requests for efficiency/safety reasons.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $1.35)

litellm_proxy_moonshot_kimi_k2_thinking

  • Overall Success Rate: 88.9% (8/9)
  • Integration Tests (Required): 88.9% (8/10)
  • Total Cost: $0.55
  • Token Usage: prompt: 839,295, completion: 18,740, cache_read: 759,552
  • Run Suffix: litellm_proxy_moonshot_kimi_k2_thinking_a2aa1b9_kimi_k2_run_N10_20260106_131801
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.0028)

litellm_proxy_mistral_devstral_2512

  • Overall Success Rate: 66.7% (6/9)
  • Integration Tests (Required): 66.7% (6/10)
  • Total Cost: $0.14
  • Token Usage: prompt: 324,339, completion: 2,838
  • Run Suffix: litellm_proxy_mistral_devstral_2512_a2aa1b9_devstral_2512_run_N10_20260106_131801
  • Skipped Tests: 1

Skipped Tests:

  • t08_image_file_viewing: This test requires a vision-capable LLM model. Please use a model that supports image input.

Failed Tests:

  • t09_token_condenser ⚠️ REQUIRED: Condensation not triggered. Token counting may not work. (Cost: $0.0019)
  • t10_restore_conversation ⚠️ REQUIRED: RESTORE_LLM_CONFIG_2 is required for t10_restore_conversation (Cost: $0.0016)
  • t02_add_bash_hello ⚠️ REQUIRED: Shell script is not executable (Cost: $0.0084)

@enyst
Collaborator Author

enyst commented Jan 6, 2026

@OpenHands Understand this PR. Then split the integration test changes from it, in a new branch based on this branch, and make a PR to this PR.

We want to get those right, but without running it with the whole LLM matrix, and in the same time, we want this branch ready.

@openhands-ai

openhands-ai bot commented Jan 6, 2026

I'm on it! enyst can track my progress at all-hands.dev

@openhands-ai

openhands-ai bot commented Jan 6, 2026

Final summary (new since last summary)

✅ Verification

  • Working tree is clean on openhands/integration-tests-only (git status shows nothing to commit).
  • No local diff (git diff empty).
  • Branch is pushed and tracking origin/openhands/integration-tests-only.

✅ What was delivered (per your request)

  • Split the integration test changes out of openhands/remove-reconciliation-methods into a separate branch (openhands/integration-tests-only) so they can be reviewed/iterated independently of the larger refactor.
  • Opened a PR targeting the original PR branch (“PR to this PR”), as requested:

🔍 Scope of the split (integration-only files)

This split PR contains only:

  • .github/workflows/integration-runner.yml
  • tests/integration/run_infer.py
  • tests/integration/tests/t10_restore_conversation.py

No other refactor/reconciliation-removal code is included.


@enyst
Collaborator Author

enyst commented Jan 6, 2026

@enyst hmm, I would hope to have two real LLMs :), but I do agree with you that we might not need to run it for the whole model matrix we have.

Maybe, for the sake of getting this PR merged, we can do 3 for now? I'm hoping I can include this PR in the next minor release :)

@xingyaoww

  • I adjusted the cross tests to do more and also test another detail. I do think the code is ready for review.
  • I also tried running the separate "2 LLMs" option in the integration tests, but that didn't work; it needs more testing. I split it into its own branch, in case we want to merge this one.
  • To note, the original integration test with 2 LLMs worked well: it restored for real and continued the conversation with another event. I need to figure out how to run it correctly in GitHub Actions without the matrix.
  • I'm meanwhile trying to test manually with a CLI based on this branch. 🤔

Collaborator

@xingyaoww xingyaoww left a comment


Love it! Really appreciate the effort here @enyst - this will make it much easier to implement the model switching on the fly!

@xingyaoww
Collaborator

@OpenHands please create an issue in OpenHands/OpenHands-CLI: based on the changes in this PR, we should update how we handle agent_settings systematically in the CLI.

We should only keep the non-default fields in agent_settings for the CLI (instead of saving every field via .model_dump() and always overriding them), and keep everything else as defaults.
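
For reference, a toy example of the suggested approach (the AgentSettings model here is a stand-in, not the CLI's actual class):

from pydantic import BaseModel

class AgentSettings(BaseModel):  # toy stand-in for the CLI's agent_settings
    model: str = "default-model"
    temperature: float = 0.0

settings = AgentSettings(temperature=0.7)
print(settings.model_dump())                       # every field, defaults included
print(settings.model_dump(exclude_defaults=True))  # {'temperature': 0.7}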

@openhands-ai

openhands-ai bot commented Jan 6, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

@openhands-ai

openhands-ai bot commented Jan 6, 2026

I've successfully created an issue in the OpenHands/OpenHands-CLI repository based on the changes in PR #1542.

Issue Created: #302 - Update agent_settings handling to only persist non-default fields

The issue describes:

  • Current Behavior: CLI saves all fields from agent_settings using .model_dump() and always overrides them when restoring conversations
  • Proposed Change: Only keep non-default fields in agent_settings for CLI, using approaches like model_dump(exclude_defaults=True)
  • Benefits: Smaller config files, better forward compatibility, cleaner semantics, and alignment with the SDK's new approach
  • Implementation Notes: Merge saved non-default fields with current defaults when loading

The issue is linked to the SDK PR #1542 for context.


@enyst
Collaborator Author

enyst commented Jan 6, 2026

This is the CLI on this branch, restoring with another LLM, remembering the events, and it also deleted that file:
[screenshot of the CLI session]

@enyst enyst merged commit a918f39 into main Jan 6, 2026
21 checks passed
@enyst enyst deleted the openhands/remove-reconciliation-methods branch January 6, 2026 15:58
3 participants