feat: support custom volumes mounting for DockerWorkspace #1618
Conversation
Pull request overview
This PR replaces the single `mount_dir` field with a more flexible `volumes` list field in `DockerWorkspace`, enabling users to mount multiple volumes with custom configurations. Additionally, it sets a default value for `server_image`.
Key changes:
- Replaced `mount_dir: str | None` with `volumes: list[str]` to support multiple volume mounts
- Changed the `server_image` default from `None` to `"ghcr.io/openhands/agent-server:latest-python"`
- Updated the volume mounting logic to iterate through the volumes list
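For context, here is a rough sketch of the field change described above (not the actual class definition; the defaults shown and the `host:container[:mode]` volume string format are assumptions based on this summary):

```python
from pydantic import BaseModel, Field


class DockerWorkspaceBefore(BaseModel):
    # Old shape: a single host dir, always mounted at /workspace in the container.
    mount_dir: str | None = None
    server_image: str | None = None


class DockerWorkspaceAfter(BaseModel):
    # New shape: any number of Docker-style volume specs, e.g.
    # "/home/me/repo:/projects/repo:rw" (string format assumed).
    volumes: list[str] = Field(default_factory=list)
    server_image: str = "ghcr.io/openhands/agent-server:latest-python"
```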
enyst
left a comment
Thank you for the contribution! I believe we did have support for multiple volumes in OpenHands V0, so it seems we didn't port it to V1.
I'll start CI. It may also be worth noting that we have a deprecation mechanism in the SDK that we could use here.
Thanks @enyst! Actually I have a question about the testing part of OpenHands. Running `uv run pytest tests/sdk` gives:
====================================================== FAILURES =======================================================
_______________________________________________ test_fifo_lock_fairness _______________________________________________
tests/sdk/conversation/test_fifo_lock.py:142: in test_fifo_lock_fairness
assert actual_order == expected_order, (
E AssertionError: Expected FIFO order [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], got [0, 2, 1, 4, 3, 5, 6, 7, 8, 9]
E assert [0, 2, 1, 4, 3, 5, ...] == [0, 1, 2, 3, 4, 5, ...]
E
E At index 1 diff: 2 != 1
E
E Full diff:
E [
E 0,
E + 2,...
E
E ...Full output truncated (13 lines hidden), use '-vv' to show
_______________________________________ test_tool_serialization_deserialization _______________________________________
tests/sdk/tool/test_tool_serialization.py:22: in test_tool_serialization_deserialization
deserialized_tool = ToolDefinition.model_validate_json(tool_json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E pydantic_core._pydantic_core.ValidationError: 1 validation error for ToolDefinition
E observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='FinishObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
_______________________________ test_tool_supports_polymorphic_field_json_serialization _______________________________
tests/sdk/tool/test_tool_serialization.py:44: in test_tool_supports_polymorphic_field_json_serialization
deserialized_container = Container.model_validate_json(container_json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E pydantic_core._pydantic_core.ValidationError: 1 validation error for Container
E tool.observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='FinishObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
______________________________ test_tool_supports_nested_polymorphic_json_serialization _______________________________
tests/sdk/tool/test_tool_serialization.py:68: in test_tool_supports_nested_polymorphic_json_serialization
deserialized_container = NestedContainer.model_validate_json(container_json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E pydantic_core._pydantic_core.ValidationError: 2 validation errors for NestedContainer
E tools.0.observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='FinishObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
E tools.1.observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='ThinkObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
_________________________________________ test_tool_model_validate_json_dict __________________________________________
tests/sdk/tool/test_tool_serialization.py:89: in test_tool_model_validate_json_dict
deserialized_tool = ToolDefinition.model_validate(tool_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E pydantic_core._pydantic_core.ValidationError: 1 validation error for ToolDefinition
E observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='FinishObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
________________________________________ test_tool_type_annotation_works_json _________________________________________
tests/sdk/tool/test_tool_serialization.py:130: in test_tool_type_annotation_works_json
deserialized_model = TestModel.model_validate_json(model_json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E pydantic_core._pydantic_core.ValidationError: 1 validation error for TestModel
E tool.observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='FinishObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
______________________________________________ test_tool_kind_field_json ______________________________________________
tests/sdk/tool/test_tool_serialization.py:152: in test_tool_kind_field_json
deserialized_tool = ToolDefinition.model_validate_json(tool_json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E pydantic_core._pydantic_core.ValidationError: 1 validation error for ToolDefinition
E observation_type
E Value error, Local classes not supported! tests.sdk.tool.test_tool_definition.StrictObservation / openhands.sdk.tool.schema.Observation (Since they may not exist at deserialization time) [type=value_error, input_value='FinishObservation', input_type=str]
E For further information visit https://errors.pydantic.dev/2.12/v/value_error
================================================== warnings summary ===================================================
tests/sdk/agent/test_non_executable_action_emission.py::test_emits_action_event_with_none_action_then_error_on_missing_tool
tests/sdk/agent/test_nonexistent_tool_handling.py::test_nonexistent_tool_returns_error_and_continues_conversation
tests/sdk/agent/test_nonexistent_tool_handling.py::test_nonexistent_tool_error_includes_available_tools
tests/sdk/agent/test_nonexistent_tool_handling.py::test_conversation_continues_after_tool_error
tests/sdk/agent/test_nonexistent_tool_handling.py::test_conversation_continues_after_tool_error
tests/sdk/agent/test_security_policy_integration.py::test_security_risk_param_ignored_when_no_analyzer
tests/sdk/agent/test_tool_execution_error_handling.py::test_tool_execution_valueerror_returns_error_event
tests/sdk/agent/test_tool_execution_error_handling.py::test_conversation_continues_after_tool_execution_error
tests/sdk/agent/test_tool_execution_error_handling.py::test_conversation_continues_after_tool_execution_error
/home/colin/code/oh-software-agent-sdk/openhands-sdk/openhands/sdk/llm/utils/telemetry.py:244: UserWarning: Cost calculation failed: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=test-model
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
warnings.warn(f"Cost calculation failed: {e}")
tests/sdk/llm/test_llm.py::test_unmapped_model_with_logging_enabled
/home/colin/code/oh-software-agent-sdk/openhands-sdk/openhands/sdk/llm/utils/telemetry.py:244: UserWarning: Cost calculation failed: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=UnmappedTestModel
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
warnings.warn(f"Cost calculation failed: {e}")
tests/sdk/llm/test_llm_litellm_extra_body.py::test_responses_forwards_extra_body_for_all_models
/home/colin/code/oh-software-agent-sdk/openhands-sdk/openhands/sdk/llm/utils/telemetry.py:244: UserWarning: Cost calculation failed: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=llama-3
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
warnings.warn(f"Cost calculation failed: {e}")
tests/sdk/llm/test_telemetry_policy.py::test_responses_forwards_extra_body_for_all_models
/home/colin/code/oh-software-agent-sdk/openhands-sdk/openhands/sdk/llm/utils/telemetry.py:244: UserWarning: Cost calculation failed: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=llama-3.3-70b
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
warnings.warn(f"Cost calculation failed: {e}")
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=============================================== short test summary info ===============================================
FAILED tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_fairness - AssertionError: Expected FIFO order [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], got [0, 2, 1, 4, 3, 5, 6, 7, 8, 9]
FAILED tests/sdk/tool/test_tool_serialization.py::test_tool_serialization_deserialization - pydantic_core._pydantic_core.ValidationError: 1 validation error for ToolDefinition
FAILED tests/sdk/tool/test_tool_serialization.py::test_tool_supports_polymorphic_field_json_serialization - pydantic_core._pydantic_core.ValidationError: 1 validation error for Container
FAILED tests/sdk/tool/test_tool_serialization.py::test_tool_supports_nested_polymorphic_json_serialization - pydantic_core._pydantic_core.ValidationError: 2 validation errors for NestedContainer
FAILED tests/sdk/tool/test_tool_serialization.py::test_tool_model_validate_json_dict - pydantic_core._pydantic_core.ValidationError: 1 validation error for ToolDefinition
FAILED tests/sdk/tool/test_tool_serialization.py::test_tool_type_annotation_works_json - pydantic_core._pydantic_core.ValidationError: 1 validation error for TestModel
FAILED tests/sdk/tool/test_tool_serialization.py::test_tool_kind_field_json - pydantic_core._pydantic_core.ValidationError: 1 validation error for ToolDefinition
==================================== 7 failed, 1616 passed, 12 warnings in 53.08s =====================================

But if I run the first failed test file above on its own, it passes: `uv run pytest tests/sdk/conversation/test_fifo_lock.py`
==================================== test session starts ====================================
platform linux -- Python 3.12.3, pytest-8.4.2, pluggy-1.6.0 -- /home/colin/code/oh-software-agent-sdk/.venv/bin/python
cachedir: .pytest_cache
rootdir: /home/colin/code/oh-software-agent-sdk
configfile: pyproject.toml
plugins: asyncio-1.2.0, forked-1.6.0, timeout-2.4.0, libtmux-0.53.0, anyio-4.11.0, xdist-3.8.0, cov-7.0.0
asyncio: mode=Mode.AUTO, debug=False, asyncio_default_fixture_loop_scope=None, asyncio_default_test_loop_scope=function
collected 7 items
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_basic_functionality PASSED [ 14%]
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_context_manager PASSED [ 28%]
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_non_blocking PASSED [ 42%]
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_timeout PASSED [ 57%]
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_fairness PASSED [ 71%]
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_error_handling PASSED [ 85%]
tests/sdk/conversation/test_fifo_lock.py::test_fifo_lock_stress_test PASSED [100%]
===================================== 7 passed in 0.17s =====================================

I wonder if there is anything wrong with my setup? I see the CI tests pass, so I'm a bit confused here.
FWIW I see in your log
... This is surprising, but FWIW we just fixed something like that, so maybe it makes sense to update your main branch (though I don't fully understand it, since it wasn't broken on main 🤔). The FIFO test failure smells like it could be a bug in the tests that makes them depend on execution order, ugh. I wouldn't worry about your setup TBH; maybe just review the README instructions to make sure you're only using
Still facing a similar issue on the main branch. Created a new issue here: #1629.
if self.__class__ is DockerWorkspace and self.mount_dir:
    warn_deprecated(
        "DockerWorkspace.mount_dir",
        deprecated_in="1.9.0",
@xingyaoww Hi, thanks for your feedback!
- Is this what you expect? I am actually a bit confused about how to use the deprecation mechanism for a class attribute, as I did not see many other usages in the repo.
- Which version number should we use for `deprecated_in`?
Let's deprecate it in 1.10.0, since it is already 1.8.1 now.
Shall we just do `if self.mount_dir:`?
> Shall we just do `if self.mount_dir:`?

I am just copying line 129 above from the previous model validator, without knowing why `self.__class__ is DockerWorkspace` is needed. Do you want to change both?
@OpenHands do you think this is an ideal thing to do?
I'm on it! xingyaoww can track my progress at all-hands.dev
Summary
I addressed @xingyaoww's question about whether using `self.__class__ is DockerWorkspace` in `_validate_mount_dir` is ideal.
Answer: No, it's not ideal.
The key difference between the two validators:
- `_validate_server_image`: the `self.__class__ is DockerWorkspace` check is appropriate because subclasses like `DockerDevWorkspace` may have different image resolution logic (e.g., using `base_image` instead of `server_image`).
- `_validate_mount_dir`: the class check is problematic because:
  - The deprecation warning should apply to all users of `mount_dir`, including subclasses.
  - The conversion from `mount_dir` to `volumes` must happen for all classes; otherwise the mount would silently fail for subclasses like `DockerDevWorkspace`.
Changes Made
I updated `openhands-workspace/openhands/workspace/docker/workspace.py`:
- Simplified the condition from `if self.__class__ is DockerWorkspace and self.mount_dir:` to `if self.mount_dir:`
- Updated the `deprecated_in` version from `"1.9.0"` to `"1.10.0"` per @xingyaoww's earlier feedback
The changes have been pushed to the PR branch at #1618
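For readers following along, here is a minimal, self-contained sketch of the backward-compatibility path discussed above. It is not the actual SDK code: the `warn_deprecated` stand-in, the field names, and the `host:container[:mode]` volume string format are assumptions for illustration.

```python
from pydantic import BaseModel, Field, model_validator


def warn_deprecated(name: str, deprecated_in: str) -> None:
    """Stand-in for the SDK's deprecation helper (hypothetical signature)."""
    import warnings

    warnings.warn(
        f"{name} is deprecated since {deprecated_in}; use `volumes` instead.",
        DeprecationWarning,
        stacklevel=2,
    )


class DockerWorkspaceSketch(BaseModel):
    mount_dir: str | None = None                      # deprecated
    volumes: list[str] = Field(default_factory=list)  # e.g. "/host/path:/container/path:rw"

    @model_validator(mode="after")
    def _validate_mount_dir(self) -> "DockerWorkspaceSketch":
        # No `self.__class__ is DockerWorkspace` guard: the deprecation warning and
        # the conversion should apply to subclasses (e.g. DockerDevWorkspace) too.
        if self.mount_dir:
            warn_deprecated("DockerWorkspace.mount_dir", deprecated_in="1.10.0")
            # Preserve the old behavior: mount the deprecated dir at /workspace.
            self.volumes = [*self.volumes, f"{self.mount_dir}:/workspace:rw"]
        return self
```

With this sketch, constructing `DockerWorkspaceSketch(mount_dir="/home/me/repo")` would emit the deprecation warning and populate `volumes` with `"/home/me/repo:/workspace:rw"`.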
- Remove unnecessary `self.__class__ is DockerWorkspace` check from the `_validate_mount_dir` validator. Unlike `_validate_server_image`, the `mount_dir` deprecation and conversion to `volumes` should apply to all subclasses (e.g., `DockerDevWorkspace`) to ensure mounts work correctly.
- Update the `deprecated_in` version from 1.9.0 to 1.10.0 per review feedback.

Co-authored-by: openhands <[email protected]>
xingyaoww
left a comment
Thanks!
Summary
… `mount_dir`, but this leads to the following issues:
- … `/workspace` in the container. However, OpenHands writes to this directory, such as saving conversations to `/workspace/conversations`.
- … `/workspace/conversations` in the container.
- … `/workspace`, bypassing the permission issues mentioned above.
- … `/workspace`, when we instruct the agent to work on it, it will not read OpenHands files like the ones in `/workspace/conversations`, avoiding interference. For example, I observed that if we mount dirs in the old way, i.e., putting codebases in `/workspace`, then when the agent performs searching using `grep`, it will see the conversation history stored in `/workspace`, because `/workspace` is the working dir, which could be very messy and confusing for the models.

Welcome comments before I move on to make this PR more complete, such as implementing test cases!
Example Code:
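A rough, hypothetical illustration of the setup described above (the import path and constructor arguments are assumptions, and volume specs are assumed to be Docker-style `host:container[:mode]` strings):

```python
from openhands.workspace import DockerWorkspace  # import path assumed

workspace = DockerWorkspace(
    server_image="ghcr.io/openhands/agent-server:latest-python",
    volumes=[
        # Mount the repo somewhere other than /workspace so the agent's working
        # dir is not mixed with OpenHands files such as /workspace/conversations.
        "/home/me/myrepo:/projects/myrepo:rw",
    ],
)
# The agent can then be told to work in /projects/myrepo, so tools like `grep`
# never pick up conversation history stored under /workspace.
```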