Background
I ran into a sandbox memory issue while investigating bytedance/deer-flow#3213. DeerFlow uses the AIO sandbox image (enterprise-public-cn-beijing.cr.volces.com/vefaas-public/all-in-one-sandbox:latest) as its containerized sandbox runtime.
The current all-in-one model is very convenient, but in high-concurrency deployments the idle memory cost per sandbox becomes a bottleneck. In my local DeerFlow/kind test environment, the memory profile looked like this:
- Default AIO sandbox idle memory: about 769-895Mi
- With DISABLE_JUPYTER=true and DISABLE_CODE_SERVER=true: about 413Mi
- After additionally stopping the browser/VNC/browser MCP/Openbox stack: about 143-178Mi
The browser-related services are therefore the largest remaining idle-memory component after Jupyter and code-server are disabled.
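As a rough illustration of what these numbers imply, the sketch below derives the approximate idle cost of the browser stack and the number of idle sandboxes that fit in a fixed memory budget. The 64 GiB budget is a hypothetical figure chosen only for the example, and the calculation ignores node and Kubernetes overhead.

```python
# Idle-memory figures from the local measurements above, in MiB (low, high).
DEFAULT_IDLE = (769, 895)
NO_JUPYTER_CODE_SERVER = (413, 413)
NO_BROWSER_STACK = (143, 178)


def saved(before, after):
    """Return the (min, max) saving when going from `before` to `after`."""
    # Smallest saving pairs the lowest 'before' with the highest 'after'.
    return (before[0] - after[1], before[1] - after[0])


def sandboxes_per_budget(budget_mib, per_sandbox_mib):
    """How many idle sandboxes fit in the budget, ignoring all overhead."""
    return budget_mib // per_sandbox_mib


browser_stack_cost = saved(NO_JUPYTER_CODE_SERVER, NO_BROWSER_STACK)
print(browser_stack_cost)  # (235, 270)

BUDGET_MIB = 64 * 1024  # hypothetical 64 GiB reserved for idle sandboxes
print(sandboxes_per_budget(BUDGET_MIB, 413))  # 158 with the browser stack running
print(sandboxes_per_budget(BUDGET_MIB, 178))  # 368 with the browser stack stopped
```

The low and high measurements were not taken as matched pairs, so the 235-270Mi range is only an estimate.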
Local finding
Inside the running AIO sandbox container, I found these supervisor programs:
- browser
- mcp-server-browser
- tigervnc
- openbox
- websocat
Manually stopping these services significantly reduced idle memory. Manually starting them again worked in my local test:
supervisorctl start tigervnc openbox browser mcp-server-browser websocat
After startup:
- CDP at http://127.0.0.1:9222/json/version became ready in about 1 second.
- mcp-server-browser listened on port 8100.
- Memory increased back to the expected browser-enabled level.
This suggests that a lazy-start browser stack could reduce idle memory while preserving browser functionality for workloads that actually need it.
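The manual checks above could be automated roughly as follows. This is a minimal sketch using only the Python standard library. The helper names (cdp_ready, port_open, wait_until) and the timeout values are my own assumptions, not part of the AIO sandbox code. Only the CDP URL and port 8100 come from the observations above.

```python
import http.client
import socket
import time
import urllib.request


def cdp_ready(url="http://127.0.0.1:9222/json/version", timeout=1.0):
    """Return True if the CDP version endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False


def port_open(host="127.0.0.1", port=8100, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(check, deadline_s=10.0, interval_s=0.2):
    """Poll `check` until it returns True or `deadline_s` has elapsed."""
    end = time.monotonic() + deadline_s
    while True:
        if check():
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)
```

A caller could then use `wait_until(lambda: cdp_ready() and port_open())` after running the supervisorctl command above.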
Proposed improvement
I would like to contribute a PR to support opt-in lazy startup for the browser stack, if this is an acceptable direction.
A possible design:
- Add an opt-in environment variable, for example DISABLE_BROWSER_STACK=true or AUTOSTART_BROWSER_STACK=false.
- Keep the current default behavior unchanged. The browser stack should still start by default unless users explicitly opt in to lazy startup.
- When lazy startup is enabled, do not autostart these supervisor programs:
  - tigervnc
  - openbox
  - browser
  - mcp-server-browser
  - websocat
- Add an internal ensure_browser_stack_started() mechanism that starts only this fixed allowlist of services and waits until they are ready. Readiness checks could include:
  - CDP /json/version is reachable.
  - The browser MCP port is reachable.
  - The VNC/websocket proxy is running when VNC is enabled.
- Call this ensure step from browser-dependent entry points, such as browser screenshot/actions/page APIs and MCP browser tooling.
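To make the proposal more concrete, here is one possible shape for the ensure mechanism, written as a sketch rather than a final design. The BrowserStack class, the lazy_start_enabled helper, and the accepted flag values are assumptions for illustration. DISABLE_BROWSER_STACK is the proposed variable and does not exist in the image today. The command runner and readiness check are injectable so the logic can be exercised without a running supervisor.

```python
import os
import subprocess
import threading

# Fixed allowlist, in the same order as the manual supervisorctl command.
BROWSER_STACK = ("tigervnc", "openbox", "browser", "mcp-server-browser", "websocat")


def lazy_start_enabled(env=None):
    """Return True when the proposed opt-in flag is set (default: False)."""
    env = os.environ if env is None else env
    value = env.get("DISABLE_BROWSER_STACK", "false")
    return value.strip().lower() in ("1", "true", "yes")


class BrowserStack:
    """Starts the allowlisted supervisor programs at most once, on demand."""

    def __init__(self, run=subprocess.run, is_ready=lambda: True):
        self._run = run            # injectable for tests
        self._is_ready = is_ready  # e.g. CDP and MCP port checks
        self._lock = threading.Lock()
        self._started = False

    def ensure_started(self):
        # The lock keeps concurrent browser requests from starting the stack twice.
        with self._lock:
            if self._started:
                return
            # Argument list (no shell): only the fixed names can be passed.
            # The exit status is not relied on; readiness is checked separately.
            self._run(["supervisorctl", "start", *BROWSER_STACK])
            if not self._is_ready():
                raise RuntimeError("browser stack did not become ready")
            self._started = True
```

Browser-dependent entry points would call `ensure_started()` before touching CDP. On the supervisor side, lazy mode could map to `autostart=false` on these five programs, although that depends on how the image generates its supervisor configuration.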
Why this helps DeerFlow
For DeerFlow users, this would allow high-concurrency deployments to keep many idle sandboxes around with much lower memory usage, while still allowing browser workloads to start the browser stack on demand.
After this is supported in AIO sandbox, DeerFlow can add a small follow-up PR to pass through a SANDBOX_DISABLE_BROWSER_STACK-style setting from its provisioner/compose config.
Questions
- Is lazy-starting the browser stack an acceptable direction for this project?
- Is the image-side source for the supervisor/browser service configuration available for external contributions?
- If yes, which files or branch should I base a PR on?
- Would maintainers prefer one combined DISABLE_BROWSER_STACK switch, or separate switches for browser, browser MCP, and VNC?