feat(backend): sandbox snapshots & volumes for nsjail#7992
rubenfiszel wants to merge 23 commits into `main`.
Conversation
Add support for custom rootfs snapshots and persistent volumes in nsjail
sandboxes, enabling users to run scripts with pre-built environments and
shared persistent storage.
- DB migration: sandbox_snapshot + sandbox_volume tables
- API: CRUD endpoints under /api/w/{ws}/sandbox/
- Common: annotation parser for # sandbox: and # volume: directives
- Worker: S3-backed snapshot cache, overlayfs mount, volume sync
- Executors: nsjail config post-processing to swap rootfs when overlay active
- Tests: 28 tests covering parsing, config generation, and end-to-end flows
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude finished @rubenfiszel's task in 3m 13s — Review: Sandbox Snapshots & Volumes for nsjail
General Observations

- Architecture is clean and well-structured. The separation of concerns is good (e.g. annotation parsing lives in `windmill-common`).
- Test coverage is solid: 14 parser tests + 14 worker tests, including 4 end-to-end tests against real proto templates, is thorough. The tests cover edge cases well (mixed comment styles, nested colons, no-annotations passthrough).
- Executor changes are consistent and mechanical: all 12 nsjail-using executors are updated identically, wrapping the config string through `finalize_nsjail_config()`.

Issues Found (10 inline comments posted)

High Priority / Medium Priority / Low Priority / Suggestions — see the inline comments below.

What Looks Good
Test Instructions

This feature is entirely backend/API-driven with no frontend UI. To verify:
```rust
use flate2::read::GzDecoder;
use std::io::Cursor;
use tar::Archive;
```
Ensure that tar archive extraction (via Archive::unpack) safely handles path traversal. Validate that entries do not escape the intended extraction directory.
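One std-only way to enforce this is to resolve each entry path against the extraction root before unpacking, rejecting absolute paths and `..` components. The helper below is an illustrative sketch (`safe_join` is not a function from this PR):

```rust
use std::path::{Component, Path, PathBuf};

// Hypothetical helper: join an archive entry path onto the extraction root,
// rejecting anything that could escape it. Absolute entries and entries
// containing `..` (e.g. "../../etc/passwd") return None.
fn safe_join(base: &Path, entry: &Path) -> Option<PathBuf> {
    if entry.is_absolute() {
        return None;
    }
    let mut out = base.to_path_buf();
    for comp in entry.components() {
        match comp {
            Component::Normal(c) => out.push(c),
            Component::CurDir => {} // "./" is harmless, skip it
            // `..`, root, and prefix components are rejected outright.
            _ => return None,
        }
    }
    Some(out)
}

fn main() {
    let base = Path::new("/tmp/extract");
    assert!(safe_join(base, Path::new("rootfs/bin/sh")).is_some());
    assert!(safe_join(base, Path::new("../../etc/passwd")).is_none());
    assert!(safe_join(base, Path::new("/etc/passwd")).is_none());
}
```

Newer versions of the `tar` crate perform a similar check inside `Archive::unpack`, but an explicit guard documents the invariant and covers manual entry iteration.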
```rust
    work.display()
);

let output = Command::new("mount")
```
Double-check that inputs used to build the mount options (e.g. snapshot_path) are properly sanitized to prevent injection vulnerabilities in the mount command.
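Because overlayfs mount options are a comma-separated list (`lowerdir=A:B,upperdir=C,...`), a path containing `,` or `:` would be parsed as extra options or extra layers. A minimal guard, sketched here with an assumed `overlay_opts` helper (not the PR's actual code), rejects such paths before the option string is built:

```rust
// Illustrative sketch: reject paths that would break out of their slot in
// the overlayfs option string. ',' separates options and ':' separates
// lowerdir layers, so neither may appear inside a single path.
fn overlay_opts(lower: &str, upper: &str, work: &str) -> Result<String, String> {
    for p in [lower, upper, work] {
        if p.contains(|c| c == ',' || c == ':') {
            return Err(format!("unsafe character in overlay path: {p}"));
        }
    }
    Ok(format!("lowerdir={lower},upperdir={upper},workdir={work}"))
}

fn main() {
    assert!(overlay_opts("/cache/snap", "/tmp/up", "/tmp/work").is_ok());
    // An injected ',' would smuggle in an extra upperdir option:
    assert!(overlay_opts("/cache/snap,upperdir=/etc", "/tmp/up", "/tmp/work").is_err());
}
```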
backend/windmill-worker/Cargo.toml (outdated)
```toml
tar.workspace = true
flate2.workspace = true
tempfile.workspace = true
object_store = { workspace = true, optional = true}
```
Typographical consistency: the `object_store` dependency is missing a space before the closing brace (`{ workspace = true, optional = true}`), unlike the other dependency definitions. Consider updating it to `{ workspace = true, optional = true }` for consistency.
Deploying windmill with

| | |
| --- | --- |
| Latest commit: | 9a06b26 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://251e8042.windmill.pages.dev |
| Branch Preview URL: | https://feat-sandbox-snapshots-volum.windmill.pages.dev |
```rust
) -> error::Result<String> {
    require_admin(authed.is_admin, &authed.username)?;
    let s3_key = format!(
        "sandbox/snapshots/{}/{}/{}.tar.gz",
        w_id, body.name, body.tag
```
Security: No input validation on name, tag, or docker_image
The name and tag fields are used directly in S3 key construction (format!("sandbox/snapshots/{}/{}/{}.tar.gz", w_id, body.name, body.tag)). A malicious admin could inject path traversal characters (e.g., ../../../) to write to arbitrary S3 paths, or use special characters that cause issues with the filesystem cache path construction in ensure_snapshot_cached.
Similarly, docker_image is passed directly to crane export in build_snapshot — while this is admin-only, it would be defensive to validate the format.
Consider adding validation:
```rust
fn validate_resource_name(name: &str) -> error::Result<()> {
    if name.is_empty()
        || name.len() > 128
        || !name
            .chars()
            .all(|c| c.is_alphanumeric() || c == '-' || c == '_' || c == '.')
    {
        return Err(Error::BadRequest(
            "Invalid name: must be 1-128 chars, alphanumeric/dash/underscore/dot only".into(),
        ));
    }
    Ok(())
}
```

```rust
let cache_key = if row.content_hash.is_empty() {
    format!("{w_id}_{name}_{tag}")
} else {
    row.content_hash.clone()
};
let cache_path = PathBuf::from(SNAPSHOT_CACHE_DIR).join(&cache_key);
```
```rust
if cache_path.exists() {
    tracing::info!("Snapshot {name}:{tag} found in cache at {}", cache_path.display());
    return Ok(cache_path);
```
Bug: TOCTOU race on cache directory
There's a race condition here: between checking cache_path.exists() (line 163) and the download/unpack, another worker could be downloading the same snapshot concurrently. This could result in a partially-unpacked snapshot being served to one worker while another is still writing.
Consider using a lockfile or atomic rename pattern (download to a temp dir, then rename into the cache path).
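The atomic-rename pattern can be sketched as follows. Names here (`install_into_cache`, the directory layout) are illustrative, not the PR's actual API; the key property is that `rename(2)` is atomic on a single filesystem, so other workers either see a complete snapshot directory or none at all:

```rust
use std::fs;
use std::path::Path;

// Sketch: unpack into a worker-private staging directory on the same
// filesystem as the cache, then rename it into place. If another worker
// won the race and the cache entry already exists, discard our staged
// copy and use theirs.
fn install_into_cache(cache_path: &Path, staged: &Path) -> std::io::Result<()> {
    match fs::rename(staged, cache_path) {
        Ok(()) => Ok(()),
        Err(_) if cache_path.exists() => fs::remove_dir_all(staged),
        Err(e) => Err(e),
    }
}

fn main() -> std::io::Result<()> {
    let base = std::env::temp_dir().join("snapshot_cache_demo");
    let _ = fs::remove_dir_all(&base);
    fs::create_dir_all(base.join("staged"))?;
    fs::write(base.join("staged/rootfs.txt"), b"ok")?;
    install_into_cache(&base.join("final"), &base.join("staged"))?;
    assert!(base.join("final/rootfs.txt").exists());
    Ok(())
}
```

A lockfile adds mutual exclusion on the download itself (avoiding duplicate work), but the rename alone already prevents serving a half-unpacked snapshot.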
```rust
    "Downloading snapshot {name}:{tag} from S3 key: {}",
    row.s3_key
);
let bytes = attempt_fetch_bytes(os, &row.s3_key).await?;
```
Performance: Entire snapshot loaded into memory
attempt_fetch_bytes loads the entire tar.gz into memory before unpacking. For large Docker images (e.g., Python ML images can be 1-5+ GB compressed), this could cause OOM on workers with limited memory.
Consider streaming the download directly to disk or using a streaming decompressor. This may not be critical for v1 but worth noting for large snapshot use cases.
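The bounded-memory shape of the streaming approach can be sketched with a fixed-size copy loop (essentially what `std::io::copy` does internally). With a streaming gzip decoder such as flate2's `GzDecoder` wrapped around the download reader, peak memory stays at the buffer size regardless of snapshot size. `copy_bounded` is an illustrative stand-in, not the PR's code:

```rust
use std::io::{self, Read, Write};

// Stream from any reader to any writer through a fixed 64 KiB buffer,
// so memory use is independent of the total payload size.
fn copy_bounded<R: Read, W: Write>(mut src: R, mut dst: W) -> io::Result<u64> {
    let mut buf = [0u8; 64 * 1024];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
}

fn main() {
    // A payload much larger than the buffer still only uses 64 KiB at a time.
    let data = vec![7u8; 200_000];
    let mut out = Vec::new();
    let n = copy_bounded(&data[..], &mut out).unwrap();
    assert_eq!(n, 200_000);
    assert_eq!(out, data);
}
```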
```rust
for (_name, (local_dir, mount_path)) in &setup.volume_mounts {
    mounts.push_str(&format!(
        "\nmount {{\n src: \"{}\"\n dst: \"{}\"\n is_bind: true\n rw: true\n}}\n",
```
Security: Volume mount path not validated
The mount_path originates from user-provided script annotations (e.g., # volume: data:/workspace/data). While nsjail itself provides sandboxing, the mount_path string is injected directly into the nsjail config without validation. A carefully crafted path could potentially escape the intended mount namespace or overwrite critical nsjail mount points.
Consider validating that `mount_path`:
- Is an absolute path
- Does not contain `..` components
- Does not target sensitive directories like `/proc`, `/sys`, `/dev`
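The three checks above fit in one small validator. This is an illustrative sketch (the function name and deny list are assumptions, not the PR's code):

```rust
use std::path::{Component, Path};

// Validate a volume mount destination taken from a `# volume: name:/path`
// annotation: absolute, no `..`, and not inside a reserved directory.
fn validate_mount_path(path: &str) -> Result<(), String> {
    let p = Path::new(path);
    if !p.is_absolute() {
        return Err(format!("mount path must be absolute: {path}"));
    }
    if p.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(format!("mount path must not contain '..': {path}"));
    }
    // Keep user volumes away from mount points nsjail itself manages.
    const DENIED: &[&str] = &["/proc", "/sys", "/dev", "/"];
    for d in DENIED {
        if p == Path::new(d) || (p.starts_with(d) && *d != "/") {
            return Err(format!("mount path targets reserved directory: {path}"));
        }
    }
    Ok(())
}

fn main() {
    assert!(validate_mount_path("/workspace/data").is_ok());
    assert!(validate_mount_path("data").is_err());
    assert!(validate_mount_path("/workspace/../etc").is_err());
    assert!(validate_mount_path("/proc/self").is_err());
}
```

Note that `Path::starts_with` compares whole components, so `/devices` is not rejected by the `/dev` rule.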
```rust
#[cfg(feature = "parquet")]
const SNAPSHOT_CACHE_DIR: &str = "/tmp/windmill/snapshots";
```
Potential issue: Hardcoded cache path
/tmp/windmill/snapshots is hardcoded but could conflict with the existing TMP_DIR constant used elsewhere in the worker. Also, there's no cache eviction strategy — over time this directory will grow unboundedly as new snapshot versions are built.
Consider:
- Using a path derived from existing config constants
- Adding a cache size limit or LRU eviction
```rust
if let Some(spec) = content.strip_prefix("sandbox:").map(|s| s.trim()) {
    let (name, tag) = spec
        .split_once(':')
        .map(|(n, t)| (n.to_string(), t.to_string()))
        .unwrap_or_else(|| (spec.to_string(), "latest".to_string()));
    config.snapshot = Some(SnapshotRef { name, tag });
} else if let Some(spec) = content.strip_prefix("volume:").map(|s| s.trim()) {
    if let Some((name, path)) = spec.split_once(':') {
```
Bug: Ambiguous parsing for # sandbox: with colons in tag names
The parser uses split_once(':') on the spec after sandbox:. This means a sandbox name like registry.example.com/image:v1 would be split incorrectly — name = "registry.example.com/image" and tag = "v1", which could be confusing depending on intent.
More importantly, the volume parsing at line 82 uses split_once(':') too, so # volume: name:/path works correctly. But the sandbox parsing treats sandbox: name:tag where the first colon separates name from tag — this means sandbox names cannot contain colons. Worth documenting this constraint explicitly.
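One alternative worth weighing: splitting on the *last* colon with `rsplit_once` would let snapshot names contain colons while keeping `name:tag` unambiguous for simple cases. This is a sketch of the alternative behavior, not what the PR implements, and it still mis-handles a registry-with-port name that has no tag (Docker resolves this with the "last colon after the last slash" rule):

```rust
// Alternative parse: treat everything before the LAST colon as the name.
fn parse_snapshot_ref(spec: &str) -> (String, String) {
    spec.rsplit_once(':')
        .map(|(n, t)| (n.to_string(), t.to_string()))
        .unwrap_or_else(|| (spec.to_string(), "latest".to_string()))
}

fn main() {
    // split_once would yield ("localhost", "5000/image:v1") here;
    // rsplit_once keeps the registry port inside the name.
    assert_eq!(
        parse_snapshot_ref("localhost:5000/image:v1"),
        ("localhost:5000/image".to_string(), "v1".to_string())
    );
    // No colon at all: default tag applies, same as the PR's parser.
    assert_eq!(
        parse_snapshot_ref("python-env"),
        ("python-env".to_string(), "latest".to_string())
    );
}
```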
```rust
for (vol_name, mount_path) in &sandbox_config.volumes {
```
Note: std::fs::create_dir_all (sync) used in async context
This uses the synchronous std::fs::create_dir_all instead of tokio::fs::create_dir_all. While it's a minor filesystem op that likely won't block the runtime noticeably, the rest of the codebase in sandbox_setup.rs consistently uses tokio::fs::create_dir_all. Would be cleaner to use the async version here too for consistency.
```rust
if let Some(ref setup) = sandbox_setup {
    if let Some(db) = conn.as_sql() {
        for (vol_name, (local_dir, _)) in &setup.volume_mounts {
            if let Err(e) = crate::sandbox_setup::upload_volume(
                &job.workspace_id,
                vol_name,
                local_dir,
                db,
            )
            .await
            {
                tracing::error!("Failed to upload volume {vol_name}: {e}");
            }
        }
    }
    if let Some(ref overlay) = setup.overlay {
        crate::sandbox_setup::unmount_overlay(overlay).await.ok();
```
Reliability: Volume upload errors are logged but don't fail the job
If volume upload fails, the error is logged with tracing::error! but the job still succeeds. This is a reasonable design choice for resilience, but could lead to silent data loss if users depend on volume persistence. Consider whether this should be configurable or at minimum more prominently surfaced (e.g., appended to job logs visible to the user).
```sql
docker_image TEXT NOT NULL,
setup_script TEXT,
size_bytes BIGINT,
status VARCHAR(20) NOT NULL DEFAULT 'pending',
```
Suggestion: Use an ENUM or CHECK constraint for status
The status column is VARCHAR(20) with no constraint on allowed values. The code uses 'pending', 'building', 'ready', 'failed' — consider using a CHECK constraint to enforce valid states:
```diff
- status VARCHAR(20) NOT NULL DEFAULT 'pending',
+ status VARCHAR(20) NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'building', 'ready', 'failed')),
```
```rust
let s3_key = match row {
    Some(r) => r.s3_key,
    None => {
        let s3_key = format!("sandbox/volumes/{w_id}/{name}.tar.gz");
        sqlx::query!(
            "INSERT INTO sandbox_volume (workspace_id, name, s3_key, created_by) \
             VALUES ($1, $2, $3, 'system') ON CONFLICT DO NOTHING",
            w_id,
            name,
            &s3_key,
        )
        .execute(db)
        .await?;
        tracing::info!("Auto-created volume {name} in workspace {w_id}");
        return Ok(());
```
Design: Auto-creating volumes from worker is surprising
When a script references # volume: foo:/path but the volume doesn't exist in the DB, the worker silently auto-creates it with created_by = 'system'. This bypasses the admin-only restriction on the API layer. A non-admin user could effectively create volume records by adding annotations to their scripts.
Consider requiring that volumes be pre-created via the admin API, and returning an error here instead of auto-creating.
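The suggested fail-closed behavior is a small change in shape. The sketch below uses simplified stand-in types (a plain `Option<String>` for the DB row) rather than the PR's actual `sqlx` query, to show only the control flow:

```rust
// Sketch: when the annotation references a volume with no DB row, return
// an explicit, user-visible error instead of inserting a 'system'-owned
// record, so volume creation remains admin-only.
fn resolve_volume(row: Option<String>, w_id: &str, name: &str) -> Result<String, String> {
    match row {
        Some(s3_key) => Ok(s3_key),
        None => Err(format!(
            "volume '{name}' does not exist in workspace '{w_id}'; \
             ask an admin to create it via the sandbox API"
        )),
    }
}

fn main() {
    assert!(resolve_volume(Some("sandbox/volumes/ws/data.tar.gz".into()), "ws", "data").is_ok());
    assert!(resolve_volume(None, "ws", "data").is_err());
}
```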
…tegration tests

Refactor S3-dependent functions into testable core helpers that accept Arc<dyn ObjectStore>, enabling real integration tests with InMemory object store instead of requiring a live S3/MinIO instance.

- Extract: tar_gz, untar_gz, fetch_bytes_from_store, put_bytes_to_store, upload_volume_to_store, download_volume_from_store
- Add 6 S3 integration tests using object_store::memory::InMemory: volume upload/download roundtrip, overwrite semantics, snapshot tar lifecycle, missing key handling, put/fetch bytes roundtrip
- Rename misleading "e2e" tests to "config_pipeline" tests
- Total: 36 tests (14 parser + 22 worker)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ordering
Add 6 nsjail integration tests that verify actual sandbox execution:
- basic bash execution
- volume read (bind-mount data into sandbox)
- volume write-back (sandbox writes to bind-mounted volume)
- overlay root (snapshot rootfs as sandbox root)
- overlay root + volume mount combined
- result.json output from sandbox
Fix a bug in finalize_nsjail_config where the overlay root mount
(dst: "/") was placed last in the config (via {SHARED_MOUNT}). nsjail
applies mounts in order, so a root bind mount after tmpfs/bind mounts
would overwrite them. The root mount is now moved to be the first mount
block, ensuring subsequent mounts layer correctly on top.
Also add 2 #[ignore] tests for mount_overlay (needs root) and
build_snapshot (needs crane CLI).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace placeholder #[ignore] tests with real integration tests:

- overlay_integration: test overlayfs read/write semantics via unshare (no root required), and overlayfs rootfs as nsjail sandbox root
- crane_integration: test crane export of alpine image → tar/gz roundtrip, and running the extracted rootfs in nsjail

Also fix tar_gz to preserve symlinks (follow_symlinks=false) instead of following them to absolute paths that may not exist on the host.

All tests skip gracefully when tools (nsjail, crane, unshare) are unavailable, so they won't break CI.

Total: 32 passing (with parquet) + 14 parser tests = 46 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…directly

mount_overlay now tries kernel `mount -t overlay` first (needs root or CAP_SYS_ADMIN), then falls back to fuse-overlayfs (works unprivileged). unmount_overlay uses fusermount3 -u for FUSE mounts, umount for kernel mounts. OverlayMount gains an `is_fuse` field to track which backend was used.

The overlay integration tests now exercise the actual mount_overlay() and unmount_overlay() Rust functions (via fuse-overlayfs on non-root), verifying read-through, copy-on-write, and cleanup semantics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit updates the EE repository reference after PR #420 was merged in windmill-ee-private.

Previous ee-repo-ref: 931813b75b8260faa13ddc07f36a11607b7e3bf6
New ee-repo-ref: 592848d59ca2304926fb2bd85d000668a7f46a77

Automated by sync-ee-ref workflow.
Move nsjail config, overlayfs, tar utilities, S3 operations, and sandbox types from windmill-common and windmill-worker into a dedicated windmill-sandbox crate. S3 logic lives in windmill-ee-private (symlinked s3_ee.rs). Both windmill-api and windmill-worker depend on the new crate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ing state (#7987)

* fix(frontend): add folder name validation and error handling to folder picker
* feat(frontend): add loading state to folder picker select
* fix(frontend): add error toast for folder list loading failure
* fix(frontend): reassign userStore folders array to trigger reactivity

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* chore(main): release 1.638.4
* Apply automatic changes

Co-authored-by: rubenfiszel <275584+rubenfiszel@users.noreply.github.com>
* feat: replace native select with custom Select in FolderPicker
* chore: update ee-repo-ref to 592848d59ca2304926fb2bd85d000668a7f46a77 (automated by sync-ee-ref workflow after PR #420 was merged in windmill-ee-private)
* nit
* fix(frontend): edit button in folder picker dropdown should not select the item
* fix: restore ee-repo-ref.txt to match main
* fix(frontend): clean up FolderPicker review nits
* Update frontend/src/lib/components/FolderPicker.svelte

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: windmill-internal-app[bot] <windmill-internal-app[bot]@users.noreply.github.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
* perf(cli): skip relock more accurate
* Update cli/src/utils/metadata.ts
* Update cli/src/commands/flow/flow_metadata.ts
* fix
* use structuredClone for safety
* chore: update ee-repo-ref to 592848d59ca2304926fb2bd85d000668a7f46a77 (automated by sync-ee-ref workflow after PR #420 was merged in windmill-ee-private)
* fix ci
* add simple tests

Signed-off-by: pyranota <pyra@duck.com>
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: windmill-internal-app[bot] <windmill-internal-app[bot]@users.noreply.github.com>
…ackend (#7996)

* refactor: extract object store code into windmill-object-store crate with filesystem backend

  Consolidate all object_store-dependent code from windmill-common into a new windmill-object-store crate. Add a filesystem-backed object store implementation using LocalFileSystem for dev/testing without cloud credentials. Includes 30 comprehensive tests covering render_endpoint, lfs_to_object_store_resource, duckdb_connection_settings, error mapping, and filesystem-backed integration tests.

* fix: fix raw_app hardcoded path, add missing ObjectStoreResource import, and add tests
* refactor: move S3ModeFormat to windmill-types, make windmill-parser-sql optional, restore debug logs

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ect object_store dep Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…_domains Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Adds workspace-level sandbox snapshots and persistent volumes for nsjail-sandboxed script execution. This lets users:

- **Snapshots**: pre-built environments (e.g. `python:3.12` with `pip install pandas numpy`), stored on S3, mounted via overlayfs as the nsjail rootfs. Scripts reference them with `# sandbox: python-env:latest` annotations.
- **Volumes**: shared persistent storage, referenced with `# volume: data:/workspace/data` annotations.

Architecture

Key design decisions

- `shared_mount` string; nsjail config post-processed via `finalize_nsjail_config()` to swap system dir mounts for overlay root
- `parquet` feature: matches existing object store infrastructure

Changes

- `migrations/20260217*`: `sandbox_snapshot` + `sandbox_volume` tables
- `windmill-common/src/sandbox.rs`: `parse_sandbox_config()` annotation parser
- `windmill-api/src/sandbox.rs`
- `windmill-worker/src/sandbox_setup.rs`
- `worker.rs`: `finalize_nsjail_config()`

Example usage
Python script with custom snapshot + persistent volume:
Bash script with custom runtime:
TypeScript/Go (// comment style):
Test plan
- (`windmill-common::sandbox::tests`) — all comment styles, edge cases, mixed annotations
- (`windmill-worker::sandbox_setup::tests`) — including 4 end-to-end tests against real proto templates
- `cargo check` — clean
- `RUSTFLAGS="-D warnings" cargo check` — clean
- `RUSTFLAGS="-D warnings" cargo check --features parquet` — clean

🤖 Generated with Claude Code