-
Notifications
You must be signed in to change notification settings - Fork 786
fix: add Large Payload Support via Temporary File Transport in call-step-file and node-runner #776
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: add Large Payload Support via Temporary File Transport in call-step-file and node-runner #776
Conversation
|
@riturajFi is attempting to deploy a commit to the motia Team on Vercel. A member of the Team first needs to authorize it. |
packages/core/src/call-step-file.ts
Outdated
| .catch((error) => { | ||
| try { | ||
| // spawn failed before handlers attached — still clean up | ||
| if (tempDir) fs.rmSync(tempDir, { recursive: true, force: true }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can lead to concurrency issues, one process deleting the temp dir with files of other processes
packages/core/src/call-step-file.ts
Outdated
| const cleanupTemp = () => { | ||
| if (!tempDir) return | ||
| try { | ||
| fs.rmSync(tempDir, { recursive: true, force: true }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to make it async, we're having some issues with synchronous methods already that I need to fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @riturajFi!!
I like the approach, but we need a few things before we merge:
- We need to implement it in all runtimes we have in codebase, such as Python and Ruby
- We need automated test to validate it
|
Thanks @sergiofilhowz for your comments! I just want to validate whether the approach is correct. I shall implement this for all the runners (+ write an implementation prompt for future runners) |
057f3f6 to
bad4567
Compare
bad4567 to
91289ea
Compare
|
Hi @sergiofilhowz , implemented the changes you asked for -
|
|
This is good! I'm going to merge soon |
|
@riturajFi thanks for the PR, I want to merge it, but first we need to fix the pipeline issues |
64a8820 to
7385044
Compare
|
@sergiofilhowz can you please review this? I fixed some minor concerns that were there. Can we merge it now? |
| processManager.onProcessClose(async (code) => { | ||
| if (timeoutId) clearTimeout(timeoutId) | ||
| processManager.close() | ||
| await cleanupTemp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will lead to race conditions:
- When two requests happen at the same time, one can delete the folder while the other tries to read it
| const cleanupTemp = async () => { | ||
| if (isCleaning || isCleaned || !tempDir) return | ||
| const dir = tempDir | ||
| tempDir = undefined | ||
| isCleaning = true | ||
|
|
||
| try { | ||
| await fs.promises.rm(dir, { recursive: true, force: true }) | ||
| } catch {} | ||
|
|
||
| isCleaning = false | ||
| isCleaned = true | ||
|
|
||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it needs to delete only the file that was created for the process, otherwise there will be race conditions
e46df4f to
f0a7bea
Compare
packages/core/src/call-step-file.ts
Outdated
| tempDir = path.join(os.tmpdir(), `motia-${uniqueId}`) | ||
| fs.mkdirSync(tempDir, { recursive: true, mode: 0o700 }) | ||
| metaPath = path.join(tempDir, 'meta.json') | ||
| fs.writeFileSync(metaPath, jsonData, { mode: 0o600 }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not create a single file instead?
packages/core/src/call-step-file.ts
Outdated
| let argvPayload = jsonData | ||
| let tempDir: string | undefined | ||
| let metaPath: string | undefined | ||
| // This is for keeping the cleanup idempotent | ||
| let isCleaned: boolean = false | ||
| let isCleaning: boolean = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid usage of let
| // spawn failed before handlers attached — still clean up (async, best-effort) | ||
| if (tempDir) { | ||
| const dir = tempDir | ||
| tempDir = undefined | ||
| void fs.promises.rm(dir, { recursive: true, force: true }) | ||
| .catch((err) => {logger.debug(`temp cleanup: ${dir}: ${err.message ?? err}`)}) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not run it at finally?
14f0fc6 to
55edeb2
Compare
|
📦 This PR is large (>500 lines). Please ensure it has been properly tested. |
| const uniqueId = `${process.pid}-${crypto.randomBytes(6).toString('hex')}` | ||
| const metaPath = path.join(baseTempDir, `meta-${uniqueId}.json`) | ||
| payloadState.tempFilePath = metaPath | ||
| fs.writeFileSync(metaPath, jsonData, { mode: 0o600 }) |
Check warning
Code scanning / CodeQL
Network data written to file Medium
Untrusted data
Write to file system depends on
Untrusted data
Write to file system depends on
Untrusted data
ffd4323 to
7b9535a
Compare
|
Hi @sergiofilhowz , the changes have been implemented according to your suggestion. Can you kindly review this? |
Overview
This PR introduces a robust solution to handle large request payloads (JSON or otherwise) without changing existing data types or introducing environment flags.
Previously, Motia serialized the entire request body into a single JSON string argument passed via argv to the runner. This caused OS-level E2BIG errors for large payloads (typically > 2–8MB).
The new design uses temporary files to transfer large data between the main process and language runners (Node, Python, Ruby), while preserving backward compatibility for smaller payloads.
Key Changes
🧠 call-step-file.ts
Added detection for payload size before spawning the runner.
When data exceeds a threshold (e.g., >1MB):
Writes the full event object to a secure temporary JSON file (in os.tmpdir()).
Passes only the path of this file to the runner via argv.
On completion or error, the temp directory is safely deleted to prevent disk bloat.
Maintains existing behavior for smaller requests to minimize overhead.
⚙️ node-runner.ts
Detects whether the received argument is:
A file path → loads JSON from file, executes handler, and deletes the file.
A JSON string → uses legacy flow (unchanged behavior).
Adds cleanup for temporary files even if the process is interrupted.
Maintains identical interface for user handlers — no breaking changes.
Benefits
✅ Removes OS argument size limitations (supports 100MB+ payloads).
✅ Preserves existing APIs, handlers, and request/response structures.
✅ Reduces memory duplication from repeated JSON serialization.
✅ Ensures secure, auto-cleaned temp directories (mode 0700, file mode 0600).
✅ Fully backward compatible and cross-language ready.
Testing Performed
Verified:
Small payloads (<1MB) → use legacy in-memory mode.
Large JSON payloads (5MB–200MB) → run successfully without E2BIG.
Automatic cleanup of temp files after step completion.
Regression-tested standard steps and event flows.
Next Steps
Add optional streaming support for extremely large binary bodies.
Integrate with multipart/form-data in future versions.
🎥 Demo:
Before -
Screencast.from.2025-10-06.10-33-33.webm
After -
Screencast.from.2025-10-06.10-39-10.webm