How to get the pipe performance of $ foo | bar with Bun.spawn?
#26343
Hello all - I'm not sure whether I've discovered a bug/memory leak, or whether things are working as intended and I'm just using something wrong. I'm currently writing scripts that pipe some ffmpeg + ffplay commands together, and to my surprise, I started getting OOM errors and process kills due to resource exhaustion. Looking a little further, it appears to be related to my use of `Bun.spawn`.
Example that OOMs + shows ~3% CPU usage in Bun alone (presumably due to memory copies?):
const command_args = ['ffmpeg', '-i', 'foo', ...]
const ffmpeg = Bun.spawn(command_args)
const {exited} = Bun.spawn(['ffplay', '-'], {stdin: ffmpeg.stdout})
await exited
Example that doesn't OOM + shows 0% CPU usage in Bun: the same pipeline run through the `$` shell (`$ foo | bar`).
I'd honestly expect these two to be equivalent under the hood, but that doesn't appear to be the case. Is there a way to optimize the former to behave like the latter?
Replies: 2 comments
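This is happening because Method A (`Bun.spawn`) routes the data through the JavaScript runtime, whereas Method B (`$`) uses a direct OS-level pipe.
The Fix
You can keep the native performance of `$` while still passing dynamic arguments:
import { $ } from "bun";
const input = "foo.mp4";
const args = ["-i", input, ...otherFlags]; // Your dynamic args (otherFlags: any extra ffmpeg flags, as strings)
// This uses native pipes (0% JS CPU overhead)
// Bun automatically escapes the array elements
await $`ffmpeg ${args} - | ffplay -`;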
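To make the "dynamic args" part concrete, here is a minimal sketch of assembling the array before it is interpolated. `buildFfmpegArgs` and `FfmpegOptions` are hypothetical names invented for illustration, not part of Bun or ffmpeg. The flags used (`-ss`, `-i`, `-f`) are standard ffmpeg options, and the trailing `-` asks ffmpeg to write to stdout.

```typescript
// Hypothetical helper (not a Bun or ffmpeg API): builds the ffmpeg
// argument list as plain strings, one array element per argument.
interface FfmpegOptions {
  input: string;
  startSeconds?: number; // optional seek position
  format?: string;       // container for the piped output, e.g. "matroska"
}

function buildFfmpegArgs(opts: FfmpegOptions): string[] {
  const args: string[] = [];
  if (opts.startSeconds !== undefined) {
    args.push("-ss", String(opts.startSeconds)); // placed before -i, so it applies to the input
  }
  args.push("-i", opts.input);
  if (opts.format !== undefined) {
    args.push("-f", opts.format);
  }
  args.push("-"); // write the result to stdout
  return args;
}

// Intended use under Bun (shown as a comment, not executed here):
//   import { $ } from "bun";
//   const args = buildFfmpegArgs({ input: "my clip.mp4", format: "matroska" });
//   await $`ffmpeg ${args} | ffplay -`;
```

Because each element is escaped on its own, as the comment in the snippet above notes, an input such as `my clip.mp4` should reach ffmpeg as a single argument without any manual quoting.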
Okay, this seems to be a limitation of the existing API, since the answer is effectively “use a different API”. Thanks for the details though, I didn’t know that Bun would auto-expand the parameter array like that. It’s enough to get me off the ground.
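For anyone who has to stay on `Bun.spawn`, one possible workaround (not suggested in the thread, so treat it as an assumption) is to spawn a POSIX shell and let it wire up the pipe, so the data should bypass the JavaScript runtime just as it does with `$`. The sketch below only builds the command line. `shellQuote` and `buildPipeline` are hypothetical helpers, and the `Bun.spawn` call is left as a comment because it needs Bun and an `sh` binary on the PATH.

```typescript
// POSIX single-quote escaping: wrap the argument in '...' and rewrite
// every embedded ' as '\'' so the shell sees it as literal text.
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: joins argv-style stages into one `a | b | c` string.
function buildPipeline(stages: string[][]): string {
  if (stages.length === 0 || stages.some((stage) => stage.length === 0)) {
    throw new Error("every pipeline stage needs at least a command name");
  }
  return stages
    .map((stage) => stage.map(shellQuote).join(" "))
    .join(" | ");
}

const pipeline = buildPipeline([
  ["ffmpeg", "-i", "my clip.mp4", "-f", "matroska", "-"],
  ["ffplay", "-"],
]);
console.log(pipeline);

// Intended use under Bun (shown as a comment, not executed here):
//   const proc = Bun.spawn(["sh", "-c", pipeline], {
//     stdout: "inherit",
//     stderr: "inherit",
//   });
//   await proc.exited;
```

The trade-off is that quoting becomes your responsibility, and by default the shell reports only the exit status of the last stage, so a failing ffmpeg can go unnoticed unless you check for it separately.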
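This is happening because Method A (`Bun.spawn`) routes the data through the JavaScript runtime, whereas Method B (`$`) uses a direct OS-level pipe.
The Technical Difference
`Bun.spawn` piping: When you pass `stdin: ffmpeg.stdout`, Bun currently treats `ffmpeg.stdout` as a JavaScript `ReadableStream`. This means every chunk of video data is allocated in JS memory, passed through the event loop, and then written to `ffplay`. For high-throughput binary data (like ffmpeg), the garbage collector cannot keep up with the allocation rate, leading to the OOM and CPU spike.
`$` shell: This sets up the pipe at the file descriptor level (kernel space) before the processes start. The data flows directly from proces…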
Bun.spawnPiping: When you passstdin: ffmpeg.stdout, Bun currently treatsffmpeg.stdoutas a JavaScriptReadableStream. This means every chunk of video data is allocated in JS memory, passed through the event loop, and then written toffplay. For high-throughput binary data (like ffmpeg), the Garbage Collector cannot keep up with the allocation rate, leading to the OOM and CPU spike.$Shell: This sets up the pipe at the file descriptor level (Kernel Space) before the processes start. The data flows directly from proces…