# method to propagate backpressure from parser handler #3218
To clarify, this isn't about async/sync writes getting out of order; it's about locking up the main thread when writing a lot of data while a handler (CSI handler?) is slow to parse.

---
Yes, the latter point would be a reason to offload the hard work to a worker (as with the image processing example). But of course this applies to any handler that takes really long. The execution order or state synchronisation is a different issue that would arise from worker usage, if the worker wants to place data back in the terminal state. That is a serious issue on its own, but not what I am talking about here. Lemme illustrate this further: say you have some CSI handler that invokes an async worker interface with portions of the terminal data. This sounds easy to do:

```typescript
const fancyStuff = new Worker('process_fancy_stuff.js');
term.registerCsiHandler({some_symbols}, params => {
  // offload hard work
  fancyStuff.postMessage(someLongishTermData); // <-- Where does the data end up if the worker is slow?
});
```

This works reliably as long as it is guaranteed that the incoming data rate is lower than the data rate of the worker. That's the point where I think pausing the `WriteBuffer` would help.
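The flooding scenario above is essentially missing flow control. Here is a toy model of the semantics a pause/resume hook would have to provide (all names are made up for illustration, this is not xterm.js API): stop accepting input once too many chunks are in flight, and resume once the slow consumer has caught up.

```typescript
// Toy watermark-based backpressure (hypothetical names, not xterm.js code):
// write() counts chunks in flight; crossing the high-water mark pauses the
// producer, dropping back to the low-water mark resumes it.
class BackpressuredFeed {
  private backlog = 0;
  private paused = false;

  constructor(
    private process: (chunk: string, done: () => void) => void,
    private onPause: () => void,   // e.g. stop reading from the pty
    private onResume: () => void,
    private highWater = 16,
    private lowWater = 4) {}

  write(chunk: string): void {
    this.backlog++;
    if (!this.paused && this.backlog >= this.highWater) {
      this.paused = true;
      this.onPause();
    }
    this.process(chunk, () => {
      this.backlog--;
      if (this.paused && this.backlog <= this.lowWater) {
        this.paused = false;
        this.onResume();
      }
    });
  }
}
```

With a slow async consumer (like the worker above), `done` fires late, the backlog grows, and the producer is paused instead of queueing unbounded data.

---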
Oh, the CSI handlers are synchronous atm. I think the bit I wasn't expecting was that the API is this:

```typescript
registerCsiHandler(id: IFunctionIdentifier, callback: (params: (number | number[])[]) => boolean): IDisposable;
```

Not this:

```typescript
registerCsiHandler(id: IFunctionIdentifier, callback: (params: (number | number[])[]) => boolean | Promise<boolean>): IDisposable;
```

Or:

```typescript
registerCsiHandler(
  id: IFunctionIdentifier,
  callback: (
    params: (number | number[])[],
    resultCallback: (result: boolean) => void) => void
): IDisposable;
```

The right fix is to block writing completely as above, isn't it? That seems more reliable/easier to use than adding a pause/resume mechanism.

---
Do you have a use case for a handler being promise / callback result based? If so, then we prolly should approach the issue with real in-band blocking, otherwise we cannot guarantee handler execution order anymore. My use case here is more relaxed in this regard: it does not depend on correct execution order down to a single handler + its result, it only needs the blocking semantics on the data pressure part.

Promises would also solve this, but they come at a really high price - we would have to reshape the whole parser chain to async support with promises. Following my speed tests here, promises at least triple the runtime compared to pure sync code (actually it is more like 3-5 times slower depending on the promises' "granularity"). But there is hope for a middle ground - those tests also show that async code based on direct microtask callbacks is less painful, only doubling the runtime. So if you need true in-order async execution support with blocking semantics, we might be able to get the best of both worlds (sync + async) with that.
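The two styles compared above can be sketched roughly as follows. This is an illustration with made-up helper names, not the actual parser code, and the throughput numbers quoted above are the author's measurements, not something this sketch reproduces.

```typescript
// Promise style: every handler call is awaited, so each iteration pays the
// promise machinery even when the handler completes synchronously.
async function parsePromise(handlers: Array<() => boolean | Promise<boolean>>): Promise<void> {
  for (const h of handlers) {
    await h(); // suspends on every step
  }
}

// Microtask-callback style: stay in the plain sync loop and only leave it
// when a handler really goes async; the handler's continuation then
// re-enters the loop at the saved position.
function parseCallback(
  handlers: Array<(cont: () => void) => boolean>,
  done: () => void,
  start = 0): void {
  for (let i = start; i < handlers.length; i++) {
    const handledSync = handlers[i](() => parseCallback(handlers, done, i + 1));
    if (!handledSync) return; // the handler will call cont later
  }
  done();
}
```

The callback variant keeps the common all-sync path free of promise allocations, which is why it tends to sit between the pure sync and the fully promisified numbers.

---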
Just had another look at the input chain. The sync part down to individual handlers roughly looks like this:

That's not a very deep callstack yet, so yeah, it prolly can be reshaped into async support with blocking semantics with an acceptable effort. Next question here would be whether to go with promises or with manual restarts. Both have pros and cons:

Promises:

manual restarting:

At the current state I am biased towards the manual restart, as I really fear a bad drop of the throughput with a pure promise based variant. Ofc this needs testing; still I'd expect promises being 3-5 times slower, and the manual restart somewhere 1.5-2 times slower. There is another point that adds towards the manual restart -

Edit:

Edit2:

---
Did a quick rewrite of both parse methods to promises and awaiting handler invocations. These are the results:

Compared to purely sync execution:

Quite underwhelming - promises on

---
I think the following would be possible as an optional async interface:

```typescript
registerCsiHandler(
  id: IFunctionIdentifier,
  callback: (
    params: (number | number[])[],
    block: () => ((result: boolean) => void)) => void
): IDisposable;
```

and used as something like this:

```typescript
const asyncHandler = term.registerCsiHandler({...}, (params, block) => {
  // block() creates the needed state preservation in the background
  // and returns a function to continue later from
  const unblock = block();
  // do something async, unblock when done
  new Promise(res => { /* async stuff */ }).then(() => unblock(true));
});
```

Something like that should be possible with manual resuming of the parser parts without polluting everything with promises. Only a few handlers that might need the async capabilities and call

---
A (not so) quick rewrite with manual resuming shows these results:

```
Context "out-test/benchmark/Terminal.benchmark.js"
  Context "Terminal: ls -lR /usr/lib"
    Context "write"
      Case "#1" : 5 runs - average throughput: 16.21 MB/s
    Context "writeUtf8"
      Case "#1" : 5 runs - average throughput: 17.41 MB/s
```

(Note: As nodejs is missing a native

To get an impact on throughput I had to mark

Also after some refactoring the handler interface looks pretty easy:

```typescript
registerCsiHandler(
  id: IFunctionIdentifier,
  callback: (params: (number | number[])[]) => boolean | Promise<boolean>
): IDisposable;
```

Imho those results with manual resuming are quite promising, given that almost every line in the test output contains at least one
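A minimal sketch of how a parse loop can serve that `boolean | Promise<boolean>` interface with manual resuming (illustrative only, not the real parser): run synchronously while handlers return plain booleans, and when one returns a promise, stop, remember the position, and re-enter the loop from there once it settles.

```typescript
// Hypothetical manual-restart loop over input chunks; the real parser
// resumes at finer-grained internal positions, this just shows the idea.
type ChunkHandler = (data: string) => boolean | Promise<boolean>;

function parseChunks(
  chunks: string[],
  handler: ChunkHandler,
  onDone: () => void,
  start = 0): void {
  for (let i = start; i < chunks.length; i++) {
    const result = handler(chunks[i]);
    if (result instanceof Promise) {
      // pause here; the continuation restarts the loop after this chunk
      result.then(() => parseChunks(chunks, handler, onDone, i + 1));
      return;
    }
  }
  onDone();
}
```

The sync path never touches a promise, which is why marking only the rare async handlers keeps the throughput loss small.

---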
So using the

I know we've had the promise discussion before and I'm fine to leave them out because of that, I only mentioned them because the

---
Need a few more tests with the conceptual ideas before getting down to a PR.

---
I can't provide much technical input as I'm not very familiar with the parser, but I enjoyed watching this issue on its journey over the last couple of days. Thanks for sharing all the updates, it really helped me get an understanding of the technical solution you are striving for. I think your latest iteration looks pretty good.

---
Have a prototype up for

Note the higher run count; it seems the v8 optimizer has much more trouble optimizing the

There are several things complicating straightforward async support across handler types:

---
First some background on this idea (coming from the image PR):

Currently SIXELs are decoded synchronously and the image decoding can correctly slow down the whole input processing chain, which leads to a drop of callback calls in `WriteBuffer`. Ok - sync processing works.

Ofc image decoding creates quite a lot of fuzz on the main thread and prolly should be done in a worker (for PNG/JPEG it has to be done async anyway). And that's where the data pressure hell begins - the SIXEL decoder has a throughput of ~50 MB/s while the DCS parser path operates at ~250 MB/s, meaning the terminal floods the decoder with 200 MB more data than it can process in a second (worst case). Currently there is no way to inform the `WriteBuffer` asynchronously about that data pressure condition from within the terminal.

Repro:
Doing `cat *.sixel` in a folder full of sixel images with my async decoder triggers the issue. Same with `mpv` with their new sixel video output above a certain output size.

In general:
This is currently a non-issue for our normal sync handler actions, as sync code always "blocks" and would indicate data pressure by a lower callback rate in `WriteBuffer`. But any async handler code with lower throughput than the parser itself will run into this. It is def. not only a problem of image processing, it just happened to be the first complicated async handler.

Idea:
It would be good to have some way to pause/resume the data chunk processing on `WriteBuffer` from a parser handler. Given that flow control is properly set up for the terminal itself, this enables correct backpressure signalling from async parser handler code as well.

Up for discussion.

Edit: I don't need this fine-grained down to a single input buffer position (although this would solve many issues around async handlers in general, the needed stack unwinding is not worth it); I think our already established chunk border stops are good enough.
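The pause/resume idea with chunk-border stops can be modeled with a toy stand-in (this is not the actual `WriteBuffer` implementation, which schedules chunks asynchronously; names and structure here are purely illustrative): chunk processing drains a queue, `pause()` takes effect at the next chunk border, and `resume()` restarts the drain.

```typescript
// Toy pausable write buffer: a handler-facing pause/resume pair stops the
// drain at a chunk border and continues with the queued chunks later.
class PausableWriteBuffer {
  private queue: string[] = [];
  private paused = false;

  constructor(private action: (chunk: string) => void) {}

  write(chunk: string): void {
    this.queue.push(chunk);
    this.drain();
  }

  pause(): void { this.paused = true; }

  resume(): void {
    this.paused = false;
    this.drain();
  }

  private drain(): void {
    while (!this.paused && this.queue.length) {
      this.action(this.queue.shift()!);
    }
  }
}
```

An async handler (e.g. a worker-based SIXEL decoder) would call `pause()` when it falls behind and `resume()` once it has caught up, closing the backpressure loop described above.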