-
-
Notifications
You must be signed in to change notification settings - Fork 444
Refactor bytecode representation #4220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test262 conformance changes
|
This is not finished in any way. I just wanted to put it out to get some feedback on the overal concept. To get an overview / feeling for the changes:
|
e45496f
to
55c225e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really liking the direction of this PR, simplifying the arguments and generating the opcode functions is a great step forward. 😄
That said, I did notice that the bytecode size increases by about 3x (based on checking the combined.js output). While some of that overhead might be reduced by encoding multiple instructions into a single one, it's still likely to end up at least twice as large overall. Additionally, splitting the bytecode into two arrays could have a negative impact on cache locality.
Overall, I think the approach we're taking is aligned with what engines like V8 and JavaScriptCore are doing. There's a great article from the JavaScriptCore team that touches on a similar idea with prefix opcodes: A new bytecode format for JavaScriptCore.
In terms of performance, I suspect the bigger issue isn't so much unaligned reads, but rather how we read arguments — currently it's done one at a time, with a bounds check on each access. We might see a noticeable performance boost if we check bounds ahead of time and read the arguments in bulk.
Would love to hear your thoughts! :)
EDIT: Here is the code I used to get the size :)
55c225e
to
a42bb06
Compare
I think that is probably the case. I switched back to the u8 bytecode format and I still get a slight performance increase in my first tests. Switching back to the u8 format was much easier with the new macro based opcode emit functions and the new opcode execution dispatching. So I think just for that it would make sense to merge this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work! Really love the removal of the execute_*
functions with arguments definitions and the type-safe compiler functions 🤩
Just had some comments on how we could improve it, see if they make any sense, the rest looks good to me :)
This PR resolves #2561 🥳 |
@@ -0,0 +1,736 @@ | |||
use thin_vec::ThinVec; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit (non-blocking): add some unit tests here or an issue should be opened to add tests for the unsafe code here.
Still working my reading through this, so I'm familiar with it. 😆 It may come up in the future if we have some unsafe
code here but no unit tests in the module. I wouldn't block merging this on lack of tests, but if none are added, we should at least open up an issue. Plus, adding the unit tests themselves could make a good first issue for someone looking to contribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely. I will open an issue after the merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the nitpick, this looks perfect to me! 😄
This PR changes the bytecode encoding / decoding, the bytecode emitting and the opcode execution logic.
In addition to this change in bytecode encoding, I took the chance to add two further changes:
emit
functions for every opcode that can be used in the bytecompiler to get rid of error prone manualemit
code.CompletionType
s into the opcode code itself. This results in theCompletionType
enum being removed. This moves the handling code out of the hot loop that is iterating trough the opcodes. Also many opcodes can only return a limited possibility of completions. Moving the handling into the opcodes enables a more specific handling, in some cases basically removing any handling at all.