Make sure we never context switch while holding VM lock. #735
We were seeing errors in our application that looked like:
We concluded that a thread was context switching while it held the VM lock. During the investigation, we added assertions that we never yield to another thread with the VM lock held, and we enabled these VM lock assertions even in single-Ractor mode. The assertions failed in a few places, most notably in finalizers: we were running finalizers with the VM lock held, and they were context switching and causing this issue.
These rules must hold going forward to ensure we don't context switch unexpectedly:
If you have the VM lock held:
* Don't enter the interpreter loop.
* Don't yield to Ruby code.
* Don't call rb_nogvl (it will context switch you and will not unlock the VM lock).
* Don't check your own interrupts; doing so can context switch you.
If you don't have the GVL:
* Don't call rb_ensure, rb_protect, etc. (these are old rules, but it is good to have assertions for them).
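
To make the rules above concrete, here is a minimal sketch of the kind of check they imply. It is a toy model, not CRuby's implementation: `toy_thread_t` and every `toy_*` function are hypothetical names invented for illustration, and the checks return `false` where a real VM would presumably fail an assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical and are NOT CRuby internals. */
typedef struct {
    int  vm_lock_depth; /* how many times this thread currently holds the VM lock */
    bool has_gvl;       /* whether this thread currently holds the GVL */
} toy_thread_t;

/* Take the (recursive) VM lock. */
static void toy_vm_lock(toy_thread_t *th) {
    th->vm_lock_depth++;
}

/* Release one level of the VM lock; unlocking an unheld lock is a bug. */
static void toy_vm_unlock(toy_thread_t *th) {
    assert(th->vm_lock_depth > 0);
    th->vm_lock_depth--;
}

/* Rule 1: anything that can context switch (entering the interpreter loop,
 * yielding to Ruby code, releasing the GVL, checking interrupts) is only
 * allowed when the VM lock is not held. Returns false on a violation. */
static bool toy_may_context_switch(const toy_thread_t *th) {
    return th->vm_lock_depth == 0;
}

/* Rule 2: helpers in the style of rb_ensure/rb_protect require the GVL.
 * Returns false on a violation. */
static bool toy_may_call_protect(const toy_thread_t *th) {
    return th->has_gvl;
}
```

A caller would consult `toy_may_context_switch` at every potential yield point. In this model, the finalizer problem described above corresponds to reaching such a point while `vm_lock_depth` is still nonzero.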