Ensure log bits are correctly maintained #1169
Conversation
src/plan/generational/gc_work.rs
Outdated
```rust
        // Scan objects in the modbuf and forward pointers
        let modbuf = std::mem::take(&mut self.modbuf);
        GCWork::do_work(
            &mut ScanObjects::<E>::new(modbuf, false, WorkBucketStage::Closure),
            worker,
            mmtk,
        )
    } else {
        // Flip the per-object unlogged bits to "unlogged" state for objects inside the bootimage
        #[cfg(feature = "vm_space")]
```
I didn't know which was better: to go through the list, or to bulk set the unlog bits in `vm_space.release()`. I thought this was better since it is more general than bulk setting the unlog bits (in case the VM binding is not using side metadata for the unlog bits).
I don't get why the VM space needs to be specially dealt with here. Why do we need to set the unlog bits in mature GCs for the VM space? What about other spaces like immortal or LOS? How is the VM space different from those spaces?
Hm. So what we want to do is only reset the unlog bit for objects that are considered live. All mature ImmixSpace and CopySpace objects will have the unlog bit set when you copy them over, because of `CopySemantics::Mature`. But objects in the VM space (or even immortal space and LOS, like you mentioned) are never copied, so we need to reset their bits iff those objects are live. VM space and immortal space objects are always live, so we can unconditionally reset them, but for LOS we need to reset the bit only when the object is live. So really we're doing this reset at the wrong place and time.

One way to fix it would be to reset the unlog bit for mature LOS objects inside its `trace_object`. And then we can reset the unlog bits for VM space and immortal space objects in the `ProcessModBuf` work packet.
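The invariant being proposed (reset the unlog bit only for objects proven live, unconditionally only in always-live spaces) can be sketched with a toy model. All names below are illustrative, not MMTk's real API:

```rust
// Toy model: the unlog bit is flipped to "unlogged" only when an
// object is proven live. Hypothetical structure, not MMTk's real API.
use std::collections::HashSet;

#[derive(Default)]
struct ToyHeap {
    live: HashSet<usize>,     // addresses proven live by tracing
    unlogged: HashSet<usize>, // addresses with the unlog bit set
}

impl ToyHeap {
    // LOS-style: reset the unlog bit inside trace_object, i.e. only
    // for objects that the transitive closure actually reaches.
    fn trace_object(&mut self, addr: usize) {
        self.live.insert(addr);
        self.unlogged.insert(addr);
    }

    // VM-space / immortal-style: every object is live by definition,
    // so the bits can be reset unconditionally (e.g. in ProcessModBuf).
    fn reset_always_live(&mut self, addrs: &[usize]) {
        for &a in addrs {
            self.live.insert(a);
            self.unlogged.insert(a);
        }
    }
}

fn main() {
    let mut heap = ToyHeap::default();
    heap.trace_object(0x1000);         // live LOS object
    heap.reset_always_live(&[0x2000]); // bootimage object
    // A dead LOS object (never traced) must NOT get its unlog bit set.
    assert!(heap.unlogged.contains(&0x1000));
    assert!(heap.unlogged.contains(&0x2000));
    assert!(!heap.unlogged.contains(&0x3000));
    println!("ok");
}
```

The point of the split is that liveness is free in always-live spaces but must be piggybacked on tracing everywhere else.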
> But objects in the VM space (or even immortal and LOS like you mentioned) don't perform copies and so we need to reset their bits iff those objects are live.

Why not do that in the `trace_object` method for the policy, where we know the object is reached? For ImmixSpace and CopySpace, the log bit is set during copying in `trace_object`.
Right, that's what I suggested for LOS objects. But it's not clear to me if ART, for example, will actually end up tracing through all objects in the bootimage or immortal space for a full-heap GC, given their optimization(s). Maybe that's an ART-specific issue though.
Another alternative could be to create a new work packet to be executed after the transitive closure (or maybe release?) for major GCs. In that work packet we have all the correct information about whether a given object address is marked, and if it is, we can reset the unlog bit for that object.
Another option is to unconditionally set the log bit in `ProcessModBuf` (no matter nursery or mature GCs), and bulk clear log bits when we release memory for dead objects.

In the current code base, for Immix space, we will set log bits during tracing if `unlog_object_when_traced` is true, and will clear log bits in prepare if `reset_log_bit_in_major_gc` is true. Only in Sticky Immix are both flags set to `true`. The former is needed, as we 'promote' nursery objects in Immix space when we trace them (and we do not have to copy them to 'promote'). But the latter may not be necessary: we can bulk clear log bits for the dead objects after a GC. For other spaces, we can always bulk clear log bits after a GC for reclaimed memory.
Maybe this warrants a group discussion. We can discuss this in the next group meeting. I'll do a hack for ART in the meantime since this current PR is probably still incorrect because of LOS objects' unlog bits not being reset.
What about cases where we recycle mature LOS objects? I think that is broken as well. If the unlog bit was set for an object in a previous GC and it has died since then, and we allocate a new LOS object at the same address, we have a young object with the unlog bit set, which is also wrong. We need to clear the unlog bit for dead LOS objects in the sweep here
@wks Looks like OpenJDK actually triggers the assertion 😆. I think the assertion is sound, so this is actually an OpenJDK bug. The assertion is not triggered for ART.
src/plan/barriers.rs
Outdated
```diff
@@ -231,6 +231,7 @@ impl<S: BarrierSemantics> Barrier<S::VM> for ObjectBarrier<S> {
         target: Option<ObjectReference>,
     ) {
         if self.log_object(src) {
+            debug_assert!(src.is_live::<S::VM>(), "{} was logged but is not live", src);
```
`is_live` doesn't make sense during mutator time. The liveness of an object is only determined after computing the transitive closure. During mutator time, all objects that can be accessed are, by definition, live.
Except the unlog bit should only be set for mature objects, so they have to be considered alive, or at the very least have to be marked. I previously had `is_marked` there, but objects in the bootimage may not actually have mark bits set, so it was triggering the assertion incorrectly.
On second thought I actually agree with you and will remove the debug assert from the write barrier code. I think the debug assert inside the `ProcessModBuf` work packet is correct, though.
Just fixed it like we discussed in a previous meeting.
```diff
@@ -275,6 +275,17 @@ impl<VM: VMBinding> VMSpace<VM> {
         );
         debug_assert!(self.in_space(object));
         if self.mark_state.test_and_mark::<VM>(object) {
+            // Flip the per-object unlogged bits to "unlogged" state for objects inside the
+            // bootimage
+            #[cfg(feature = "set_unlog_bits_vm_space")]
```
Perhaps we can remove the conditional-compile feature here, since objects in the VM space will likely want to have the unlog bit set, and the feature was more for initialization.
@wks Can you re-review this?
I assume the "issues for future allocations" means newly allocated (young) objects may have the unlog bit set, may be logged when written to, and we will retain more objects than necessary. But I don't think it is sufficient to ensure this by skipping setting unlog bits in mature GCs. There may be dead mature objects that are never written to (i.e. unlogged). When those objects die, their unlog bits will remain set. If future allocations reuse the memory of those objects, some young objects will also have unlog bits set. If you want to ensure nursery objects never get logged, we may need to (bulk) clear unlog bits for reclaimed regions, like how we clear VO bits. I think this only affects StickyImmix, because we never set the unlog bits for objects in the GenCopy or GenImmix nursery in the first place.
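The failure mode described here (a stale unlog bit surviving the object's death and being inherited by a fresh nursery allocation at the same address) can be reproduced in a toy object-remembering barrier. Names are illustrative:

```rust
// Toy reproduction of the stale-unlog-bit failure mode: a mature
// object dies with its unlog bit still set, a fresh young object
// reuses the address, and the write barrier wrongly logs it.
use std::collections::HashSet;

struct Barrier {
    unlogged: HashSet<usize>, // addresses with the unlog bit set
    modbuf: Vec<usize>,       // remembered-object buffer
}

impl Barrier {
    // Object-remembering barrier: log on the first write to an
    // unlogged object, then clear its unlog bit.
    fn object_written(&mut self, addr: usize) {
        if self.unlogged.remove(&addr) {
            self.modbuf.push(addr);
        }
    }
}

fn main() {
    let addr = 0x8000;
    let mut b = Barrier { unlogged: HashSet::new(), modbuf: vec![] };
    b.unlogged.insert(addr); // mature object; it dies, bit not cleared
    // The address is reused for a fresh nursery object, which is
    // then written to...
    b.object_written(addr);
    // ...and the young object lands in the modbuf, so the next
    // nursery GC treats it as a remembered mature object.
    assert_eq!(b.modbuf, vec![addr]);
    println!("ok");
}
```

Bulk clearing the unlog bits for reclaimed regions removes exactly this path, since the fresh allocation would start with the bit cleared.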
Hence why we do a bulk clear?
Oh. How odd. This PR doesn't do the bulk clear. I must have messed up when cherry-picking from my branch.
Instead of bulk clearing, it is also possible to copy the mark bits over to the unlog bits (and remove

p.s. I think the so-called "unlog bit" should really be named "is a candidate for logging". Obviously nursery objects are not such candidates.
What that suggests is that we need to be more systematic with how we deal with stale metadata values.
This fixes a bug wherein we were erroneously resetting the unlog bits for objects in the modbuf for full-heap GCs. If we reset the unlog bits in a full-heap GC, we may end up setting the unlog bit for an object that actually died which may cause issues for future allocations. We now set the unlog bits for mature objects when tracing for VM space (if compiled with `set_unlog_bits_vm_space`), immortal space, and large object space. This PR also adds debug assertions checking that any object that has been added to the modbuf is considered "live" by MMTk, or in other words, it is a mature object.
@wks What needs to be done for this PR to get it merged? I think it's very important to merge bug fixes ASAP, since otherwise it leads to duplicated debugging effort. This is a very, very subtle bug. EDIT: I can remove the
If I understand correctly, the issue is that for dead objects we do not clear their unlog bits, so newly allocated objects will have arbitrary unlog bits. The obvious fix would be to just clear the unlog bits before new allocation. The PR instead tries to maintain the unlog bits before the objects become dead: if an object dies, its unlog bit is 0, and if an object lives, its unlog bit is 1. This doesn't sound like a straightforward solution to the problem. What is the reason for using this solution rather than just clearing the unlog bits?
Fast-path allocation. EDIT: It seems I haven't cherry-picked a further change from here: k-sareen@8f6fcd8. The above commit bulk clears the unlog bits for a global GC and then sets them again while tracing.
Really what we want is a meta-module that takes care of all relevant metadata when an object dies, a space is released, or a TLAB is allocated. What we have currently is a hack, since each kind of metadata (like the VO bit or the side forwarding bits) has been special-cased. But I feel like that is out of scope for this PR.
We now:
- Clear the unlog bits for dead LOS objects
- Bulk clear the unlog bits for MarkSweepSpace for full-heap GCs
- Set the unlog bits for LOS and MarkSweepSpace objects during trace object

This is because we want to ensure the unlog bits are only set for objects that are alive. We do not reset the unlog bits inside the `ProcessModBuf` work packet because we do not know which objects are alive at that moment in time.
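The first bullet (clear the unlog bit for an LOS object iff the object is dead at sweep time) can be sketched with a toy sweep over a mark set. Hypothetical names; the real logic lives in the LOS sweeping code:

```rust
// Sketch of the LOS sweep rule described above: keep the unlog bit
// only for objects that are still marked, so a future allocation
// reusing a dead object's address starts with the bit cleared.
// Illustrative names, not MMTk's real API.
use std::collections::{HashMap, HashSet};

fn sweep(marked: &HashSet<usize>, unlog: &mut HashMap<usize, bool>) {
    // Drop unlog bits belonging to unmarked (dead) objects.
    unlog.retain(|addr, _| marked.contains(addr));
}

fn main() {
    let marked: HashSet<usize> = [0x1000].into_iter().collect();
    let mut unlog: HashMap<usize, bool> =
        [(0x1000, true), (0x2000, true)].into_iter().collect();
    sweep(&marked, &mut unlog); // 0x2000 is dead: its bit must go
    assert!(unlog.contains_key(&0x1000));
    assert!(!unlog.contains_key(&0x2000));
    println!("ok");
}
```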
One more bug: In GenImmix,

```diff
diff --git a/src/plan/generational/immix/global.rs b/src/plan/generational/immix/global.rs
index ba2850b3a4..207fde4f2c 100644
--- a/src/plan/generational/immix/global.rs
+++ b/src/plan/generational/immix/global.rs
@@ -252,7 +252,7 @@ impl<VM: VMBinding> GenImmix<VM> {
             reset_log_bit_in_major_gc: false,
             // We don't need to unlog objects at tracing. Instead, we unlog objects at copying.
             // Any object is moved into the mature space, or is copied inside the mature space. We will unlog it.
-            unlog_object_when_traced: false,
+            unlog_object_when_traced: true,
             // In GenImmix, young objects are not allocated in ImmixSpace directly.
             #[cfg(feature = "vo_bit")]
             mixed_age: false,
```

Setting

It should be unnecessary for StickyImmix, too, because this PR now clears unlog bits when freeing lines. I think I made a mistake to think there are stale unmark bits in StickyImmix, but since
There can't be stale unlog bits, given that `ImmixSpaceArgs::reset_log_bit_in_major_gc` is set to `true`. This shows we really need a centralized framework to clear the metadata bits.
Yes, I totally agree. This is so incredibly confusing that I myself don't like this PR. But we need to have a fix for this bug. The centralized metadata framework is definitely the ideal.
I am considering a, possibly unlikely, scenario. What if a whole chunk of ImmixSpace is reused for another space, say CopySpace? If we allow stale bits to exist, the bits may end up being carried to a different space. Well, this is impossible for Map64, because the possible address ranges of different spaces never overlap, but it should be possible for Map32, although very unlikely. This not only affects the unlog bits, but also any side metadata that allows stale values to exist.
Currently ImmixSpace uses
Well, this is something the metadata framework should handle, right? Clear all metadata for a new chunk when it is (re)allocated.
I think this is actually happening for GenCopy. Because the nursery, the from-space, and the to-space are all CopySpace, they do return chunks back to VMMap, and some stale unlog bits are carried over to the nursery.
The following patch is sufficient to fix CopySpace. Just leave no stale log bits behind when releasing. (p.s. I wonder why this has been working for years.)

```diff
diff --git a/src/policy/copyspace.rs b/src/policy/copyspace.rs
index ddf8e5bb04..bbc66ccd4d 100644
--- a/src/policy/copyspace.rs
+++ b/src/policy/copyspace.rs
@@ -196,6 +196,12 @@ impl<VM: VMBinding> CopySpace<VM> {
             side_forwarding_status_table.bzero_metadata(start, size);
         }
 
+        if self.common.needs_log_bit {
+            if let MetadataSpec::OnSide(side) = *VM::VMObjectModel::GLOBAL_LOG_BIT_SPEC {
+                side.bzero_metadata(start, size);
+            }
+        }
+
         // Clear VO bits because all objects in the space are dead.
         #[cfg(feature = "vo_bit")]
         crate::util::metadata::vo_bit::bzero_vo_bit(start, size);
```
This patch can assert there are no stale log bits when a bump allocator acquires a page. This is too aggressive, and we probably don't want it to be merged.

```diff
diff --git a/src/util/alloc/bumpallocator.rs b/src/util/alloc/bumpallocator.rs
index a436b2e07a..73301bab33 100644
--- a/src/util/alloc/bumpallocator.rs
+++ b/src/util/alloc/bumpallocator.rs
@@ -1,13 +1,13 @@
 use std::sync::Arc;
 
-use crate::util::Address;
+use crate::util::{Address, ObjectReference};
 use crate::util::alloc::Allocator;
 use crate::policy::space::Space;
 use crate::util::conversions::bytes_to_pages_up;
 use crate::util::opaque_pointer::*;
-use crate::vm::VMBinding;
+use crate::vm::{ObjectModel, VMBinding};
 
 /// Size of a bump allocator block. Currently it is set to 32 KB.
 const BLOCK_SIZE: usize = 8 << crate::util::constants::LOG_BYTES_IN_PAGE;
@@ -209,6 +209,16 @@ impl<VM: VMBinding> BumpAllocator<VM> {
             block_size,
             acquired_start
         );
+
+        for addr in (acquired_start.as_usize()..acquired_start.as_usize() + block_size).step_by(8) {
+            let objref = unsafe {
+                ObjectReference::from_raw_address_unchecked(Address::from_usize(addr))
+            };
+            let unlogged = <VM as VMBinding>::VMObjectModel::GLOBAL_LOG_BIT_SPEC
+                .is_unlogged::<VM>(objref, std::sync::atomic::Ordering::Relaxed);
+            assert!(!unlogged);
+        }
+
         if !stress_test {
             self.set_limit(acquired_start, acquired_start + block_size);
             self.alloc(size, align, offset)
```
Thanks. And yes, I wonder the same... This is a very non-trivial issue. We really should prioritize a better internal API for this.
@wks The assertion failure comes from
The bug is in

```rust
let size = if self.pr.cursor().chunk_index() == start.chunk_index() {
    self.pr.cursor() - start
```

The intention is that if the cursor of the page resource is within the same chunk, it will only scan up to that cursor. However, there is a corner case.

So we should fix the iterator of the MonotonicPageResource.
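One way to read the suggested fix is to test directly whether the cursor lies inside the region being iterated, rather than comparing chunk indices, and only then truncate the scan at the cursor. A plain-usize sketch under that assumption (names are illustrative, not the MonotonicPageResource API):

```rust
// Sketch: decide how far to scan a region, truncating at the page
// resource's cursor only if the cursor actually lies inside the
// region. Comparing chunk indices alone can mis-handle corner cases.
const CHUNK_BYTES: usize = 4 * 1024 * 1024; // assumed 4 MB chunks

fn scan_size(region_start: usize, region_bytes: usize, cursor: usize) -> usize {
    if cursor > region_start && cursor <= region_start + region_bytes {
        cursor - region_start // cursor is inside this region: stop there
    } else {
        region_bytes // otherwise scan the whole region
    }
}

fn main() {
    let start = 10 * CHUNK_BYTES;
    // Cursor inside the region: truncate the scan.
    assert_eq!(scan_size(start, CHUNK_BYTES, start + 4096), 4096);
    // Cursor well past the region: scan the whole region.
    assert_eq!(
        scan_size(start, CHUNK_BYTES, start + 2 * CHUNK_BYTES),
        CHUNK_BYTES
    );
    println!("ok");
}
```

The point is that a direct range test cannot produce a bogus size, whereas an equality test on derived chunk indices can.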
Wow. Thank you for catching that. Is the solution just to do
I made a PR for this: #1299. I just used the simplest solution. If it is intended to optimize for the case that "the cursor is within that region", we can just test whether the cursor is within the region by
Just a minor comment problem. It looks good otherwise.
LGTM
@wks Do you know what commit details are used for the rebase commit? The details from the first one? Or does it use the description of the PR? I just wanted to clean up some stuff in the description before I merged it.
When we merge this PR, the commit message will use the title of this PR as the first line, followed by the description of this PR (i.e. the content of #1169 (comment)). You can edit them anytime before merging.
Thank you. I've edited the description and title of the PR. If you think it's sufficient, please feel free to merge it. Thank you again for all your debugging help.
This fixes a bug wherein we were erroneously resetting the unlog bits
for objects in the modbuf for full-heap GCs. If we reset the unlog bits
in a full-heap GC, we may end up setting the unlog bit for an object
that actually died which may cause issues for future allocations.
In major GCs, we now bulk clear unlog bits for mark-sweep space and
immix space during the prepare stage and bulk clear them for copyspace
during the release stage. We clear log bits for large objects during the
sweep. We also set the unlog bits for mature objects when tracing. This
ensures that there are no stale unlog bits and that we don't accidentally
log nursery objects.
This PR also adds debug assertions checking that any object that has
been added to the modbuf is considered mature by MMTk.