[nexus] Make it 'more default' for Debug datasets to exist in test env #7982


Merged
smklein merged 7 commits into main from omicron-dev-disk-test on Apr 25, 2025

Conversation

smklein (Collaborator) commented Apr 15, 2025

Before this PR:

  • DiskTest::new did not create any debug datasets
  • omicron-dev run-all did not initialize any disks
  • Any tests wanting debug datasets would need to explicitly request them, then call disk_test.propagate_datasets_to_sleds() (see the sketch after this list).
  • Additionally, in this environment, many tests assumed "the only datasets which exist are crucible datasets".
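
Roughly, that old pattern looked like the sketch below. Only DiskTest::new and propagate_datasets_to_sleds come from this PR's description; the "request debug datasets" helper is a hypothetical name standing in for whatever each test did explicitly:

// Pre-PR sketch: debug datasets were opt-in, per test.
let mut disk_test = DiskTest::new(&cptestctx).await;
// Hypothetical helper -- some explicit "also give me debug datasets" step...
disk_test.add_debug_datasets_on_all_zpools().await;
// ...followed by pushing the dataset configuration out to the simulated sleds.
disk_test.propagate_datasets_to_sleds().await;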

After this PR:

  • DiskTest::new creates debug datasets on all zpools
  • omicron-dev run-all initializes disks on the "first sled agent", with datasets
  • Any tests using DiskTest::new get debug datasets by default
  • Tests are adjusted to cope with multiple dataset types

// - DEFAULT_ZPOOL_COUNT zpools, each of which contains:
//   - A crucible dataset
//   - A debug dataset
DiskTest::new(&cptestctx).await;
smklein (Collaborator, Author) commented:

As an example of where this is useful:

#7972

We can create (and store) support bundles on the simulated system.

@@ -799,8 +801,8 @@ pub(crate) mod test {
    async fn test_region_replacement_start_saga(
        cptestctx: &ControlPlaneTestContext,
    ) {
        let mut disk_test = DiskTest::new(cptestctx).await;
        disk_test.add_zpool_with_dataset(cptestctx.first_sled_id()).await;
smklein (Collaborator, Author) commented:

This isn't strictly necessary, but IMO it's a little clearer -- we're calling "add_zpool_with_dataset" to make a fourth zpool, on top of the three which already exist.
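
For reference, the layout this test ends up with (a sketch, assuming DEFAULT_ZPOOL_COUNT is 3, per the "three which already exist" above):

// DiskTest::new(cptestctx)               -> zpools 1-3, each with a crucible
//                                           dataset and a debug dataset
// add_zpool_with_dataset(first_sled_id)  -> zpool 4, added on the first sled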

@smklein smklein marked this pull request as ready for review April 16, 2025 16:18
@smklein smklein requested review from davepacheco and jmpesp April 16, 2025 16:18
@smklein smklein requested a review from leftwo April 23, 2025 21:26
davepacheco (Collaborator) left a comment:

This is a great direction. I'm not so familiar with the disk testing stuff but what I understood here looks good.

@@ -209,14 +210,16 @@ mod test {
            datastore.clone(),
        );

-       // Record which datasets map to which zpools for later
+       // Record which crucible datasets map to which zpools for later

        let mut dataset_to_zpool: BTreeMap<ZpoolUuid, DatasetUuid> =
            BTreeMap::default();

        for zpool in disk_test.zpools() {
            for dataset in &zpool.datasets {
davepacheco (Collaborator) commented:

Take it or leave it: it might be useful to make zpool.datasets non-pub and have a method that iterates just the Crucible datasets. That would ensure that we've found all possible uses of datasets that needed to be updated in this PR and also makes it easier in the future (e.g., if we add some other kind of dataset, we won't need to go change all these callers again).

smklein (Collaborator, Author) replied:

Done in 36d2d04.

I originally built this as an iterator, before realizing that tests are effectively assuming we'll have at most one dataset of each kind per zpool.

So instead I'm exposing a helper function that looks up a dataset of a specific kind, validating that there is exactly one. If that assumption breaks, tests will break pretty clearly, and we can keep updating the TestZpool implementation.
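
Something along these lines, with simplified stand-in types (a sketch of the shape, not the actual code from 36d2d04):

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum DatasetKind {
    Crucible,
    Debug,
}

struct TestDataset {
    kind: DatasetKind,
    // id, etc.
}

struct TestZpool {
    datasets: Vec<TestDataset>,
}

impl TestZpool {
    // Look up the single dataset of `kind` on this zpool, asserting the
    // "at most one dataset of each kind per zpool" assumption along the way.
    fn dataset_of_kind(&self, kind: DatasetKind) -> &TestDataset {
        let mut matching = self.datasets.iter().filter(|d| d.kind == kind);
        let dataset = matching
            .next()
            .unwrap_or_else(|| panic!("no {kind:?} dataset on this zpool"));
        assert!(
            matching.next().is_none(),
            "more than one {kind:?} dataset on this zpool"
        );
        dataset
    }
}

Callers then reach for, e.g., a debug dataset via something like zpool.dataset_of_kind(DatasetKind::Debug) rather than iterating zpool.datasets directly, which also covers the "make datasets non-pub" half of the suggestion.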

@smklein smklein enabled auto-merge (squash) April 25, 2025 21:16
leftwo (Contributor) left a comment:

Just a question on a comment

// Also call a similar method to invoke the "omicron_physical_disks_ensure" API. Right now,
// we aren't calling this at all for the simulated sled agent, which only works because
// the simulated sled agent simply treats this as a stored config, rather than processing it
// to actually provide a different view of storage.
leftwo (Contributor) commented:

Is the 2nd part of the old comment no longer applicable because of the update on line 1325 above?

smklein (Collaborator, Author) replied:

No, I think this comment is still accurate -- dataset config is different from physical disk config.

This comment is now saying "we should do something similar to this for disks", even though the simulated sled agent is basically ignoring the provided disks config, for now.
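
As a rough illustration of the "stored config" behavior being described here (simplified stand-in types, not the actual simulated sled agent code; only the omicron_physical_disks_ensure name comes from the comment above):

struct PhysicalDisksConfig {
    // generation, list of disks, ...
}

struct SimulatedStorage {
    physical_disks_config: Option<PhysicalDisksConfig>,
}

impl SimulatedStorage {
    // The simulated sled agent just remembers the requested disk config so it
    // can be reported back; no simulated zpools or datasets are derived from
    // it, which is why tests currently get away with never calling this.
    fn omicron_physical_disks_ensure(&mut self, config: PhysicalDisksConfig) {
        self.physical_disks_config = Some(config);
    }
}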

@smklein smklein merged commit 20e3202 into main Apr 25, 2025
16 checks passed
@smklein smklein deleted the omicron-dev-disk-test branch April 25, 2025 21:44