Cosmo support #2013

Merged
mkeeter merged 111 commits into master from mkeeter/cosmo-support on Apr 29, 2025

mkeeter
Collaborator

@mkeeter mkeeter commented Feb 24, 2025

I'm opening this early to run it through CI and do preliminary self-review, but don't feel obliged to look at it yet

mkeeter force-pushed the mkeeter/cosmo-support branch 4 times, most recently from 22a84bf to 7c30d67, on February 25, 2025 14:59
mkeeter changed the base branch from master to mkeeter/countable-ice40-errors on February 25, 2025 14:59
Base automatically changed from mkeeter/countable-ice40-errors to master on February 25, 2025 16:18
mkeeter force-pushed the mkeeter/cosmo-support branch from dfbb462 to ab18423 on February 25, 2025 21:27
mkeeter force-pushed the mkeeter/cosmo-support branch 3 times, most recently from 73fbbb9 to e9d1098, on March 11, 2025 13:50
mkeeter force-pushed the mkeeter/cosmo-support branch from e9d1098 to a25bac1 on March 12, 2025 14:14
mkeeter mentioned this pull request on Mar 18, 2025
mkeeter force-pushed the mkeeter/cosmo-support branch 2 times, most recently from 2189162 to f8387a6, on March 21, 2025 14:23
if !okay {
// We'll return to A2, leaving jefe and our local state
// unchanged (since they're set after this block).
self.log_pg_registers();
Contributor

It would be nice to know what seq API state and seq raw state we were in at the point of timeout, in addition to the PGs. Combining these pieces of information can help us pinpoint which rail(s) didn't come up or let us figure out what we are waiting on.

Collaborator Author

Sure, I added a call to self.log_state_registers() here as well (and below) in 128d6f9

mkeeter force-pushed the mkeeter/cosmo-support branch from fff65cd to 8190dfe on March 24, 2025 20:13
mkeeter marked this pull request as ready for review on March 24, 2025 21:16
Collaborator

bcantrill left a comment

Looking good; two potentially incorrect values plus some questions/nits.


// Bonus bits for M.2 power, which is switched separately. We *cannot*
// read the M.2 drives when they are unpowered; otherwise, we risk
// locking up the I2C bus (see hardware-gimlet#1804 for the gory
Collaborator Author

We should check and see if this is still necessary!

Collaborator Author

It should be unnecessary, so I removed it (see oxidecomputer/quartz#321 and https://github.com/oxidecomputer/hardware-cosmo/issues/641)

sensors::NUM_NVME_BMC_TEMPERATURE_SENSORS;

// The control loop is driven by CPU, NIC, and BMC temperatures
// XXX we should also monitor DIMM temperatures here
Collaborator Author

@nathanaelhuffman / @Aaron-Hartwig I think we need integration with the main FPGA here, because we'll be reading DIMM temperatures through the I3C proxy.

Contributor

This is true, though it may be a post-bringup activity. We've had a number of signal-integrity issues with the ruby+grapefruit setup that have kept us from properly prototyping this, and while I'm still working to improve the test fixture over the next couple of days, we're running out of time before lab pack-up must commence.

task-slots = ["sys"]
notifications = ["spi-irq"]

# XXX this is only used by cosmo_seq; could we merge it?
Collaborator

XXX comment

Collaborator Author

This remains a potential future improvement, but I don't want to handle it right now!

Comment on lines 494 to 498
1 => unreachable!(),
2..=8 => "u8",
9..=16 => "u16",
17..=32 => "u32",
_ => panic!("invalid width {width}"),
Collaborator

it feels a little strange to have 1 be unreachable but everything else panic

Collaborator Author

They are meaningfully different, though! We should never hit the 1-bit path, because cases where lsb == msb are handled as bool. I'll add some text to the unreachable!(..) to make that more clear.
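To illustrate the distinction drawn in this reply, here is a minimal sketch, not the PR's actual code. The function name `field_type` and its `lsb`/`msb` parameters are hypothetical; only the match arms mirror the quoted snippet. It assumes single-bit fields are filtered out as `bool` before the width match, so the 1-bit arm marks an internal invariant while the catch-all arm reports bad input.

```rust
/// Picks the Rust type used to represent a register field that spans
/// bits `lsb..=msb`. Hypothetical helper for illustration only.
fn field_type(lsb: u32, msb: u32) -> &'static str {
    assert!(msb >= lsb, "msb must not be below lsb");

    // Single-bit fields are handled here, before the width match.
    if lsb == msb {
        return "bool";
    }

    let width = msb - lsb + 1;
    match width {
        // Invariant: widths below 2 were already returned as bool above,
        // so reaching this arm would be a bug in this function itself.
        0 | 1 => unreachable!("single-bit fields are handled as bool"),
        2..=8 => "u8",
        9..=16 => "u16",
        17..=32 => "u32",
        // Bad input: the caller described a field wider than 32 bits.
        _ => panic!("invalid width {}", width),
    }
}
```

Under these assumptions, `field_type(3, 3)` yields `"bool"` and `field_type(0, 8)` yields `"u16"`, while a field wider than 32 bits panics with the invalid-width message.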


use fmc_periph::A0Sm;
match (self.get_state_impl(), state) {
(PowerState::A2, PowerState::A0) => {
Collaborator

Did we want to add the CPU_PRESENT CORETYPE SP5 checks before merging? e.g. https://github.com/oxidecomputer/hubris/blob/master/drv/gimlet-seq-server/src/main.rs#L763

Collaborator Author

Opened #2051 in the interest of getting this merged

Comment on lines +111 to +112
/// TODO: explain rationale for this value.
const TRACE_DEPTH: usize = 52;
Collaborator

TODO

Collaborator Author

Copy-pasta from Gimlet 🙃

gain_i: 0.0135,
gain_d: 0.4,
min_output: 0.0,
max_output: 10.0, // XXX fix this before merging
Collaborator

Fix this before merging

Collaborator Author

@nathanaelhuffman now that we're getting CPU temperatures, can we bump this back up to 100%?
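For context on why the `max_output` value matters, here is a rough sketch of a clamped PID step. It is not the thermal task's implementation. The struct and function names are hypothetical, `gain_p` does not appear in the quoted diff and is assumed, and the output is assumed to be a fan duty cycle in percent (as the "100%" in this reply suggests).

```rust
/// Hypothetical PID settings; field names mirror the quoted snippet,
/// except `gain_p`, which is not shown in the diff and is assumed here.
struct PidConfig {
    gain_p: f32,
    gain_i: f32,
    gain_d: f32,
    min_output: f32,
    max_output: f32,
}

/// Hypothetical controller state carried between steps.
#[derive(Default)]
struct PidState {
    integral: f32,
    prev_error: Option<f32>,
}

/// Runs one PID step and clamps the result to the configured range.
/// `error` is (measured - target) in degrees; `dt` is seconds per step.
fn pid_step(cfg: &PidConfig, st: &mut PidState, error: f32, dt: f32) -> f32 {
    st.integral += error * dt;
    // No derivative term on the first step, since there is no history yet.
    let derivative = match st.prev_error {
        Some(prev) => (error - prev) / dt,
        None => 0.0,
    };
    st.prev_error = Some(error);

    let raw = cfg.gain_p * error
        + cfg.gain_i * st.integral
        + cfg.gain_d * derivative;

    // With max_output = 10.0, the output can never exceed 10%,
    // no matter how large the temperature error grows.
    raw.clamp(cfg.min_output, cfg.max_output)
}
```

In this sketch, a cap of `10.0` pins the output at 10% even for a large temperature error; raising `max_output` to `100.0` lets the same error drive the output well above that.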

}

// In general, see RFD 276 Detailed Thermal Loop Design for references.
// TODO: temperature_slew_deg_per_sec is made up.
Collaborator

Is this still true?

Collaborator Author

Absolutely!

mkeeter and others added 5 commits April 29, 2025 10:50

mkeeter force-pushed the mkeeter/cosmo-support branch from 6c6ea7c to a07a13a on April 29, 2025 14:50
Collaborator

labbott left a comment

mkeeter merged commit cc3ff37 into master on Apr 29, 2025
135 checks passed
mkeeter deleted the mkeeter/cosmo-support branch on April 29, 2025 15:06

6 participants