nixos/grub: add new implementation of install-grub.pl #317026

Open
pluiedev wants to merge 3 commits into NixOS:master from pluiedev:init/install-grub-ng

@pluiedev
Member

@pluiedev pluiedev commented Jun 3, 2024

Description of changes

Following in the footsteps of Perlless Activation (#270727) and Perlless switch-to-configuration (#308801), it's now time for Perlless install-grub, which can be enabled by simply setting boot.loader.grub.useInstallNg to true.

This is still a draft PR for now since I don't use GRUB myself, and I could do with some help from GRUB users who could actually test this :) The GRUB config generation logic seems to work pretty well, though.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jun 3, 2024
@pluiedev pluiedev force-pushed the init/install-grub-ng branch 5 times, most recently from 6327db4 to eff29a0 Compare June 3, 2024 21:53
@ofborg ofborg bot added 8.has: package (new) This PR adds a new package 11.by: package-maintainer This PR was created by a maintainer of all the package it changes. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Jun 4, 2024
@pluiedev
Member Author

pluiedev commented Jun 4, 2024

Found a couple of (serious) bugs; fortunately, there's a test I can use to catch them... :D

@pluiedev pluiedev force-pushed the init/install-grub-ng branch 2 times, most recently from 339c8db to f8e1766 Compare June 6, 2024 19:48
@pluiedev
Member Author

pluiedev commented Jun 6, 2024

Ready for review

@pluiedev pluiedev marked this pull request as ready for review June 6, 2024 19:58
@pluiedev pluiedev requested a review from dasJ as a code owner June 6, 2024 19:58
@nyabinary
Contributor

Should probably post this in the Discourse thread for PRs ready for review :P

@Scrumplex
Member

@ofborg test grub grub-ng

@pluiedev pluiedev force-pushed the init/install-grub-ng branch from f8e1766 to c0234b2 Compare November 5, 2024 12:28
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/prs-ready-for-review/3032/4819

@KiaraGrouwstra
Contributor

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review pr 317026


x86_64-linux

⏩ 2 packages blacklisted:
  • nixos-install-tools
  • tests.nixos-functions.nixos-test
✅ 1 package built:
  • install-grub-ng

Contributor

@ThinkChaos ThinkChaos left a comment

Thank you for working on this!
It's what's blocking me from going perl-less. I'll give it a try when I have time, though my setup is not likely to trigger many edge cases.

Generally it seems like we should use eyre a lot more: most code just uses ? without adding any context. That'll lead to errors users can't understand 🙁
IMO, it's fine for a prototype to do that, but the best time for error handling is when you write the code, and the second best is when there's a full review. Otherwise you end up having to go over everything again and it's very easy to miss some places.

I haven't done a full review but have already spent enough time on this for now. Here are the comments I've written so far.
Don't hesitate to tell me I'm wrong or whatever in the comments :)

Comment on lines 39 to 40
Contributor

Any parent dirs will have unexpected permissions here.

Member Author

Can you elaborate?

Contributor

If create_dir_all creates any parent directories, they won't use the permission we set just below.

I'm not sure what the best thing to do here is.
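
One possible way to address the concern above, as a minimal sketch rather than the PR's actual code: on Unix, std::fs::DirBuilder combined with std::os::unix::fs::DirBuilderExt::mode requests the same mode for every directory the call creates, parents included. The helper name create_dir_all_with_mode is hypothetical, the mode to pass is whatever the surrounding code sets afterwards, and the process umask still applies.

```rust
use std::fs::DirBuilder;
use std::io;
use std::os::unix::fs::DirBuilderExt;
use std::path::Path;

/// Hypothetical helper: creates `dir` and any missing parents, requesting
/// `mode` for every directory that this call newly creates.
/// The process umask can still clear bits, and directories that already
/// exist keep their current permissions.
fn create_dir_all_with_mode(dir: &Path, mode: u32) -> io::Result<()> {
    DirBuilder::new()
        .recursive(true) // create missing parents, like create_dir_all
        .mode(mode) // used for each mkdir performed by this builder
        .create(dir)
}
```

Whether pre-existing parent directories should also be tightened is a separate policy question that this sketch deliberately leaves open.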

@pluiedev pluiedev force-pushed the init/install-grub-ng branch 4 times, most recently from ac96ddc to 9c7825b Compare December 5, 2024 12:00
@github-actions github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. and removed 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. labels Dec 5, 2024
@pluiedev pluiedev force-pushed the init/install-grub-ng branch 2 times, most recently from 37e7aea to 98ee2b7 Compare December 5, 2024 13:02
@github-actions github-actions bot added 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. and removed 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. labels Dec 5, 2024
@pluiedev pluiedev force-pushed the init/install-grub-ng branch from 98ee2b7 to d6bd08b Compare December 5, 2024 13:31
@symphorien
Member

(I didn't read the PR but the os prober test passes, great!)

@xddxdd
Member

xddxdd commented Dec 8, 2024

I tested this, and the generated grub.cfg contains errors that would prevent booting. The linux and initrd lines reference ($drive1)/kernels/name instead of the actual paths to the kernel and initrd.

Example entry from this PR:

menuentry "NixOS - Configuration 730 (2024-11-25 - 25.05pre-git)" --class nixos {
  savedefault
search --set=drive1 --fs-uuid 5573-574D
  linux ($drive1)/kernels/name init=/nix/store/myp185i80v1ibjicj33bv1jqm8vwzk98-nixos-system-lt-hp-omen-25.05pre-git/init [args...]
  initrd ($drive1)/kernels/name 
}

Example entry from old script:

menuentry "NixOS - Configuration 730 (2024-11-24 - 25.05pre-git)" --class nixos {
  savedefault
search --set=drive1 --fs-uuid 5573-574D
  linux ($drive1)//kernels/i52qc8wmsbldpzh5sqy3591c9lzwa6pz-linux-cachyos-latest-6.12.1-bzImage init=/nix/store/myp185i80v1ibjicj33bv1jqm8vwzk98-nixos-system-lt-hp-omen-25.05pre-git/init [args...]
  initrd ($drive1)//kernels/hh0dadclbbfnbrd3xhc1nbdy8ppskrfz-initrd-linux-cachyos-latest-6.12.1-initrd
}

I have an impermanence setup. This is my partition setup for reference:

Device           Start        End    Sectors  Size Type
/dev/nvme2n1p1    2048    1050623    1048576  512M EFI System - /boot partition, FAT32
/dev/nvme2n1p2 1050624 7814037134 7812986511  3.6T Linux filesystem - /nix partition, Btrfs

@pluiedev
Member Author

pluiedev commented Dec 8, 2024

Might have missed a variable reference there when porting? (Note to self: never touch Perl ever again)

@rnhmjoj
Contributor

rnhmjoj commented Jan 10, 2025

So, you could say I'm one of the maintainers of grub: I use it on both EFI and non-EFI systems and I wrote the NixOS tests.
I'll try to test this when I have some spare time.

As a first impression: can we try to keep it simpler?
This implementation is almost 2000 lines split over 7 files: it looks quite impenetrable to me.
By comparison, install-grub.pl is just 800 lines of linear imperative code: I understand it well enough even without knowing Perl.
Say, recently I was extending the script to generate boot loader entries; here I wouldn't even know where to start.

@pluiedev
Member Author

pluiedev commented Jan 10, 2025

Most of the complexity is simply handling the error cases that the Perl code just skips over — more rigorous input parsing, more rigorous subprocess spawning, etc. Rust is in general a much more explicit language than Perl (which is infamous for being so terse that it's impenetrable).

As to where to add the extra bootloader entries, it's aptly under src/builder/entries.rs :p

@nyabinary
Contributor

What's necessary for this to move forward?

@pluiedev
Member Author

Mostly fixing some bugs in the generated output (see comments above). I'm a bit burnt out from Nix work at the moment, but I'll come back to this in due time.

let line = line?;

let mut fields = line.split(' ');
let Some(mount_point) = fields.nth(4).map(Path::new) else {
Member

At least the root, mount point, and device fields in /proc/$PID/mountinfo are POSIX paths, meaning they are strings of bytes that do not contain a null byte. Given that the format of /proc/$PID/mountinfo relies on space-separated fields in line-separated records, it escapes several characters in path-type fields by replacing them with a \ followed by three octal digits encoding the byte (e.g. a newline is replaced by \012). You'll need to decode these escapes before using the resulting OsString to construct a Path object if you want to be robust against weird paths. install-grub.pl doesn't do this, but it's buggy overall.

Possibly some of the other fields have this kind of escaping also, not sure.

Member Author

Do they always use octal escapes and never other escape formats (e.g. hexadecimal escapes)?

Member

Took a look at the kernel code generating the mountinfo pseudo-files, and yep. It is also only those three fields which get this escaping/mangling, too, looks like.
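
The decoding step discussed in this thread could look roughly like the following sketch. It is not code from the PR: the function name unescape_mountinfo_field is hypothetical, it works on raw bytes so that non-UTF-8 paths survive, and it assumes (as stated above) that only three-digit octal escapes occur, copying any other backslash through unchanged.

```rust
use std::ffi::OsString;
use std::os::unix::ffi::OsStringExt;

/// Hypothetical helper: decodes `\ooo` octal escapes (for example `\040`
/// for a space) in a path-type field of /proc/$PID/mountinfo.
/// A backslash that is not followed by three octal digits encoding a
/// single byte is copied through unchanged.
fn unescape_mountinfo_field(field: &[u8]) -> OsString {
    let mut out = Vec::with_capacity(field.len());
    let mut i = 0;
    while i < field.len() {
        // An escape needs the backslash plus three more bytes.
        if field[i] == b'\\' && i + 3 < field.len() {
            let digits = &field[i + 1..i + 4];
            if digits.iter().all(|d| (b'0'..=b'7').contains(d)) {
                let value = digits
                    .iter()
                    .fold(0u32, |acc, d| acc * 8 + u32::from(d - b'0'));
                // Three octal digits can exceed one byte; accept only 0..=255.
                if value <= 0xff {
                    out.push(value as u8);
                    i += 4;
                    continue;
                }
            }
        }
        out.push(field[i]);
        i += 1;
    }
    OsString::from_vec(out)
}
```

A caller holding the field as a &str can pass field.as_bytes() and wrap the result with Path::new or PathBuf::from.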

};

let Some(ext) = splash_image.extension() else {
bail!("Splash image has no extension - could not decide which module to load!")
Member

Given you're using eyre[1] and returning its Result type already, you can avoid these rather verbose Option matches and replace them with this arguably more readable construct, converting the Option to a Result with error context and opening it with the try operator:

let ext = splash_image.extension().ok_or_eyre("Splash image has no extension - could not decide which module to load!")?;

[1]: Also, what's the motivation for using eyre instead of just anyhow?

Member Author

I preferred eyre when I first wrote this, but I've since switched to anyhow — haven't gotten to porting this over yet.

};

// Skip the bind-mount for the Nix store.
if mount_point == store_dir && super_options.any(|s| s == "rw") {
Member

I believe it should be checking mount_options rather than super_options here, as the latter is always going to contain rw, for both the bind mount and non-bind mount, at least on the systems I've checked.

That would also explain why mount_options appears to be unused in the Perl version: they accidentally used the wrong variable here. Although I guess that never mattered enough to cause a bug in practice, or at least not one that anyone tracked down...

Member Author

Ah, that makes so much sense actually. I was confused by this as well xD
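
To make the distinction in this thread concrete, here is a rough sketch (not the PR's code) of pulling both option lists out of one mountinfo record. It follows the field layout documented in proc(5): the per-mount options are the sixth field, and the per-superblock options are the last field after the lone "-" separator. The names MountRecord, parse_mountinfo_record and is_mounted_rw are hypothetical, and the input is assumed to be a single line without its trailing newline.

```rust
/// Hypothetical record type: the fields of one /proc/$PID/mountinfo line
/// that matter for the bind-mount check.
struct MountRecord<'a> {
    mount_point: &'a str,
    /// Per-mount options (sixth field); these can differ between bind
    /// mounts of the same filesystem.
    mount_options: Vec<&'a str>,
    /// Per-superblock options (last field); shared by every mount of the
    /// same filesystem.
    super_options: Vec<&'a str>,
}

fn parse_mountinfo_record(line: &str) -> Option<MountRecord<'_>> {
    // Optional fields end at a lone "-"; the filesystem type, mount source
    // and super options follow it.
    let (before, after) = line.split_once(" - ")?;
    let mut fields = before.split(' ');
    let mount_point = fields.nth(4)?;
    let mount_options: Vec<&str> = fields.next()?.split(',').collect();
    let super_options: Vec<&str> = after.split(' ').nth(2)?.split(',').collect();
    Some(MountRecord { mount_point, mount_options, super_options })
}

/// Hypothetical predicate: true when `dir` is a mount point whose
/// per-mount options contain `rw`.
fn is_mounted_rw(line: &str, dir: &str) -> bool {
    parse_mountinfo_record(line)
        .map_or(false, |r| r.mount_point == dir && r.mount_options.contains(&"rw"))
}
```

For a made-up record such as "29 1 0:25 /nix/store /nix/store ro,relatime shared:1 - btrfs /dev/nvme2n1p2 rw,ssd", the mount options are ro and relatime while the super options still contain rw, which is the situation described above.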

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 2, 2025
@nixpkgs-ci nixpkgs-ci bot added 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md and removed 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md labels Jun 25, 2025
@nixpkgs-ci nixpkgs-ci bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Aug 16, 2025
