stdenv: Add CPE fields to meta #409797

YorikSar · 2025-05-22T15:14:14Z

Add an identifiers attribute to meta with the following attributes:

- cpe with the full CPE string when available
- cpeParts with the destructured CPE string, allowing it to be overridden whenever needed
- v1 attribute set with cpe and cpeParts from above and a guarantee of a backwards-compatible interface

Also add the vendor part, as an example, to the packages hello, gcc and clang.

Related issue: #354012

This is the first step towards adding software identifiers to Nixpkgs. See the issue for the discussion.

Here's my summary of the discussion and decisions made here:

Proposal

Interested actors

Nixpkgs authors
- want to provide necessary details with as little effort as possible
Nixpkgs consumers
- want to be able to use external tools to find known vulnerabilities
Tools
- SBOM generators (bombon, sbomnix)
  - want to rely less on heuristics and use more precise identifiers in generated SBOMs
- Security tracker
  - wants to match identifiers against external vulnerability databases, avoid heuristics
  - wants to provide tools to enrich Nixpkgs with more data and more precision (detect inconsistencies, generate PRs)
  - wants to become a point of contact with external vulnerability databases, providing them with more data

Options

Identifier types

It seems that the most common identifier formats are CPE and PURL.

CPE

CPE comes from NIST, with the official list of CPE names maintained in NVD.

A CPE looks like cpe:2.3:a:gnu:glibc:2.40:1:::::: with the parts meaning:

- CPE version - the current version of CPE is 2.3
- part - a for "application"
- vendor - can point to the source of the package, or to Nixpkgs itself
- product - name of the package
- version - version of the package
- update - name of the latest update, can be a patch version as in the example
- edition - any additional specification about the version
- many more fields that seem to be generally unused for software

CPE allows very specific identification, but matching NVD data requires knowledge of the vendor and of the software's versioning process.
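To make the field layout above concrete, here is a minimal Python sketch of how a consumer might split a CPE 2.3 formatted string into named fields. It is an illustration only, not part of this PR: the function name `parse_cpe` is hypothetical, unspecified fields are written as `*`, and escaping is handled only for backslash-escaped colons.

```python
import re
from typing import Dict

# Field names of a CPE 2.3 formatted string, in order, after the "cpe:2.3:" prefix.
CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def parse_cpe(cpe: str) -> Dict[str, str]:
    """Split a CPE 2.3 formatted string into its 11 named fields (hypothetical helper)."""
    # Split on colons that are not escaped with a backslash.
    pieces = re.split(r"(?<!\\):", cpe)
    if pieces[:2] != ["cpe", "2.3"] or len(pieces) != 2 + len(CPE_FIELDS):
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, pieces[2:]))


parts = parse_cpe("cpe:2.3:a:gnu:glibc:2.40:1:*:*:*:*:*:*")
print(parts["vendor"], parts["product"], parts["version"], parts["update"])
```

A tool that needs only the vendor and product for a fuzzy lookup could read them from the returned dictionary instead of re-implementing the split.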

NVD attaches CPE identifiers to CVE entries, but this mapping is by no means complete. Some CVE entries have no CPE, some have strangely formatted ones.

If we maintain a good list of CPE identifiers of our own, we could influence NVD to improve the CVE database in this regard.

PURL

PURL stands for "package URL" (not the "persistent URL" from the 1990s). PURLs are adopted by many SBOM-related tools and by the OSV database.

PURLs look like pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie, with the following parts:

- pkg: is the URL scheme
- type - comes from the list of known PURL types and defines what namespace and qualifiers mean
- namespace - whatever the type requires, for example, it can be empty or point to a specific repository
- name - name of the package
- version - version of the package, in a type-specific format
- qualifiers - depend on the type

While PURLs are less specific, they can be derived completely from the information about package sources.
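As an illustration of the PURL structure described above, the following Python sketch splits a PURL into its components. It is a simplified reading of the format, not a complete implementation: the name `parse_purl` is hypothetical, and no type-specific validation or normalization is attempted.

```python
from urllib.parse import parse_qsl, unquote


def parse_purl(purl: str) -> dict:
    """Simplified PURL parser (hypothetical helper, no type-specific rules)."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError(f"not a PURL: {purl!r}")
    # Optional parts are peeled off from the right: subpath, qualifiers, version.
    rest, _, subpath = rest.partition("#")
    rest, _, query = rest.partition("?")
    version = ""
    if "@" in rest:
        rest, version = rest.rsplit("@", 1)
    segments = [unquote(s) for s in rest.strip("/").split("/")]
    if len(segments) < 2:
        raise ValueError("a PURL needs at least a type and a name")
    return {
        "type": segments[0].lower(),
        "namespace": "/".join(segments[1:-1]),  # may be empty
        "name": segments[-1],
        "version": unquote(version),
        "qualifiers": dict(parse_qsl(query)),
        "subpath": subpath,
    }


purl = parse_purl("pkg:deb/debian/curl@7.50.3-1?arch=i386&distro=jessie")
print(purl["type"], purl["namespace"], purl["name"], purl["version"], purl["qualifiers"])
```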

Structuring

We can write and present identifiers either just as a whole string (e.g. in hello.meta.cpe or hello.meta.purl), providing authors with functions to generate appropriate values, for example:

drv = stdenv.mkDerivation (finalAttrs: {
  pname = "foo";
  version = "1.2.3";
  ...
  meta.cpe = lib.mkCPE finalAttrs {vendor = "foo_upstream";};
})

drv.meta.cpe => "cpe:2.3:a:foo_upstream:foo:1.2:3::::::"

The other option is to destructure these values both on input and on output and make all generation logic implicit in mkDerivation, for example:

drv = stdenv.mkDerivation (finalAttrs: {
  pname = "foo";
  version = "1.2.3";
  ...
  meta.cpeParts.vendor = "foo_upstream";
})

drv.meta.cpe => "cpe:2.3:a:foo_upstream:foo:1.2:3::::::"
drv.meta.cpeParts => {
  vendor = "foo_upstream";
  product = "foo";
  version = "1.2";
  update = "3";
}

In both cases Nixpkgs authors would be able to provide just the information that cannot be correctly derived from other arguments.
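The derivation of defaults plus author overrides can be sketched outside of Nix as well. The Python below is only a model of the idea, under two assumptions that are mine: a version of the form X.Y.Z is split into version X.Y and update Z, and unspecified fields are written as `*` (the examples above leave them empty). The helper names `default_cpe_parts` and `make_cpe` are hypothetical.

```python
import re
from typing import Dict, Optional

CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def default_cpe_parts(pname: str, version: str) -> Dict[str, Optional[str]]:
    """Guess CPE parts from name and version; the vendor stays unknown (None)."""
    parts: Dict[str, Optional[str]] = {field: "*" for field in CPE_FIELDS}
    parts.update(part="a", vendor=None, product=pname, version=version)
    # Assumption: "X.Y.Z" is split into version "X.Y" and update "Z".
    match = re.fullmatch(r"(\d+\.\d+)\.(\d+)", version)
    if match:
        parts["version"], parts["update"] = match.group(1), match.group(2)
    return parts


def make_cpe(parts: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the CPE string, or None while any field is still unknown."""
    values = [parts[field] for field in CPE_FIELDS]
    if any(value is None for value in values):
        return None
    return "cpe:2.3:" + ":".join(values)


# The package author only supplies what cannot be derived: the vendor.
parts = {**default_cpe_parts("foo", "1.2.3"), "vendor": "foo_upstream"}
print(make_cpe(parts))
```

Without the vendor override, `make_cpe` returns `None`, which mirrors the "full CPE string when available" wording of the description.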

Consumers that want to match these identifiers against upstream databases would only need the final identifier. However, in cases when direct matching doesn't yield good enough results, they might rely on heuristics that require the identifier's constituent parts. Both formats are easy to parse, but it can be beneficial not to have to do this additional step.

Tools like Security Tracker want to be able to update these identifiers in Nixpkgs. Finding the place to edit the necessary part seems to be easier in the second example.

Versioning

Consumers that have to support multiple versions of Nixpkgs want to know which set of fields they can expect to read and write in a given version. Currently this is not supported anywhere, but we could start by namespacing package identifiers with their own version. For example:

drv = stdenv.mkDerivation (finalAttrs: {
  pname = "foo";
  version = "1.2.3";
  ...
  meta.tracking-information.v1.cpeParts.vendor = "foo_upstream";
})

drv.meta.tracking-information.v1.cpe => "cpe:2.3:a:foo_upstream:foo:1.2:3::::::"

Note that this forces versioning on Nixpkgs authors as well as consumers, which increases cognitive load while writing package definitions (which version do I need to write? are there other versions? what do I need to do to support them?). Neither Nixpkgs authors nor "single-version" Nixpkgs consumers (ones that don't collect data over multiple Nixpkgs versions) benefit from such versioning.

We could provide a stable versioned output for this information while keeping input simple:

drv = stdenv.mkDerivation (finalAttrs: {
  pname = "foo";
  version = "1.2.3";
  ...
  meta.tracking-information.cpeParts.vendor = "foo_upstream";
})

drv.meta.tracking-information.cpe => "cpe:2.3:a:foo_upstream:foo:1.2:3::::::"
drv.meta.tracking-information.v1.cpe => "cpe:2.3:a:foo_upstream:foo:1.2:3::::::"

In this example tracking-information.v1 would always be backwards-compatible, providing fields like cpe, cpeParts, purl, purlParts and possibly some new ones in the future.
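From a consumer's point of view, the versioned view means reading one fixed path regardless of the Nixpkgs revision. A minimal Python sketch, assuming the package's meta has already been exported to JSON-like data; the function name `read_cpe` and the `namespace` parameter are hypothetical, and the attribute name is passed in because this proposal text and the PR description use different names for it.

```python
from typing import Any, Mapping, Optional


def read_cpe(meta: Mapping[str, Any], namespace: str = "identifiers") -> Optional[str]:
    """Return the CPE from the stable `v1` view, or None if it is absent."""
    v1 = meta.get(namespace, {}).get("v1", {})
    return v1.get("cpe")


meta = {
    "tracking-information": {
        "v1": {"cpe": "cpe:2.3:a:foo_upstream:foo:1.2:3:*:*:*:*:*:*"},
    },
}
print(read_cpe(meta, namespace="tracking-information"))
```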


h0nIg · 2025-05-26T07:13:48Z

While PURLs are less specific, they can be derived completely from the information about package sources.

That's not true. Take the following examples, which are hosted on github.com and require pkg:github/xxx instead of a tar.gz download. Therefore we need the possibility to specify PURLs as well, to avoid losing information. Workarounds like a long-running nixtract to gather meta information are not an option, as that is a reverse-engineering approach.

jq, fetchurl: https://github.com/NixOS/nixpkgs/blob/nixos-25.05/pkgs/by-name/jq/jq/package.nix#L18
python, fetchurl: https://github.com/NixOS/nixpkgs/blob/nixos-25.05/pkgs/development/interpreters/python/cpython/default.nix#L265

The same applies to a Ruby gem (rubygems.org) or a Python library (PyPI) which can be downloaded through github.com as well (pkg:github vs. pkg:pypi / pkg:gems). Some packages only match pkg:pypi and not pkg:github, because SBOM scanners have strict requirements about how to detect new releases (not just tags; tags plus releases are mandatory).

YorikSar · 2025-05-27T13:11:49Z

@h0nIg Good point. I didn't mean to imply that PURLs would not have an option to override them when needed. I decided to start with CPE just for example, but I'd expect to have a similar structure for PURLs as well. You would be able to override any part of the PURL, or all of it at once. We could also have tools like Security Tracker suggesting fixes for inconsistent autogenerated PURLs, just like for CPEs.

fricklerhandwerk · 2025-06-13T13:48:16Z

Pinging @Mic92 @Lassulus @RaitoBezarius @arianvp @blitz @nikstur from the original issue FYI

nikstur · 2025-06-14T19:53:11Z

I decided to start with CPE

I think that's the right approach: focus on one type of identifier in the beginning. Trying to build the perfect all-encompassing solution will just leave us deadlocked and unable to make any progress.

YorikSar · 2025-06-19T14:02:04Z

I've added a section in the manual to cover these meta attributes. There is space to add more identifiers in the future there. Please take a look at it.

h0nIg · 2025-06-19T14:54:50Z

pkgs/stdenv/generic/check-meta.nix

+            language = "";
+            other = "";
+          } // attrs.meta.identifiers.cpeParts or { };
+          cpe =


I would make CPE a list.

Reason: https://nvd.nist.gov/vuln/detail/cve-2024-12084

CPE: cpe:2.3:a:samba:rsync:3.2.7:-:*:*:*:*:*:*

The patch version is not contained in cpeParts.update but in cpeParts.version. You should not make any assumptions about how vendors use CPE; therefore, match both combinations by default.

YorikSar · 2025-06-19

I disagree. Such ambiguity would only make it harder to support a clean source of truth. If rsync doesn't use the update field, we should specify this in its definition. Supporting all variations of all approaches is not feasible. Splitting out the patch version from semantic versions seems a good enough default, as it covers many packages.

h0nIg · 2025-06-21

Why do you want to avoid ambiguity? You probably want to upload the generated CPEs to check for findings. You cannot ask the vendor to change its processes, nor do you support the maintainers of such problematic components. Some people have a different understanding of semantic versioning, and it is hard to cope with different processes.

I can share some experience with the (here unrelated) PURLs: some of the components are matched e.g. with pkg:gems/myrubygem and others with pkg:github/org/myrubygem. Conclusion: a list of multiple identifiers, all of which are valid. Where and why do you see postprocessing or further structured access being required? In addition, once they can access the derivation, don't they have access to all attributes anyway?

Please share examples or reasons, because I think we need a pragmatic approach here.

h0nIg · 2025-06-21

That's a good example: take a list and search (e.g. once per day): https://nvdlib.com/en/latest/v2/CVEv2.html#searching-cves

YorikSar · 2025-07-01

I've added examples for rsync and bash. My assumption is that the semantic version splitting works for the majority of cases. If that's not correct, I'd prefer to remove this assumption instead of trying to accommodate all possible variations.

The goal here is to produce a mapping from the package to the identifier of this package. So far you have provided an example of rsync that just uses different values in the CPE fields, so we should specify this just for the rsync package. CPE cpe:2.3:a:samba:rsync:3.2:7:*:*:*:*:*:* is not valid for rsync, just as CPE cpe:2.3:a:gnu:hello:2.12.2:-:*:*:*:*:*:* is not valid for hello, so we shouldn't have these guesses attached to these versions of these packages. Tools like Security Tracker can use data from the cpeParts field to query vulnerabilities using different formats. The idea is that in the end there will be only one correct CPE for the specific version of the specific package, and we want to provide it and only it in Nixpkgs. In case Security Tracker finds that this identifier is wrong, we should fix the identifier.
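The idea of a tool deriving several query formats from the parts can be sketched as follows. This is a hypothetical Python illustration, not the Security Tracker's code and not what Nixpkgs does: the function name `candidate_cpes` is made up, and only the two layouts debated in this thread are generated (full version with any update, and the last version component moved into the update field).

```python
from typing import List


def candidate_cpes(vendor: str, product: str, version: str) -> List[str]:
    """Return the CPE layouts discussed in this thread for one package version."""
    tail = ":*" * 6  # edition through other, all left as ANY
    prefix = f"cpe:2.3:a:{vendor}:{product}"
    # Layout 1: the whole version string in `version`, any `update`.
    candidates = [f"{prefix}:{version}:*{tail}"]
    # Layout 2: for X.Y.Z style versions, move the last component into `update`.
    head, sep, last = version.rpartition(".")
    if sep and head.count(".") >= 1:
        candidates.append(f"{prefix}:{head}:{last}{tail}")
    return candidates


for cpe in candidate_cpes("samba", "rsync", "3.2.7"):
    print(cpe)
```

Whether such variants belong in Nixpkgs metadata or only in the querying tool is exactly the point of disagreement in this thread; the sketch takes no side.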

h0nIg · 2025-07-01

"I've added examples for rsync and bash. My assumption is that the semantic version splitting works for the majority of cases. If that's not correct, I'd prefer to remove this assumption instead of trying to accommodate all possible variations."

I have to admit that my proposal to include (.|p) in the regex is too ambitious, given additional cases like this:
https://github.com/NixOS/nixpkgs/blob/master/pkgs/by-name/su/sudo/package.nix#L20
I like pushing a limited number of special cases to the maintainer if they are not covered by the general approach. Pushing the majority of problems with semantic versioning, including patch versions, to the maintainer is not a limited number. Please use a list instead.

"The goal here is to produce a mapping from the package to the identifier of this package"

Wrong assumption: there is no mapping from the package to the identifier, or rather, there is no single mapping.
As said, the goal is to upload a list of CPEs and check for CVEs. This list is to be understood as a query, which needs to use wildcards as much as possible.

"The idea is that in the end there will be only one correct CPE for the specific version of the specific package, and we want to provide it and only it in Nixpkgs"

CPE = query parameters

You cannot make assumptions about how people will request CVEs, even under pressure. Please use a wildcard even for "hello":

so instead of

cpe:2.3:a:gnu:hello:2.12.2:-:*:*:*:*:*:*

you should create the search query with *:

cpe:2.3:a:gnu:hello:2.12.2:*:*:*:*:*:*:*

"Tools like Security Tracker can use data from cpeParts field to query vulnerabilities using different formats"

@fricklerhandwerk can you please share some insights and/or requirements? Do you really want to parse cpeParts and postprocess, or do you just want to upload the list of CPEs for vulnerability pre-matching?
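The practical difference between `-` and `*` in the update field can be shown with a deliberately simplified matcher. This Python sketch treats `*` on either side as "matches anything" and compares all other fields literally; the real CPE name-matching rules are more detailed, and the function names `field_matches` and `cpe_matches` are hypothetical. The rsync entry is the one quoted earlier in this thread.

```python
def field_matches(a: str, b: str) -> bool:
    """One field matches if either side is the wildcard '*' or both are equal."""
    return a == "*" or b == "*" or a == b


def cpe_matches(query: str, target: str) -> bool:
    """Field-by-field comparison of two CPE 2.3 strings (simplified, no escaping)."""
    q, t = query.split(":"), target.split(":")
    return len(q) == len(t) and all(field_matches(a, b) for a, b in zip(q, t))


nvd_entry = "cpe:2.3:a:samba:rsync:3.2.7:-:*:*:*:*:*:*"

# Wildcard update: matches the entry regardless of how the vendor fills `update`.
print(cpe_matches("cpe:2.3:a:samba:rsync:3.2.7:*:*:*:*:*:*:*", nvd_entry))
# Patch version moved into `update`: does not match this entry.
print(cpe_matches("cpe:2.3:a:samba:rsync:3.2:7:*:*:*:*:*:*", nvd_entry))
```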

SuperSandro2000 · 2025-07-07

Could it be that we have two pieces of software in one derivation, so that we need two identifiers for them? I think that could happen, so making it a list is just future-proof, even if it is only used by 5 packages and not even by default.


philiptaron · 2025-08-22T19:16:33Z

My belief, after reading through the feedback, is that this is good to merge and iterate on once @infinisil approves it. Is that correct?

h0nIg · 2025-08-28T11:15:30Z

My belief, after reading through the feedback, is that this is good to merge and iterate on once @infinisil approves it. Is that correct?

yes, can this get merged now? or at least can @infinisil approve?

Add `identifiers` attr to `meta` attribute with following attrs:

* `cpe` with the full CPE string when available
* `possibleCPEs` with the list of potential CPEs when not all information is provided
* `cpeParts` with the destructured CPE string, allowing to override it whenever needed
* `v1` attribute set with `cpe` and `cpeParts` from above and a guarantee of a backwards-compatible interface

Related issue: NixOS#354012

infinisil

Sorry for the delay, looks good to me too!

nixpkgs-ci · 2025-08-29T19:48:36Z

Backport failed for release-25.05, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release-25.05
git worktree add -d .worktree/backport-409797-to-release-25.05 origin/release-25.05
cd .worktree/backport-409797-to-release-25.05
git switch --create backport-409797-to-release-25.05
git cherry-pick -x b0ce3dc09f5146e831f1116e95b99017fc0f4a64

h0nIg · 2025-08-29T20:57:45Z

@YorikSar @fricklerhandwerk @infinisil @nikstur happy to get a review and approval for the matching PURL parts as well: #421125, thank you

YorikSar · 2025-08-29T20:57:51Z

I did a manual backport to 25.05 here: #438385

K900 · 2025-08-30T06:26:10Z

https://hydra.nixos.org/build/306260260/nixlog/1

vcunat · 2025-08-30T06:50:44Z

Trivial reproducer:

$ nix-env -f. -qa --meta --xml -A a4 >/dev/null
derivation 'a4-0.2.3' has invalid meta attribute 'identifiers'

vcunat · 2025-08-30T07:01:18Z

I believe that nix-env dislikes when meta contains null values (possibly nested in deeper attributes). The state of the PR right now:

nix-repl> :p a4.meta.identifiers
{
  cpe = null;
  cpeParts = {
    edition = "*";
    language = "*";
    other = "*";
    part = "a";
    product = "a4";
    sw_edition = "*";
    target_hw = "*";
    target_sw = "*";
    update = null;
    vendor = null;
    version = null;
  };
  possibleCPEs = [ ];
  v1 = {
    cpe = null;
    cpeParts = «repeated»;
    possibleCPEs = [ ];
  };
}
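A check like the following could catch this class of problem before merge. It is a Python sketch under the assumption that a package's meta has been exported to JSON-like data (how that export is done is left open); the function name `null_paths` is hypothetical. It lists every nested `null`, which is the condition nix-env appears to reject here.

```python
from typing import Any, Iterator


def null_paths(value: Any, path: str = "meta") -> Iterator[str]:
    """Yield the dotted path of every None nested inside dicts and lists."""
    if value is None:
        yield path
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from null_paths(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from null_paths(child, f"{path}[{index}]")


# Abbreviated version of the a4 output shown above.
meta = {
    "identifiers": {
        "cpe": None,
        "cpeParts": {"part": "a", "product": "a4", "update": None,
                     "vendor": None, "version": None},
        "possibleCPEs": [],
    },
}
for path in null_paths(meta):
    print(path)
```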

YorikSar · 2025-08-30T07:18:05Z

I believe that nix-env dislikes when meta contains null values (possibly nested in deeper attributes).

You are correct, the error message comes from here: https://github.com/NixOS/nix/blob/401e7fe3ad2d01bab628c50bb34450e29d95882b/src/nix/nix-env/nix-env.cc#L1227

It’s bad that the error was only discovered after the merge, since we don’t seem to use nix-env in GitHub CI. It is also a strange requirement on nix-env's part. I’ll post a fixed version of this later.

vcunat · 2025-08-30T07:29:08Z

The problem in nix-env doesn't manifest until you force it to print meta, so it's easy to miss.

h0nIg · 2025-08-31T20:40:29Z

pkgs/stdenv/generic/check-meta.nix

+                  cpe = (makeCPE guessedParts);
+                }
+              ) possibleCPEPartsFuns;
+          v1 = { inherit cpeParts cpe possibleCPEs; };


v1 = filterAttrsRecursive (n: v: v != null) {

Should this solve these nix-env problems?

YorikSar · 2025-09-01T06:02:36Z

It would, but it's better not to add null values there in the first place. filterAttrsRecursive is quite expensive to run for each derivation.

@K900

nix-env writes a warning for each derivation that has null in its meta values, so fields without known values are removed from the result. Fixes issue raised by @K900 in NixOS#409797 (comment)

YorikSar

PR that reapplies this with a fix for nix-env: #439074

@K900

nix-env writes a warning for each derivation that has null in its meta values, so fields without known values are removed from the result. Fixes issue raised by @K900 in NixOS#409797 (comment) (cherry picked from commit a178fd8)
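The trade-off mentioned in this thread (filtering nulls recursively afterwards versus never emitting them) can be modelled in a few lines of Python. This is only an analogy for the Nix code, with hypothetical names (`make_identifiers`) and without `possibleCPEs`; it shows the second approach, where unknown fields are simply left out while the result is built.

```python
from typing import Dict, Optional

CPE_FIELDS = [
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
]


def make_identifiers(cpe_parts: Dict[str, Optional[str]]) -> dict:
    """Build the identifiers set without ever storing a None value."""
    # One shallow pass over the parts instead of a recursive filter over the result.
    known = {k: v for k, v in cpe_parts.items() if v is not None}
    result: dict = {"cpeParts": known}
    # Only add `cpe` when every field is known, rather than setting it to None.
    if all(field in known for field in CPE_FIELDS):
        result["cpe"] = "cpe:2.3:" + ":".join(known[f] for f in CPE_FIELDS)
    # `v1` mirrors the fields above; the copy is taken before `v1` itself is added.
    result["v1"] = dict(result)
    return result


parts = {field: "*" for field in CPE_FIELDS}
parts.update(part="a", product="a4", vendor=None, version=None, update=None)
print(make_identifiers(parts))
```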

github-actions bot added 6.topic: stdenv Standard environment 6.topic: llvm/clang Issues related to llvmPackages, clangStdenv and related labels May 22, 2025

github-project-automation bot added this to Stdenv May 22, 2025

YorikSar mentioned this pull request May 22, 2025

CPEs in package metadata for better vulnerability tracking #354012

Closed

YorikSar force-pushed the cpe branch 3 times, most recently from e48d97e to 9094380 Compare May 27, 2025 15:02

github-actions bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. labels May 27, 2025

fricklerhandwerk requested review from RaitoBezarius and nikstur June 13, 2025 13:47

YorikSar force-pushed the cpe branch from 9094380 to b80a958 Compare June 19, 2025 14:00

YorikSar marked this pull request as ready for review June 19, 2025 14:01

fricklerhandwerk approved these changes Jun 19, 2025


github-actions bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Jun 19, 2025

h0nIg reviewed Jun 19, 2025


philiptaron added the 1.severity: significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. label Aug 22, 2025

wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Aug 23, 2025

oneingan requested a review from infinisil August 29, 2025 02:07

YorikSar force-pushed the cpe branch from d6331e8 to b0ce3dc Compare August 29, 2025 13:40

nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Aug 29, 2025

infinisil approved these changes Aug 29, 2025


infinisil merged commit e83e8da into NixOS:master Aug 29, 2025
28 of 31 checks passed

infinisil deleted the cpe branch August 29, 2025 19:47

github-project-automation bot moved this from In Progress to Done in Stdenv Aug 29, 2025

YorikSar mentioned this pull request Aug 29, 2025

[Backport release-25.05] stdenv: Add CPE fields to meta #438385

Open

1 task

K900 mentioned this pull request Aug 30, 2025

Revert "stdenv: Add CPE fields to meta" #438527

Merged

h0nIg reviewed Aug 31, 2025


YorikSar mentioned this pull request Sep 1, 2025

Reapply "stdenv: Add CPE fields to meta" #439074

Merged

YorikSar commented Sep 1, 2025


mdaniels5757 added the 8.has: port to stable This PR already has a backport to the stable release. label Oct 4, 2025
