RFC: Module Metadata Format #3
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,368 @@ | ||||||
| :fn-1: footnote:[RUOSO, Daniel. Module Metadata Format for Distribution with Pre-Built Libraries, 2024. https://wg21.link/P3286] | ||||||
| :fn-2: footnote:[RUOSO, Daniel. Translating Linker Input Files to Module Metadata Files, 2022. https://wg21.link/P2701R0] | ||||||
| :fn-3: footnote:[The concept of “logical name” is specified in BOECKEL, Ben. KING, Brad. Format for describing dependencies of source files. 2022. https://wg21.link/p1689r5] | ||||||
| :fn-4: footnote:[The JSON Data Interchange Syntax. http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf.] | ||||||
| :fn-5: footnote:[Austin Wright and Henry Andrews. JSON Schema: A Media Type for Describing JSON Documents. https://tools.ietf.org/html/draft-handrews-json-schema-01.] | ||||||
|
|
||||||
|
|
||||||
| [#proposal-template] | ||||||
| = Module Metadata Format | ||||||
| :rfcpr: 3 | ||||||
| :stdpr: 2 | ||||||
| :authors: Daniel Ruoso, Vito Gamberini | ||||||
| :email: [email protected], [email protected] | ||||||
| :copyright: Copyright 2025 | ||||||
| :license: Creative Commons Attribution 4.0 International License (CC BY 4.0) | ||||||
| :nofooter: | ||||||
| :reproducible: | ||||||
| :revdate: {docdate} | ||||||
| :sectanchors: | ||||||
| :sectnumlevels: 10 | ||||||
| :sectnums: | ||||||
| :source-highlighter: rouge | ||||||
| :toc-title: Contents | ||||||
| :toc: | ||||||
| :toclevels: 5 | ||||||
| :version-label!: | ||||||
|
|
||||||
| * *RFC PR*: https://github.com/ecostd/rfcs/pull/{rfcpr}[ecostd/rfcs/{rfcpr}] | ||||||
| // * *Standard PR*: https://github.com/ecostd/standard/pull/{stdpr}[ecostd/standard/{stdpr}] | ||||||
|
|
||||||
| [#abstract] | ||||||
| == Abstract | ||||||
|
|
||||||
| This RFC specifies the format for pre-built libraries to advertise the | ||||||
| metadata about the C++ Modules being provided, with the information required to | ||||||
| perform the translation of the Importable Units into Built Module Interface | ||||||
| files. It is effectively identical to P3286, "Module Metadata Format | ||||||
| for Distribution with Pre-Built Libraries"{fn-1}, as originally presented to SG15, | ||||||
| seeking only to standardize existing practice. | ||||||
|
|
||||||
| [#motivation] | ||||||
| == Motivation | ||||||
|
|
||||||
| Compilers and C++ standard library implementations are, today, shipping module | ||||||
| descriptions based on the format proposed in P3286, "Module Metadata Format for | ||||||
| Distribution with Pre-Built Libraries". The format is rapidly gaining support | ||||||
| in build systems, and thus becoming entrenched as a de facto standard in the | ||||||
| ecosystem. | ||||||
|
|
||||||
| To perform any further innovation in the format we must first standardize this | ||||||
| existing practice. | ||||||
|
|
||||||
| [#scope] | ||||||
| == Scope | ||||||
|
|
||||||
| This RFC does not seek to innovate upon the work done in P3286 and current | ||||||
| implementations, only document and formalize the facts as they exist in build | ||||||
| systems and stdlibs today. To that end it reproduces the format described in | ||||||
| P3286 exactly. | ||||||
|
|
||||||
| This RFC neither adds anything to nor removes anything from the format as | ||||||
| originally proposed, and is strongly opposed to any modifications. Further innovations | ||||||
| should be deferred to future RFCs that update the version and/or revision of | ||||||
| P3286 files appropriately. | ||||||
|
|
||||||
| Version 1/Revision 1 of this format has gone gold; this RFC is ex post facto | ||||||
| bookkeeping. Wording changes may be appropriate so long as they make no | ||||||
| structural or semantic changes to the format. | ||||||
|
|
||||||
| [#design] | ||||||
| == Design | ||||||
|
|
||||||
| === Requirements | ||||||
|
|
||||||
| This requirements summary is reproduced verbatim from P3286: | ||||||
|
|
||||||
| * A build system should have a way to identify which modules are provided by a | ||||||
| pre-built library. | ||||||
|
|
||||||
| * Locating the metadata file: | ||||||
|
|
||||||
| ** For the Standard Library: | ||||||
| *** The build system should be able to query the toolchain (either the | ||||||
| compiler or relevant packaging tools) for the location of that | ||||||
| metadata file. | ||||||
|
|
||||||
| ** Other Libraries: | ||||||
| *** In the absence of stronger package management, in environments where | ||||||
| that is viable, the build system may infer the location of the metadata | ||||||
| based on link-line fragments (P2701R0){fn-2}. | ||||||
| *** If package management is present, that information can be gathered in | ||||||
| implementation-defined ways. | ||||||
|
|
||||||
| ** The path to the metadata file should be related to the input files that | ||||||
| are given to the linker. The expectation is that different builds of the | ||||||
| library may have different metadata files. | ||||||
|
|
||||||
| * The contents of the metadata *must* include: | ||||||
| ** The “logical name”{fn-3} of the importable unit being provided. | ||||||
| ** The path to the primary source file for the importable unit. | ||||||
| ** Any additional include paths required to translate that particular | ||||||
| importable unit. | ||||||
| ** Any compiler definitions required to translate that particular importable | ||||||
| unit. | ||||||
| ** Whether the module is a module provided by the standard library or not, | ||||||
| since those module names are reserved. | ||||||
|
|
||||||
| * The contents of the metadata *may* include: | ||||||
| ** The “logical name” of importable units that are a dependency of that | ||||||
| translation unit. | ||||||
| ** Vendor-specific attributes. | ||||||
|
|
||||||
| === Format | ||||||
|
|
||||||
| The file is encoded in JSON{fn-4}, and the data model is described in this | ||||||
| RFC as a JSON Schema{fn-5}. As with P1689R5, the format also requires that | ||||||
| file paths be valid UTF-8 sequences. | ||||||
|
|
||||||
| ==== Schema | ||||||
|
|
||||||
| For the information provided by the format, the following JSON Schema may be | ||||||
| used. | ||||||
|
|
||||||
| [source,javascript] | ||||||
| ---- | ||||||
| { | ||||||
| "$schema": "", | ||||||
|
Member
Needs to be |
||||||
| "$id": "http://example.com/root.json", | ||||||
|
Will this get replaced with a real URI, perhaps published by this repo? FYI @grafikrobot since you might have thoughts on how schemas are published when required.
Member
I did research this when creating the initial rfcs. What I concluded is that the most portable is to use a |
||||||
| "type": "object", | ||||||
| "title": "Ecostd C++ Module Metadata Format", | ||||||
| "definitions": { | ||||||
| "vendor": { | ||||||
| "$id": "#vendor", | ||||||
| "type": "object", | ||||||
| "description": "vendor-specific information. The key is the name of the vendor and the value is implementation defined.", | ||||||
| "patternProperties": { | ||||||
| "^.+$": { | ||||||
| "type": "object", | ||||||
| "description": "implementation-defined data for the vendor using that identifier" | ||||||
| } | ||||||
| } | ||||||
| }, | ||||||
| "datablock": { | ||||||
| "$id": "#datablock", | ||||||
| "type": "string", | ||||||
| "description": "A filepath", | ||||||
| "minLength": 1 | ||||||
| }, | ||||||
| "preprocessor-define": { | ||||||
| "$id": "#preprocessor-define", | ||||||
| "type": "object", | ||||||
| "description": "a definition to be set in the preprocessor", | ||||||
| "required": [ | ||||||
| "name" | ||||||
| ], | ||||||
| "properties": { | ||||||
| "name": { | ||||||
| "type": "string", | ||||||
| "description": "the name of the token to be defined in the preprocessor" | ||||||
| }, | ||||||
| "value": { | ||||||
| "type": "string", | ||||||
| "description": "the value to be set. If not present it is equivalent to -DFOO in gcc and clang", | ||||||
| "default": null | ||||||
| }, | ||||||
| "undef": { | ||||||
| "type": "boolean", | ||||||
| "default": false, | ||||||
| "description": "If set, instructs the preprocessor to make that value undefined. Equivalent to -UFOO in gcc and clang. Incompatible with using a value at the same time." | ||||||
| }, | ||||||
| "vendor": { | ||||||
| "$ref": "#/definitions/vendor" | ||||||
| } | ||||||
| } | ||||||
| }, | ||||||
| "local-arguments": { | ||||||
| "$id": "#local-arguments", | ||||||
| "type": "object", | ||||||
| "description": "Local arguments to be used when translating the module unit", | ||||||
| "properties": { | ||||||
| "include-directories": { | ||||||
| "type": "array", | ||||||
| "description": "An array of paths that need to be appended to the compilation include search path, same semantics as appending -I in gcc and clang.", | ||||||
| "items": { | ||||||
| "$ref": "#/definitions/datablock" | ||||||
| } | ||||||
| }, | ||||||
| "system-include-directories": { | ||||||
| "type": "array", | ||||||
| "description": "An array of paths that need to be appended to the compilation include path as system locations, same semantics as appending -isystem in gcc and clang.", | ||||||
| "items": { | ||||||
| "$ref": "#/definitions/datablock" | ||||||
| } | ||||||
| }, | ||||||
| "definitions": { | ||||||
| "type": "array", | ||||||
| "description": "An array of definitions for the preprocessor.", | ||||||
| "items": { | ||||||
| "$ref": "#/definitions/preprocessor-define" | ||||||
| } | ||||||
| }, | ||||||
| "vendor": { | ||||||
| "$ref": "#/definitions/vendor" | ||||||
| } | ||||||
| } | ||||||
| }, | ||||||
| "module": { | ||||||
| "$id": "#module", | ||||||
| "type": "object", | ||||||
| "description": "Metadata about a module provided by the library", | ||||||
| "required": [ | ||||||
| "logical-name", | ||||||
| "source-path" | ||||||
| ], | ||||||
| "properties": { | ||||||
| "logical-name": { | ||||||
| "$ref": "#/definitions/datablock" | ||||||
| }, | ||||||
| "is-interface": { | ||||||
| "type": "boolean", | ||||||
| "description": "True if this is an interface unit (primary or interface partition), false if it's an internal partition.", | ||||||
| "default": true | ||||||
| }, | ||||||
| "source-path": { | ||||||
| "$ref": "#/definitions/datablock" | ||||||
| }, | ||||||
| "is-std-library": { | ||||||
| "type": "boolean", | ||||||
| "description": "Whether this module is part of the standard library, and therefore allowed to use the reserved names", | ||||||
| "default": false | ||||||
| }, | ||||||
| "local-arguments": { | ||||||
| "$ref": "#/definitions/local-arguments", | ||||||
| "default": {} | ||||||
| }, | ||||||
| "vendor": { | ||||||
| "$ref": "#/definitions/vendor" | ||||||
| } | ||||||
| } | ||||||
| } | ||||||
| }, | ||||||
| "required": [ | ||||||
| "version" | ||||||
| ], | ||||||
| "properties": { | ||||||
| "version": { | ||||||
| "$id": "#version", | ||||||
| "type": "integer", | ||||||
| "description": "The version of the output specification" | ||||||
| }, | ||||||
| "revision": { | ||||||
| "$id": "#revision", | ||||||
| "type": "integer", | ||||||
| "description": "The revision of the output specification", | ||||||
| "default": 0 | ||||||
| }, | ||||||
| "modules": { | ||||||
| "$id": "#rules", | ||||||
| "type": "array", | ||||||
| "title": "rules", | ||||||
| "default": [], | ||||||
| "items": { | ||||||
| "$ref": "#/definitions/module" | ||||||
| } | ||||||
| } | ||||||
| } | ||||||
| } | ||||||
| ---- | ||||||
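As a rough illustration of what the schema above demands, the following Python sketch hand-checks only the required members: `version` at the top level, plus `logical-name` and `source-path` for each module. It is not a JSON Schema validator. The function name `check_metadata` is hypothetical, and the sketch assumes that path-like values are non-empty strings, which is how the examples in this RFC use them.

```python
import json


def check_metadata(text):
    """Return a list of problems found in a module metadata document."""
    problems = []
    doc = json.loads(text)
    if not isinstance(doc, dict):
        return ["top-level value must be an object"]
    # "version" is the only member required at the top level.
    version = doc.get("version")
    if not isinstance(version, int) or isinstance(version, bool):
        problems.append("missing or non-integer 'version'")
    # "revision" defaults to 0 and "modules" defaults to [].
    revision = doc.get("revision", 0)
    if not isinstance(revision, int) or isinstance(revision, bool):
        problems.append("non-integer 'revision'")
    modules = doc.get("modules", [])
    if not isinstance(modules, list):
        return problems + ["'modules' must be an array"]
    for index, module in enumerate(modules):
        if not isinstance(module, dict):
            problems.append(f"modules[{index}] must be an object")
            continue
        # Each module entry requires these two members.
        for key in ("logical-name", "source-path"):
            value = module.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"modules[{index}] lacks a non-empty '{key}'")
    return problems
```

A build system would more likely hand the schema to a real validator. The point here is only to show how little is strictly required of a conforming file.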
|
|
||||||
| ==== Examples | ||||||
|
|
||||||
| The following example represents what could be used for declaring modules that | ||||||
| are part of the standard library. | ||||||
|
|
||||||
| [source, javascript] | ||||||
| ---- | ||||||
| { | ||||||
| "version": 1, | ||||||
| "revision": 1, | ||||||
|
Comment on lines +278 to +279
Member
Is it possible to use regular semver on this? As that's what ecostd is trying to coalesce to? |
||||||
| "modules": [ | ||||||
| { | ||||||
| "logical-name": "std", | ||||||
| "source-path": "modules/std.cppm", | ||||||
| "is-std-library": true | ||||||
| }, | ||||||
| { | ||||||
| "logical-name": "std.compat", | ||||||
| "source-path": "modules/std.compat.cppm", | ||||||
| "is-std-library": true | ||||||
| }, | ||||||
| { | ||||||
| "logical-name": "std:someinterfacepartition", | ||||||
| "source-path": "modules/std-someinterfacepartition.cppm", | ||||||
| "is-std-library": true | ||||||
| } | ||||||
| ] | ||||||
| } | ||||||
| ---- | ||||||
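One way a consumer might read a file like the example above is sketched here in Python. The defaults for `is-interface`, `is-std-library`, and `local-arguments` are the ones stated in the schema. The helper names `with_defaults`, `load_modules`, and `std_interface_units` are hypothetical and do not come from any existing tool.

```python
import json


def with_defaults(entry):
    """Fill in the schema's documented defaults for one module entry."""
    merged = {
        "is-interface": True,
        "is-std-library": False,
        "local-arguments": {},
    }
    merged.update(entry)
    return merged


def load_modules(text):
    """Parse a metadata document and return its module entries with defaults applied."""
    doc = json.loads(text)
    return [with_defaults(entry) for entry in doc.get("modules", [])]


def std_interface_units(text):
    """Logical names of standard-library interface units, in file order."""
    return [
        module["logical-name"]
        for module in load_modules(text)
        if module["is-std-library"] and module["is-interface"]
    ]
```

Applied to the example above, `std_interface_units` would return all three logical names, since none of the entries sets `is-interface` to false.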
|
|
||||||
| The following example represents modules provided by an arbitrary other library | ||||||
| with additional preprocessor requirements. | ||||||
|
|
||||||
| [source, javascript] | ||||||
| ---- | ||||||
| { | ||||||
| "version": 1, | ||||||
| "revision": 1, | ||||||
| "modules": [ | ||||||
| { | ||||||
| "logical-name": "foo", | ||||||
| "source-path": "modules/foo.cppm", | ||||||
| "local-arguments": { | ||||||
| "definitions": [ | ||||||
| { | ||||||
| "name": "FOO_CONFIG_VALUE", | ||||||
| "value": "42" | ||||||
| } | ||||||
| ] | ||||||
| } | ||||||
| } | ||||||
| ] | ||||||
| } | ||||||
| ---- | ||||||
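The schema describes `local-arguments` in terms of gcc and clang options (`-I`, `-isystem`, `-D`, `-U`). The Python sketch below shows one possible mapping. The function name `local_argument_flags` is hypothetical, the order in which the groups of flags are emitted is an assumption, and paths are passed through without being resolved.

```python
def local_argument_flags(local_arguments):
    """Map a "local-arguments" object to gcc/clang-style command-line flags."""
    flags = []
    for path in local_arguments.get("include-directories", []):
        flags.append("-I" + path)
    for path in local_arguments.get("system-include-directories", []):
        flags += ["-isystem", path]
    for define in local_arguments.get("definitions", []):
        name = define["name"]
        value = define.get("value")
        if define.get("undef", False):
            # The schema calls "undef" incompatible with "value".
            if value is not None:
                raise ValueError(f"{name}: 'undef' cannot be combined with 'value'")
            flags.append("-U" + name)
        elif value is None:
            # No value: equivalent to -DFOO.
            flags.append("-D" + name)
        else:
            flags.append(f"-D{name}={value}")
    return flags
```

For the `foo` module above, this would yield a single flag defining `FOO_CONFIG_VALUE` as 42. A toolchain with a different option syntax would need its own mapping.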
|
|
||||||
| ==== Resolving relative paths | ||||||
|
|
||||||
| The build system will get the path to this file by either asking the toolchain | ||||||
| or an underlying package manager for it. The path provided to this file should | ||||||
| be used as is, without any additional symbolic link resolution. | ||||||
|
|
||||||
| Any file or directory that the metadata file references in relative form is | ||||||
| resolved against the path provided by the toolchain or package manager; that | ||||||
| path is used exactly as given. | ||||||
|
Comment on lines +331 to +333
Member
Not clear on this. Are things relative to the directory of the metadata file? What happens if the given path is itself relative? Is that possible? |
||||||
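The RFC text does not spell out whether "relative to the path provided" means relative to the directory that contains the metadata file. The Python sketch below assumes that reading, which is an assumption and not something the RFC states. It joins paths purely lexically, so no symbolic links are resolved. The function name `resolve_entry_path` is hypothetical, and POSIX-style paths are used for the sake of a deterministic example.

```python
from pathlib import PurePosixPath


def resolve_entry_path(metadata_path, entry_path):
    """Resolve a path from the metadata file against the file's own location."""
    entry = PurePosixPath(entry_path)
    if entry.is_absolute():
        return str(entry)
    # Purely lexical join: the metadata path is used as given, so no
    # symbolic links are resolved and ".." segments are left in place.
    return str(PurePosixPath(metadata_path).parent / entry)
```

If the path to the metadata file is itself relative, this sketch returns a relative result. What the result is then relative to is left to the caller.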
|
|
||||||
| [#prior-art] | ||||||
| == Prior Art | ||||||
|
|
||||||
| This format is already implemented by every major standard library vendor for | ||||||
| description of the `std` and `std.compat` modules. It is used by CMake to | ||||||
| support `import std;` across all implementations. | ||||||
|
|
||||||
| [#considerations] | ||||||
| == Considerations | ||||||
|
|
||||||
| What extension should module metadata use?:: | ||||||
|
|
||||||
| In practice, the extension `.modules.json` has seen widespread use. | ||||||
| Standardizing the exact extension seems excessive. The module metadata file | ||||||
| is typically located via packaging mechanisms and thus the exact names and | ||||||
| file extensions should not be semantically relevant. | ||||||
|
Comment on lines +347 to +350
Is it too early to link to CPS and how it wires up this info?
Author
I'll update with it when it merges CPS-side (cps-org/cps#95). Should be in a day or two if not today. |
||||||
|
|
||||||
| How is module metadata versioned?:: | ||||||
|
|
||||||
| It follows the same versioning semantics described in P1689. | ||||||
|
|
||||||
| What do we call files in this format?:: | ||||||
|
|
||||||
| The information itself is described as "module metadata". In the vein of | ||||||
| "archives", "dynamic libraries", or "translation units", files in the format | ||||||
| described here are usually called "modules json". | ||||||
|
|
||||||
| [#license] | ||||||
| == License | ||||||
|
|
||||||
| This work is licensed under the Creative Commons Attribution 4.0 International | ||||||
| License. To view a copy of this license, visit | ||||||
| http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative | ||||||
| Commons, PO Box 1866, Mountain View, CA 94042, USA. | ||||||
It's a pain, but I'm not aware of a standard document anywhere that actually defines some of these terms. Such as "Build Module Interface". That probably means it goes in the EcoStd and this RFC.
I believe P0947 has initial definitions for a lot of terms. P1687 is mainly published minutes, but it discusses a bikeshedding poll to select BMI as a term.
I also found some definitions in the clang docs on C++ modules:
https://clang.llvm.org/docs/StandardCPlusPlusModules.html#built-module-interface
I'm assuming @Bigcheese reviewed those docs at some point.
I wrote this while watching game 1, and that's what I'm going to use as my excuse for all the mistakes. I think the more correct term for what this RFC describes is "module interface units", but the existing wording from P3286 uses "importable unit" so I did as well.
AFAIK "Built Module Interface" comes from P2581, which is where the meaning on the "B" was originally legislated.
I'm down to RFC a terms glossary separately. I don't think it's a blocker for this.
Yes, we will need to add definitions to EcoStd for these. We can probably steal the ones from wg21. But we could adjust the definitions to our current understanding.