Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic import maps #10528

Open
wants to merge 50 commits into
base: main
Choose a base branch
from
Open

Conversation

yoavweiss
Copy link
Contributor

@yoavweiss yoavweiss commented Jul 30, 2024

Introduction

Import maps in their current form provide multiple benefits to web developers. They enable them to avoid cache invalidation cascades, and to be able to work with more ergonomic bare module identifiers, mapping them to URLs in a convenient way, without worrying about versions when importing.

At the same time, the current import map setup suffers from fragility. Only a single import map can be loaded per document, and it can only be loaded before any module script is loaded. Once a single module script is loaded, import maps are disallowed.
That creates a situation where developers have to think twice (or more) before using module scripts in situations that may introduce import maps further down in the document. It also means that using import maps can carry a risk unless you’re certain you can control all the module scripts loaded on the page.

Beyond that, the fact that import maps have to be loaded before any module means that the map itself acts as a blocking resource to any module functionality. Large SPAs that want to use modules, have to download the map of all potential modules they may need during the app’s lifetime ahead of time.

So, it seems like there’s room for improvement. Enabling more dynamic import maps would allow developers to avoid these issues and fully benefit from import maps’ caching and ergonomic advantages without incurring a cost when it comes to stability or performance.

At the same time, the current static design gives us determinism and isn’t racy. A module identifier that resolves to a certain module will continue to do so throughout the lifetime of the document. It would be good to keep that characteristic.

Objectives

Goals

  • Increase robustness when using ES modules and import maps
  • Enable expanding the Window’s import map throughout its lifetime
  • Satisfy the EcmaScript HostLoadImportedModule requirement that multiple calls will always resolve to the same module
  • Minimize race-conditions which can result in different module resolutions on different loading sequences of the same page.

Non-Goals

  • Provide a programmatic way to expand or modify the Window’s import map - out-of-scope for the current effort
  • Completely avoid race conditions that can result in different module resolutions based on network conditions
    • Such races are already possible today (e.g. if an import map is dynamically injected by a classic script which may or may not run before a module is loaded)
    • Specifically for dynamic modules, requiring this would conflict with the “expanding the import map over time” goal.

Use Cases

Third party scripts

When third party scripts integrate themselves to web pages today, they cannot do that as ES modules without taking on some risk. That risk varies somewhat, depending on their form of integration.

Injected without developer supervision

That could include third party scripts injected by the CDN, by a CMS or some other automated system that isn’t content-aware.

For such scripts to be loaded as ES modules, they have to make sure that they are not loaded before any import maps in the content.

They can do that by:

  • Loading at the bottom of the page, which may or may not correspond with the point in which they typically need to load in order to function optimally.
  • Buffering the content and validating that no import maps are present, which can incur performance penalties.

Developer-injected snippets

For snippets-based 3Ps, they need to provide instructions so that the developer is aware of import maps in their page and only injects the snippet after it. That may or may not be a realistic thing to ask. It’d definitely increase the integration’s complexity, resulting in a higher percentage of failures or support calls.

Content Management Systems

Content management systems often have markup and code arriving from multiple different sources. Site owners, theme developers and application/extension/plug-in developers all take part in generating the final markup of the page delivered to the user, which often contains lots of scripts. Some of that code can be static, while other parts can vary per user.

If any of that code contains an import map, extreme caution needs to be taken when integrating all these different script entry points, if any of them is an ES module.

Browser Extensions

A similar problem exists with browser extensions, where if extension-injected code wants to use ES modules or import maps, it needs to verify ahead of time that it doesn’t collide with the content itself and where the code is added relative to the rest of the page.

Large Single-Page Apps

Serving hundreds to thousands of different modules is a reality for large SPAs. While bundling is used to speed up the loading-performance cost of modules, in later stages of the application lifetime, it doesn’t always make sense to bundle - while it can reduce the weight of modules over the network (by improving compression ratios), it can also cause over-fetching and less-granular caching which can result in frequent invalidations.

So apps end up with several thousands of modules that may load during the lifetime of the app, using dynamic import.
Using import maps can significantly help such apps avoid cache invalidation cascades, but it also presents a challenge.
An import map for such a site needs to include all the thousands of different modules it may import, and it needs to do that before any module loads. As such, the quite-large import map would be blocking any module-based functionality. That’s a significant performance tradeoff.

Usage examples

There are two cases when rules of the new import map don't get merged into the existing one.

The new import map rule has the exact same scope and specifier as a rule in the existing import map. We'll call that "conflicting rule".

The new import map rule may impact the resolution of an already resolved module. We'll call that "impacted already resolved module".

Two import maps with no conflicts

When the new import map has no conflicting rules, and there are no impacted resolved modules, the resulting map would be a combination of the new and existing maps. Rules that would have individually impacted similar modules (e.g. "/app/" and "/app/helper") but are not an exact match are not conflicting, and all make it to the merged map.

So, the following existing and new import maps:

{
   "imports": {
    "/app/": "./original-app/",
  }
}
{
  "imports": {
    "/app/helper": "./helper/index.mjs"
  },
  "scopes": {
    "/js": {
      "/app/": "./js-app/"
    }
  }
}

Would be equivalent to the following single import map:

{
  "imports": {
    "/app/": "./original-app/",
    "/app/helper": "./helper/index.mjs"
  },
  "scopes": {
    "/js": {
      "/app/": "./js-app/"
    }
  }
}

New import map defining an already-resolved specifier

When the new import map impacts an already resolved module, that rule gets dropped from the import map.

So, if the top-level resolved module set already contains the pair (null, "/app/helper"), the following new import map:

{
   "imports": {
    "/app/helper": "./helper/index.mjs",
    "lodash": "/node_modules/lodash-es/lodash.js"
  }
}

Would be equivalent to the following one:

{
  "imports": {
    "lodash": "/node_modules/lodash-es/lodash.js"
  }
}

New import map defining an already-resolved specifier in a specific scope

The same is true for rules defined in specific scopes. If the resolved module set contains the pair ("/app/main.mjs", "/app/helper"), the following new import map:

{
  "scopes": {
    "/app/": {
      "/app/helper": "./helper/index.mjs"
    },
  }
   "imports": {
    "lodash": "/node_modules/lodash-es/lodash.js"
  }
}

Would similarly be equivalent to:

{
  "imports": {
    "lodash": "/node_modules/lodash-es/lodash.js"
  }
}

The script in the pair is the script object itself, rather than its URL, so these examples are somewhat simplistic in that regard.

Already-resolved specifier and multiple rules redefining it

We could also have cases where a single already-resolved module specifier has multiple rules for its resolution, depending on the referring script. In such cases, only the relevant rules would not be added to the map.

For example, if the rop-level resolved module set contains the pair ("/app/main.mjs", "/app/helper"), the following new import map:

{
  "scopes": {
    "/app/": {
      "/app/helper": "./helper/index.mjs"
    },
    "/app/vendor/": {
      "/app/": "./vendor_helper/"
    },
    "/vendor/": {
      "/app/helper": "./helper/vendor_index.mjs"
    }
  },
   "imports": {
    "lodash": "/node_modules/lodash-es/lodash.js"
    "/app/": "./general_app_path/"
    "/app/helper": "./other_path/helper/index.mjs"
  }
}

Would be equivalent to:

{
  "scopes": {
    "/vendor/": {
      "/app/helper": "./helper/vendor_index.mjs"
    }
  },
  "imports": {
    "lodash": "/node_modules/lodash-es/lodash.js"
  }
}

This is achieved by the fact that the merge algorithm uses a copy of the resolved module set and removes already referring script specifier pairs from it if they already resulted in a rule being ignored.

Two import maps with conflicting rules

When the new import map has conflicting rules to the existing import map, with no impacted already resolved modules, the existing import map rules persist.

For example, the following existing and new import maps:

{
   "imports": {
    "/app/helper": "./helper/index.mjs",
    "lodash": "/node_modules/lodash-es/lodash.js"
  }
}
{
  "imports": {
    "/app/helper": "./main/helper/index.mjs"
  }
}

Would be equivalent to the following single import map:

{
  "imports": {
    "/app/helper": "./helper/index.mjs",
    "lodash": "/node_modules/lodash-es/lodash.js",
  }
}

High-level design

At a high-level, we want a module resolution cache that will ensure that a resolved module identifier always resolves to the same module. That is implemented using the "resolved module set", which ensures that URLs for modules that were already resolved cannot be added to future import maps.

We also want top-level imports that start loading a module tree won’t have that tree change “under their feet” due to an import map that was loaded in parallel. That is achieved by providing a copy of the import maps to the module resolution algorithm of these top-level modules and propagating it recursively down its module tree.

And finally, we want a way to create a single, coherent import map from multiple import map scripts loaded on the document. That is done with the "merge new and existing import maps" algorithm.

(See WHATWG Working Mode: Changes for more details.)


/infrastructure.html ( diff )
/scripting.html ( diff )
/webappapis.html ( diff )

@yoavweiss yoavweiss marked this pull request as draft July 30, 2024 07:55
@yoavweiss yoavweiss marked this pull request as ready for review July 31, 2024 10:57
source Outdated Show resolved Hide resolved
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up doing a relatively detailed review anyway, although for some repeated editorial issues I stopped commenting.

I have two major questions:

  • The merge algorithm needs examples, and maybe explanatory text. I can follow most of the steps (modulo some bugs), but I can't figure out the intent. The examples can either use the JSON syntax, or the normalized syntax seen below https://html.spec.whatwg.org/#parse-an-import-map-string if that is helpful in giving extra clarity. The impact of the resolved set is particularly unclear.

  • I don't understand why the import map is being passed around so much. There's still always one import map per global, and it's easy to get to that global from any algorithm or from the "script" struct. At least one instance of this seems completely redundant, which I commented on. But e.g. why are you storing the import map in [[HostDefined]]? I realize there's probably some complexity here at the particular point in time when you're merging import maps and thus the global's import map changes, but that should be able to happen completely discretely between script parsing and execution, so I don't see why scripts should need to track individual import maps separate from the global one.

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@yoavweiss yoavweiss changed the title Dynamic module imports Dynamic import maps Aug 1, 2024
Copy link
Contributor Author

@yoavweiss yoavweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed some comments and addressing the rest soon!

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@yoavweiss
Copy link
Contributor Author

  • I don't understand why the import map is being passed around so much. There's still always one import map per global, and it's easy to get to that global from any algorithm or from the "script" struct. At least one instance of this seems completely redundant, which I commented on. But e.g. why are you storing the import map in [[HostDefined]]? I realize there's probably some complexity here at the particular point in time when you're merging import maps and thus the global's import map changes, but that should be able to happen completely discretely between script parsing and execution, so I don't see why scripts should need to track individual import maps separate from the global one.

My thinking was that we need to do that in order to guarantee that once we're parsing a module tree, all modules in the tree would be resolved by the same import map. E.g. I thought it is possible that a setTimeout would inject a new import map while the a module script is being downloaded and parsed, and that new import map would start taking effect after some modules were resolved but before others.
If that's not possible for some reason, I'm happy to revert these changes.

@yoavweiss yoavweiss requested a review from domenic August 1, 2024 18:59
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very helpful, thanks!

My thinking was that we need to do that in order to guarantee that once we're parsing a module tree, all modules in the tree would be resolved by the same import map. E.g. I thought it is possible that a setTimeout would inject a new import map while the a module script is being downloaded and parsed, and that new import map would start taking effect after some modules were resolved but before others.
If that's not possible for some reason, I'm happy to revert these changes.

That makes perfect sense.

Given this, we should explain this in the spec, maybe around #concept-window-import-map. With a note that in general only the root of a loading operation will access concept-window-import-map, and otherwise it'll be threaded through.

With that frame, auditing all the call sites of concept-window-import-map...

  • "resolve a module integrity metadata" seems suspicious. It should probably get an import map threaded to it?
  • "fetch the descendants of and link" seems suspicious. Shouldn't it be getting threaded an import map from its various callers? (per the diagram above it.)
  • "register an import map" has a broken assert

I'm also a bit unsure now about the cases where an import map is not passed in. When is that possible? (Except workers.) We have fallbacks to the Window's import map in those cases, but I'm now questioning whether they're sound.

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
Copy link
Contributor Author

@yoavweiss yoavweiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! editorial fixes first :)

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@yoavweiss
Copy link
Contributor Author

Given this, we should explain this in the spec, maybe around #concept-window-import-map. With a note that in general only the root of a loading operation will access concept-window-import-map, and otherwise it'll be threaded through.

Added

  • "resolve a module integrity metadata" seems suspicious. It should probably get an import map threaded to it?

This was indeed lacking. Should be fixed now.

  • "fetch the descendants of and link" seems suspicious. Shouldn't it be getting threaded an import map from its various callers? (per the diagram above it.)

Here I think the current state is fine, as this is being called from all the root module entry points. Therefore we don't need to thread the import map into "fetch the descendants of and link", we need it to do the threading to its descendants, which it does by setting the map on the Record.

  • "register an import map" has a broken assert

Indeed!!

@yoavweiss
Copy link
Contributor Author

I'm also a bit unsure now about the cases where an import map is not passed in. When is that possible? (Except workers.) We have fallbacks to the Window's import map in those cases, but I'm now questioning whether they're sound.

Let me try to enumerate the cases:

I think that covers all of them but let me know if I missed something.

@yoavweiss yoavweiss requested a review from domenic August 2, 2024 14:50
Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I think the current state is fine, as this is being called from all the root module entry points. Therefore we don't need to thread the import map into "fetch the descendants of and link", we need it to do the threading to its descendants, which it does by setting the map on the Record.

I think I see. Because the callers all operate on URLs or inline scripts, so they didn't need to do any resolution, and so didn't need an import map. It's only for the descendants that you start doing resolution and thus start needing an import map.

This does have the slightly-strange impact that given something like

<script type=module src=my-script.mjs></script>
<script type=importmap>
...
</script>

the modifications that appear after the <script type=module> will apply to the imports of my-script.mjs, because we delay snapshotting the import map until the response from the server comes back. That seems a bit unfortunate; WDYT?

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@yoavweiss yoavweiss requested a review from domenic August 5, 2024 14:59
@yoavweiss
Copy link
Contributor Author

I think I see. Because the callers all operate on URLs or inline scripts, so they didn't need to do any resolution, and so didn't need an import map. It's only for the descendants that you start doing resolution and thus start needing an import map.

This does have the slightly-strange impact that given something like

<script type=module src=my-script.mjs></script>
<script type=importmap>
...
</script>

the modifications that appear after the <script type=module> will apply to the imports of my-script.mjs, because we delay snapshotting the import map until the response from the server comes back. That seems a bit unfortunate; WDYT?

Forgot to address this part.. I agree that this would be weird, and hence it'd be better to pipe in the import map in those cases. I'll do that.

@yoavweiss
Copy link
Contributor Author

Forgot to address this part.. I agree that this would be weird, and hence it'd be better to pipe in the import map in those cases. I'll do that.

Done!

Copy link
Member

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an idea for something that might clean things up. Basically, I find the importMap being optional confusing. It's hard to know whether an algorithm is not getting an import map because we forgot, or because we're coming from a worker. And the fact that sometimes we fall back to the Window's import map, even though we're not in an obviously "top level" algorithm, is extra confusing. (For example, in "create a JavaScript module script.)

I think the following would clean that up:

  • Move "Window's import map" (and Window's resolved module set?) to all global objects. Put them next to https://html.spec.whatwg.org/#in-error-reporting-mode . Add a note explaining that only Window objects have their import maps modified away from the initial empty import map, for now.
  • Make all import map arguments mandatory.
  • Always grab the global object's import map when appropriate. This should now be obviously only at top-level situations.

WDYT?

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@yoavweiss
Copy link
Contributor Author

I have an idea for something that might clean things up. Basically, I find the importMap being optional confusing. It's hard to know whether an algorithm is not getting an import map because we forgot, or because we're coming from a worker. And the fact that sometimes we fall back to the Window's import map, even though we're not in an obviously "top level" algorithm, is extra confusing. (For example, in "create a JavaScript module script.)

I think the following would clean that up:

  • Move "Window's import map" (and Window's resolved module set?) to all global objects. Put them next to https://html.spec.whatwg.org/#in-error-reporting-mode . Add a note explaining that only Window objects have their import maps modified away from the initial empty import map, for now.
  • Make all import map arguments mandatory.
  • Always grab the global object's import map when appropriate. This should now be obviously only at top-level situations.

WDYT?

SG. done!

@yoavweiss
Copy link
Contributor Author

In case it's useful for the review process - I mapped the high-level relevant spec changes to Chromium's code.

@Jamesernator
Copy link

Jamesernator commented Aug 10, 2024

At the same time, the current static design gives us determinism and isn’t racy. A module identifier that resolves to a certain module will continue to do so throughout the lifetime of the document. It would be good to keep that characteristic.

Has an alternative design been considered that doesn't mutate a shared global map?

For example perhaps associating maps to individual script tags?:

<!-- importmap="..." would only affect the loading of this graph -->
<script type="module" src="./entry.js" importmap="./entry.importmap.json"></script>

This alternative design could even allow for explaining importmaps in terms of import attributes, i.e. if you want to load a third-party module with a third-party importmap you could do:

import thirdParty from "./dist/third-party.js" with { importmap: "./dist/third-party.importmap.json" }

@jeff-hykin
Copy link

jeff-hykin commented Aug 10, 2024

I almost gave a nearly-equivlent comment yesterday but was afraid I was misunderstanding something. So I'm glad you spoke up @Jamesernator.

My draft example was literally:

await import("./dist/third-party.js", { withMap: "./dist/third-party.importmap.json"})
import thirdParty from "./dist/third-party.js" withMap "./dist/third-party.importmap.json"

Concerns with mutating a shared global map

  1. I don't see any discussion about risks, possible adverse side effects, or security.

Consider a dynamic const bcrypt = await import("/path/to/bcrypt.js") written by a site author. Right now they have full confidence that the import will either load what they expect or throw an error. I'm not convinced that letting browser extensions break that assumption won't lead to new attack vectors. While I believe browser extensions can already affect the top level import map, its still not clear to me that dynamic changes are not more risky.

  1. I don't see discussion about adverse side effects.

As I understand it, the proposal has side effects due to this being a global map. E.g. loading a brower extension can change the import behavior of non-browser-extension imports. Extensions often do not want side effects: case-in-point the motivating usecase at the top (this one) does not want side effects, it just wants to map-imports for extension1, not for extension2.

If extension1 global-mutates import-map for A
If extension2 global-mutates import-map for A

IMO extension-loading order creates a unnecessary race condition where extension1 overrides extnesion2's import map, along with extension2's developer having no good way to debug when-and-why their extension broke. (And even once they do figure out why there is basically no solution since telling the user to change the load order is impractical, they have to go back to square1 of bundling everything to be reliable)

Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing to see this, I only had time to do a very very brief review but I really like the overall approach. Specifically, I have some concerns about algorithmic complexity, but haven't had a chance to look more carefully at the algorithm to determine if they are "resolvable" yet.

Then for the copying approach - if we have a well-defined deduping that won't change the nature of mappings, what is the reason for wanting to lock down resolutions during individual load operations? Will this locked map also affect dynamic import or is it only cloned through the static graph? And if so, it might seem strange having different resolution rules for dynamic import and static import, especially when ECMA-262 also maintains its own cache to ensure these are consistent for known imports.

source Outdated Show resolved Hide resolved
source Outdated Show resolved Hide resolved
@bakkot
Copy link
Contributor

bakkot commented Aug 11, 2024

I probably won't have time to do a thorough review of this, but @michaelficarra and I spent quite a while thinking about (some parts of) this problem back when import maps were first being discussed, so I want to call @yoavweiss's attention to this issue which @michaelficarra wrote at the time.

I don't see anything in the OP about the case where the second import map maps to something which is already mapped by the first. (Maybe I missed it?) For example:

First:

{
  "imports": {
    "/app/helper": "./helper/index.mjs"
  }
}

Second:

{
  "imports": {
    "helper": "/app/helper"
  }
}

What's the intended behavior in this case? My inclination is to say that this is equivalent to

{
  "imports": {
    "/app/helper": "./helper/index.mjs",
    "helper": "./helper/index.mjs",
  }
}

i.e., imports on the RHS of later maps are resolved in the context of earlier maps. This is to preserve the property that if you have a script with import "/foo", you can replace this with a script with import "bar" plus an import map of "bar": "/foo". Plus, one of the use cases for import maps is to rewrite the location of some dependency, and that only work if someone can't re-introduce the original location of the dependency by adding their own import map later.

This is consistent with the way the web usually works: if one script does globalThis.fetch = decorate(globalThis.fetch), and a later script does globalThis.fetch = decorate2(globalThis.fetch), the second script will wrap the fetch from the first script, not the original fetch.

@hiroshige-g
Copy link
Contributor

hiroshige-g commented Sep 11, 2024

#10528 (comment)

The behavior without copying the import maps per-module-tree:

If root1.js is loaded before the import map update, then the only rule of the updated import map "./app.js": "./app2.js" is rejected, so all specifiers are resolved using an empty import map (the same as above).

Otherwise,

  • ./app.js in root1.js is resolved to app2.js using the updated import map.
  • ./app.js in root2.js is resolved to app2.js using the updated import map.
  • ./app.js in sub.js is resolved either to app2.js.

So more like ./app.js specifiers resolution results are all app.js or all app2.js (not the mixture of the two), depending on loading timing.

In summary,

./app.js in root1.js, root2.js, sub.js are resolved to:

  • With per-module-tree copying: either of:
    • app.js, app.js, app.js
    • app.js, app2.js, app.js
    • app.js, app2.js, app2.js
  • Without per-module-tree copying: either of:
    • app.js, app.js, app.js
    • app2.js, app2.js, app2.js

@yoavweiss
Copy link
Contributor Author

log n part: just because I basically tend to use O(log n) (tree-based data structure such as std::map) for map/set operations to limit the upper bound/worst-case estimate, instead of O(1), but I don't feel this is a big difference (and the current PR doesn't assume any underlying implementation of map/set).

n part: I just multiply by n to get the total time for all n resolutions. Still it's O(bs log n) or O(bs) per resolution, depending on per-map/set operation complexity.

b + s or bs part: The current PR itself is O(b + s) per resolution, while I was assuming https://github.com/whatwg/html/pull/10528/files#r1755220941 that leads O(bs) per resolution.

Fair. Thanks for clarifying!

@yoavweiss
Copy link
Contributor Author

The behavior without copying the import maps per-module-tree:

If root1.js is loaded before the import map update, then the only rule of the updated import map "./app.js": "./app2.js" is rejected, so all specifiers are resolved using an empty import map (the same as above).

Otherwise,

  • ./app.js in root1.js is resolved to app2.js using the updated import map.
  • ./app.js in root2.js is resolved to app2.js using the updated import map.
  • ./app.js in sub.js is resolved either to app2.js.

So more like ./app.js specifiers resolution results are all app.js or all app2.js (not the mixture of the two), depending on loading timing.

In summary,

./app.js in root1.js, root2.js, sub.js are resolved to:

  • With per-module-tree copying: either of:

    • app.js, app.js, app.js
    • app.js, app2.js, app.js
    • app.js, app2.js, app2.js
  • Without per-module-tree copying: either of:

    • app.js, app.js, app.js
    • app2.js, app2.js, app2.js

I'm now convinced that this is better. I'll remove the piping of import maps and return to using a global one everywhere.

result">creating an import map parse result</span> given <var>source text</var> and
<var>base URL</var>.</p></li>
result">creating an import map parse result</span> given <var>source text</var>,
<var>base URL</var>, and <var>settings object</var>.</p></li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can revert this change (creating an import map parse result doesn't take settings object as an argument)

<p>If <var>el</var> does not have an <code data-x="attr-script-integrity">integrity</code>
attribute, then set <var>options</var>'s <span
data-x="concept-script-fetch-options-integrity">integrity metadata</span> to the result of
<span>resolving a module integrity metadata</span> with <var>url</var> and
<var>settings object</var>.</p>
<span>resolving a module integrity metadata</span> with <var>url</var> and <var>settings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

<p>To <dfn data-x="resolving a module integrity metadata">resolve a module
integrity metadata</dfn>, given a <span>URL</span> <var>url</var> and an <span>environment
settings object</span> <var>settingsObject</var>:</p>
<p>To <dfn data-x="resolving a module integrity metadata">resolve a module integrity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

<li>
<p>If <var>result</var> is not null, optionally <span data-x="fetch the descendants of and
link a module script">fetch the descendants of and link</span> <var>result</var> given
<var>settingsObject</var>, <var>destination</var>, and an empty algorithm.</p>
<var>settingsObject</var>, <var>destination</var>, and an empty algorith.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

@@ -106952,7 +106978,8 @@ document.querySelector("button").addEventListener("click", bound);
accepting null (on failure) or a <span>module script</span> (on success).</p>

<ol>
<li><p><span>Disallow further import maps</span> given <var>settingsObject</var>.</p></li>
<li><p><span>Add module to resolved module set</span> given <var>settingsObject</var>,
Copy link
Contributor

@hiroshige-g hiroshige-g Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether we should call this, as url doesn't go through the module resolution (i.e. this is unaffected regardless of import maps).

(Probably Add module to resolved module set should be better called from resolve-a-module-specifier, to make sure Add module to resolved module set is called if and only if a module specifier resolution happens. (I might revisit this after reading the rest of the PR) Already called, nevermind)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if we remove these Add module to resolved module set callers, the second argument of Add module to resolved module set can be now string rather than string-or-null.

@@ -106978,7 +107005,8 @@ document.querySelector("button").addEventListener("click", bound);
script</span> (on success).</p>

<ol>
<li><p><span>Disallow further import maps</span> given <var>settingsObject</var>.</p></li>
<li><p><span>Add module to resolved module set</span> given <var>settingsObject</var>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above; url for modulepreload also doesn't go through import maps.

of and link</span> <var>result</var> given <var>fetchClient</var>,
<var>destination</var>, and <var>onComplete</var>. If <var>performFetch</var> was given, pass
it along as well.</p></li>
of and link</span> <var>result</var> given <var>fetchClient</var>, <var>destination</var>, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

data-x="fetching-scripts-perform-fetch">perform the fetch hook</span> <var>performFetch</var>, run
these steps. <var>onComplete</var> must be an algorithm accepting null (on failure) or a
<span>module script</span> (on success).</p>
data-x="concept-request-destination">destination</span> <var>destination</var>, an algorithm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

data-x="fetching-scripts-perform-fetch">perform the fetch hook</span> <var>performFetch</var>, run
these steps. <var>onComplete</var> must be an algorithm accepting null (on failure) or a
<span>module script</span> (on success).</p>
<var>referrer</var>, a <span>ModuleRequest Record</span> <var>moduleRequest</var>, an algorithm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

<li><p><span>Assert</span>: <var>moduleRequest</var>.[[Attributes]] does not contain any <span>Record</span>
<var>entry</var> such that <var>entry</var>.[[Key]] is not "<code data-x="">type</code>", because
we only asked for "<code data-x="">type</code>" attributes in
<li><p><span>Assert</span>: <var>moduleRequest</var>.[[Attributes]] does not contain any
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

<var>settingsObject</var>:</p>
<p>To <dfn>add module to resolved module set</dfn> given an <span>environment settings
object</span> <var>settingsObject</var>, a <span>string</span>-or-null
<var>referringScriptURL</var>, and a <span>string</span> <var>specifier</var>:</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: how about renaming referringScriptURL to baseURLString?
(To align with the callsite == resolve-a-module-specifier, that can pass the Document base URL as the argument)

<p>A <dfn>referring script specifier pair</dfn> is a <span>struct</span>. It has the following
<span data-x="struct item">items</span>:</p>
<dl>
<dt>A <dfn data-x="pair-referring-script">referring script</dfn></dt>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we rename referring script specifier pair, referring script and its description to baseURL-based?(rather than referring script)? (See https://github.com/whatwg/html/pull/10528/files#r1767386685)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit optional: I prefer renaming referring script specifier pair to e.g. specifier resolution record (i.e. indicating the higher-level view that a struct here represents a module specifier resolution, rather than underlying members/types).

@@ -109832,7 +110167,7 @@ import "https://example.com/foo/../module2.mjs";</code></pre>
<li><p>Set <var>specifier</var> to ? <span>ToString</span>(<var>specifier</var>).</p></li>

<li><p>Let <var>url</var> be the result of <span data-x="resolve a module specifier">resolving
a module specifier</span> given <var>moduleScript</var> and <var>specifier</var>.</p></li>
a module specifier</span> given <var>moduleScript</var>, and <var>specifier</var>.</p></li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

module specifier</span> given <var>referencingScript</var> and
<var>moduleRequest</var>.[[Specifier]], catching any exceptions. If they throw an exception, let
<var>resolutionError</var> be the thrown exception.</p></li>
module specifier</span> given <var>referencingScript</var>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change is formatting only and can be reverted to make the PR smaller.

<span>resolved module set</span>:</p>

<ol>
<li><p>If <var>pair</var>'s <span
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should mirror the Step 9.1 of https://html.spec.whatwg.org/multipage/webappapis.html#resolve-a-module-specifier, e.g.

  • If scopePrefix is pair's referring script, or if scopePrefix ends with U+002F (/) and scopePrefix is a code unit prefix of pair's referring script, then:
    • For each specifier -> url of scopedImports, ...

The point (affecting behavior) is we should add check for trailing slash, to avoid matching e.g. scopePrefix = https://example.com/foo and pair's referring script https://example.com/foobar.

(nit optional: I also prefer mirroring the step without negating here to make it clearer they are the same condition, but more like personal preference)


<ol>
<li>
<p>If <var>pair</var>'s <span data-x="pair-specifier">specifier</span> <span>starts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, this also mirror Step 1.1 and Step 1.2 of https://html.spec.whatwg.org/multipage/webappapis.html#resolving-an-imports-match, e.g.

  • For each specifierKey → resolutionResult: // I feel using the same name as #resolving-an-imports-match is better, also to avoid confusion between pair's specifier and specifier.
    • If specifierKey is pair's specifier, or
    • If all of the following are true:
      • specifierKey ends with U+002F (/);
      • specifierKey is a code unit prefix of pair's specifier; and
      • either asURL is null, or asURL is special,

One key point here is checking the "specifierKey ends with U+002F (/)" condition, which affects the behavior.

However, I'm less sure about the third asURL condition, because

  • it complicates the spec and impl: probably we should record the boolean in the resolved module set indicating the "asURL is null, or asURL is special" bit.
  • The behavior win here is more subtle: i.e. this condition is used to disallow prefix matching in non-special-URL-specifiers, so the behavior affected by not adding this condition here would be allowing non-special-scheme:foo/ in the new map to be added even if there is existing module resolution non-special-scheme:foo/bar.

So adding the asURL condition is more like for a little more completeness and consistency.

So it's fine with me either to drop or include the asURL condition here.
WDYT?

data-x="concept-import-map-scopes">scopes</span>.</p>

<p class="note">We're mutating these copies and removing items from them when they are used to
ignore scope-specific rules. That enables keeping less-specific rules that would match the same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That enables keeping less-specific rules that would match the same specifier.

This is now obsolete.

</li>
</ol>

<p class="note">Implementers are encouraged to implement a more efficient matching
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be a little more specific about the intention, or not?

During the review of the spec PR (as well as the impl CL), #10528 (comment) (i.e. we should avoid O(n) time for n module specifier resolutions as n can be ~1000) provided the clear guidance on how much efficient the implementation should be, so I feel it would be also helpful in the spec here.

Or is it also the implementors' discretion as to whether we should aim less than O(n) or not? Maybe, so I'm less sure about this. WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

9 participants