Skip to content

Conversation

nicolo-ribaudo
Copy link
Member

The goal of this PR is to make sure that the two import "./foo.js" declarations in this file import the same module and evaluate it only once (point 2 of #68):

import "./foo.js";
import(module {
  import "./foo.js";
});

Note that we already have this guarantee for

import "./foo.js";
import "./foo.js";

This is done by passing down the top-level Module/Script Record to the inner module blocks, so that the per-Realm cache key becomes (top-level Module/Script Record of the referencing Module/Script Record, specifier) instead of just (referencing Module/Script Record, specifier).

There are some problems with this approach because it makes module blocks non-serializable (because they now transitively contain a reference to a whole Environment Record): I plan to allow hosts to provide their own representation of the top-level Module/Script Record, and in practice HTML will use the file URL. The cache key would thus become (referrer URL, specifier). (if you read the diff, I left a TODO for this)

Hosts can now use this top-level Module/Script Record representation stored on the Module Record to build import.meta.url, so we don't need to attach special [[HostDefined]] data on ModuleBlock objects and I removed the HostInitializeModuleBlock host hook. In practice, HTML will do something like import.meta.url = moduleRecord.[[TopLevelScriptOrModule]].[[URL]].
We might still need to allow hosts to inject some [[HostDefined]] data in the Module Record created when importing module blocks, but that would be a new host hook called by ImportModuleBlockDynamically and not when a ModuleBlock is defined. (if you read the diff, I left a TODO for this)

<pre><code class="html">&lt;button type="button" onclick="import(module {})"&gt;Click me&lt;/button&gt;</code></pre>

<p><ins>there will be no active script or module at the time the <emu-xref href="#sec-import-calls">`module {}`</emu-xref> expression runs.</ins></p>
</emu-note>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one way to achieve it, but there is another way, and that is to simply impose the invariant in a top down way.

Basically spec language doesn't need to define the data constructs like you would in a programming model, spec language defines invariants.

We can define the invariant quite fine without having this field by just imposing it as an invariant that the resolultion should be the same for all blocks and top-level modules.

This then relaxes the requirements on hosts in maintaining that linkage so that eg the top-level module could be GC'd fine in future while the module block sticks around.

Basically simply a text invariant might suffice for HostResolveImportedModule as described in https://github.com/tc39/proposal-js-module-blocks/pull/68/files#r912411086.

Copy link
Member Author

@nicolo-ribaudo nicolo-ribaudo Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This then relaxes the requirements on hosts in maintaining that linkage

Since resolution/evaluation of module blocks is done within ECMA-262 without going through host hooks, I don't think that hosts alone can maintain that linkage. But you are right, we can avoid storing the whole Module Record to make GC easier (technically GC is already allowed as long as you replace them with something that "maintains" identity since we never check the contents of the [[TopLevelModuleOrScript]] slot, but yes it's hard with this PR).

I also think that we should try to use as less "host invariants" as possible here: in a compartments/loaders/reified-modules world, these host hooks will be exposed to user code and it might be hard/impossible to force them to respect the invariants without specifying how we enforce them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should specify that, just like Proxy does - failing to satisfy invariants throws an exception.

Copy link
Member Author

@nicolo-ribaudo nicolo-ribaudo Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case checking if an invariant is satisfied is as much work for an engine as doing the bookkeeping work itself.

For example, in this case:

// ./main.js

import * as x from "./foo.js";

await import(module {
  import * as y from "./foo.js";
});

If you have a loader with importHook(importerURL: string, specifier: string), we can specify two alternatives:

  • We call importHook("./main.js", "./foo.js") only the first time, and use its result for both imports
  • We call importHook("./main.js", "./foo.js") twice, and throw if it has two different results

In both cases the engine needs to keep a reference to importHook("./main.js", "./foo.js")'s first result, so the second option adds burden on the JS dev (since they need to manually cache importHook results) without removing any complexity from the engine.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current situation is that we have an invariant that is defined but cannot be verified by the spec. It's definitely worth considering if we can improve that but certainly not a given either. Perhaps ECMA-262 should just directly handle the resolution caching itself then? That might also help as a way to build the invariant more easily for custom loaders? It's worth noting that import maps in browsers also seem to be heading in this direction (guybedford/import-maps-extensions#17 (comment)). Not sure just asking the questions.

Copy link
Member Author

@nicolo-ribaudo nicolo-ribaudo Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh well, the current spec text of this proposal does something like that! It was needed to specify how caching works with module blocks. There is a new realm.[[ModuleMap]] list which is basically a (importer, specifier)->resolvedModule map: https://tc39.es/proposal-js-module-blocks/#sec-code-realms.

I proposed adding it directly to ECMA-262, but the editors discussed about it in their weekly call and asked to bring it in with a proposal that needs it (tc39/ecma262#2806 (comment)).

<h1>
<ins>
GetTopLevelScriptOrModule (
GetHostDefinedResolutionBase (

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting approach, could we make this entirely align with the [[HostDefined]] field already existing on module records?

Copy link
Member Author

@nicolo-ribaudo nicolo-ribaudo Jul 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTML's [[HostDefined]] keeps some data (for example, a list of unhandled rejected promises) that is not serializable, so copying it from the outer Module Record to ModuleBlock (so that then it can be copied to the ModuleBlock's Module Record) would make them non-serializable.

The idea here is to split the [[HostDefined]] slot in two parts, so that we only carry over the (serializable) part needed to resolve modules but not the "runtime" data related to a specific Module Record.

Comment on lines +479 to +480
<del>_referencingScriptOrModule_: a Script Record, a Module Record, or *null*,</del>
<ins>_resolutionBase_: unknown,</ins>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried looking at the HTML spec to integrate this change, and we still need to pass the whole referencingScriptOrModule Record. This is because HTML stores a reference to the module map (their (url, moduleType) -> Module Record cache) in every Module Record's [[HostDefined]] slot, and it needs to access it in HostResolveImportedModule.

I think that it could be stored in the Realm's [[HostDefined]] slot instead, rather than being explicitly passed around between modules, but it looks like a big refactor for a very small gain. @guybedford / @littledan maybe you know why it's stored in every module and not at the Realm level?

@nicolo-ribaudo
Copy link
Member Author

I'm closing this because I want to pursue a different approach. Since this PR is tightly coupled to the HTML integration, I want to first define exactly how things should behave and then worry about which parts are in 262 and which are in HTML.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants