Conversation

@zolkis (Collaborator) commented Oct 8, 2025

Fixes #815

As explained in #884, add context options and attributes/internal slots that can be used to convey application hints with respect to the preferred acceleration type (CPU, or massively parallel processing, i.e. NPU or GPU).

This is a minimal change, and we might want to further refine the algorithms with respect to context power preferences and acceleration options (currently not addressed). That could be done either in this PR or in a separate follow-up PR.


Preview | Diff

…d the poll CPU fallback status steps. Invoke it from graph.dispatch().

Signed-off-by: Zoltan Kis <[email protected]>
@anssiko (Member) commented Oct 21, 2025

@zolkis thank you for formalizing the group’s current thinking into this PR!

@huningxin @RafaelCintron, this spec PR is on the WebML WG Teleconference – 23 October 2025 agenda. Reviews, comments, and questions in this PR prior to the meeting are appreciated.

@handellm to check we remain aligned with Google Meet requirements.

FYI @mtavenrath who expressed interest in this space.

@handellm
Seems good!

<dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
</dl>

The <dfn dfn-for=MLContextOptions dfn-type=dict-member>accelerated</dfn> option indicates the application's preference as related to massively parallel acceleration. When set to `true` (by default), the underlying platform will attempt to use the available massively parallel accelerators, such as GPU or NPU, also depending on the {{MLContextOptions/powerPreference}}. When set to `false`, the application hints to prefer CPU inference.
Contributor

Suggested change
The <dfn dfn-for=MLContextOptions dfn-type=dict-member>accelerated</dfn> option indicates the application's preference as related to massively parallel acceleration. When set to `true` (by default), the underlying platform will attempt to use the available massively parallel accelerators, such as GPU or NPU, also depending on the {{MLContextOptions/powerPreference}}. When set to `false`, the application hints to prefer CPU inference.
The <dfn dfn-for=MLContextOptions dfn-type=dict-member>accelerated</dfn> option indicates the application's preference as related to massively parallel acceleration. When set to `true` (by default), the underlying platform will attempt to use the available massively parallel accelerators, such as a GPU or NPU, also depending on the {{MLContextOptions/powerPreference}}. When set to `false`, the application indicates it prefers unaccelerated CPU inference.
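The option described above could be exercised roughly as follows. This is a minimal sketch, not normative spec text: it assumes a WebNN-capable browser exposing `navigator.ml.createContext()`, and the helper name `contextOptionsForCpuInference` is invented for illustration.

```javascript
// Build MLContextOptions that hint a preference for unaccelerated CPU
// inference, per the `accelerated` dictionary member discussed in this PR.
function contextOptionsForCpuInference() {
  return {
    // Prioritize power consumption over execution speed.
    powerPreference: "low-power",
    // false hints the platform should prefer CPU inference;
    // true (the default) lets it use massively parallel accelerators
    // such as a GPU or NPU.
    accelerated: false,
  };
}

// Hypothetical usage: feature-detect WebNN, then create a context.
async function createPreferredContext() {
  if (typeof navigator === "undefined" || !("ml" in navigator)) {
    throw new Error("WebNN is not available in this environment");
  }
  return navigator.ml.createContext(contextOptionsForCpuInference());
}
```

Since `accelerated` is a hint rather than a guarantee, an application that must know what actually happened would still need a mechanism such as the CPU-fallback status polling referenced in the commit message.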



Development

Successfully merging this pull request may close these issues.

Query supported devices before graph compilation

6 participants