fix(namer): escape, rather than strip, non-ASCII ident. characters #7995

ErichDonGubler · 2025-07-23T19:12:39Z

Escape non-ASCII identifier characters with `write!(…, "u{:04x}", …)`, surrounding with `_` as appropriate. This solves (1) a debugging issue where stripped characters would otherwise be invisible, and (2) failure to re-validate that stripped identifiers didn't start with an ASCII digit.

I've confirmed that this fixes [bug 1978197](https://bugzilla.mozilla.org/show_bug.cgi?id=1978197) on the Firefox side.
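
To make the escaping scheme described above concrete, here is a minimal standalone sketch. The function name `escape_non_ascii` is hypothetical, and this is not the code in `naga/src/proc/namer.rs`. It assumes that every character other than an ASCII letter, digit, or `_` is escaped, and that a leading ASCII digit is handled by prefixing `_`. The real namer may trim underscores or treat these cases differently.

```rust
use std::fmt::Write as _;

/// Hypothetical helper: escape every character that is not an ASCII
/// letter, digit, or `_` as `u{:04x}`, delimited by underscores.
fn escape_non_ascii(ident: &str) -> String {
    let mut s = String::with_capacity(ident.len());
    for c in ident.chars() {
        if c.is_ascii_alphanumeric() || c == '_' {
            s.push(c);
        } else {
            // Separate the escape from preceding text, unless a `_` is
            // already there or the output is still empty.
            if !s.is_empty() && !s.ends_with('_') {
                s.push('_');
            }
            // Writing to a `String` cannot fail, so `unwrap` is safe here.
            write!(s, "u{:04x}_", c as u32).unwrap();
        }
    }
    // Assumption: a result must not start with an ASCII digit, so prefix
    // an underscore when it would.
    if s.starts_with(|c: char| c.is_ascii_digit()) {
        s.insert(0, '_');
    }
    s
}
```

Unlike stripping, this keeps the code point visible in the generated name, so two source identifiers that differ only in non-ASCII characters stay distinguishable when debugging.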

Testing

Added a regression test.

Squash or Rebase?

squashplz

Checklist

If this contains user-facing changes, add a CHANGELOG.md entry.

naga/tests/out/spv/wgsl-7995-unicode-idents.spvasm

andyleiserson

+1 for not generating illegal identifiers 😄

andyleiserson · 2025-07-25T16:25:17Z

naga/src/proc/namer.rs

+                    if !s.is_empty() && !had_underscore_at_end {
+                        s.push('_');
+                    }
+                    write!(s, "u{:04x}_", c as u32).unwrap();


Not important, since you've covered it with snapshot tests, but I've noticed that naga/wgpu doesn't have many unit tests, and this behavior seems like a good candidate for unit testing. (As I said, it's not important: I don't think it's worth going back to change or add tests. This is more a reminder to keep unit tests in mind in general.)

Agreed on the scope; I also think it would be nice to use several snapshot-ish unit tests in follow-up work, just to make it easier to reason about some changes.

andyleiserson · 2025-07-25T16:38:31Z

naga/tests/out/glsl/wgsl-atomicCompareExchange.test_atomic_compare_exchange_i32.Compute.glsl

@@ -5,11 +5,11 @@ precision highp int;

 layout(local_size_x = 1, local_size_y = 1, local_size_z = 1) in;

-struct _atomic_compare_exchange_resultSint4_ {
+struct _atomic_compare_exchange_result_u003c_Sint_u002c_4_u003e {


It's slightly unfortunate to use so many characters for an internally-generated type name. Maybe there could be a rule like "any number of consecutive :<>, characters are mapped to a single underscore"?

Maybe something like 3a4c931?

Yes, that looks great to me.
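
The rule suggested in this thread, "any number of consecutive `:<>,` characters are mapped to a single underscore", could be sketched as follows. The name `collapse_separators` is hypothetical. This illustrates the rule as stated in the comment and is not necessarily what commit 3a4c931 does.

```rust
/// Hypothetical helper: map each run of `:`, `<`, `>`, or `,` to a single
/// underscore and copy every other character through unchanged.
fn collapse_separators(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut in_run = false;
    for c in name.chars() {
        if matches!(c, ':' | '<' | '>' | ',') {
            // Emit one underscore for the first character of a run only.
            if !in_run {
                out.push('_');
                in_run = true;
            }
        } else {
            out.push(c);
            in_run = false;
        }
    }
    out
}
```

In this sketch a trailing `>` still leaves a trailing underscore. Whether to trim it is a separate choice that the rule itself does not settle.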

ErichDonGubler · 2025-07-25T22:01:49Z

Just filed https://treeherder.mozilla.org/jobs?repo=try&landoCommitID=144433 to see if this breaks anything. AFAIK it's unusual for us to change stuff like this, though I'm not concerned about anything concrete.

jimblandy

I'm not really comfortable with the way this affects so many identifiers that don't contain non-ASCII characters. Naga does not promise to preserve identifier names at all; we could just name everything e1, e2. There's no benefit to us guaranteeing that the original identifier can be reconstructed from the identifiers we generate.

jimblandy · 2025-07-31T15:05:37Z

The only job of Namer::sanitize is to produce a valid identifier prefix quickly and simply. Preserving the original name is just for our convenience in debugging; there is no contract. We should not double the size of this code just to delete characters safely.

For example, a better fix might be to delete almost all of that function, and then say, if string works as-is, return it; otherwise, return "e". That's the kind of direction we want to be headed here, not growing our own identifier mangling syntax.
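
The alternative described above (return the string if it works as-is, otherwise return `"e"`) might look roughly like this. The name `sanitize_or_fallback` and the validity rule are assumptions for illustration: the sketch treats a name as valid when it is non-empty, starts with an ASCII letter or `_`, and otherwise contains only ASCII letters, digits, and `_`. Real backends may impose further restrictions, such as reserved words or prefixes, which are not modeled here.

```rust
/// Hypothetical sketch: keep `name` when it is already a plain ASCII
/// identifier, otherwise fall back to the fixed prefix `"e"`.
fn sanitize_or_fallback(name: &str) -> &str {
    let mut chars = name.chars();
    let valid = match chars.next() {
        // The first character must be an ASCII letter or underscore; the
        // remaining characters may also be ASCII digits.
        Some(first) => {
            (first.is_ascii_alphabetic() || first == '_')
                && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
        }
        // An empty string is not a usable identifier prefix.
        None => false,
    };
    if valid { name } else { "e" }
}
```

This version allocates nothing and defines no escaping syntax, at the cost of discarding the original name entirely whenever any character is unsuitable.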


ErichDonGubler added type: bug Something isn't working naga Shader Translator area: naga processing Passes over IR in the middle labels Jul 23, 2025

ErichDonGubler force-pushed the escape-utf-idents branch 2 times, most recently from e6b5270 to 0108909 Compare July 23, 2025 19:19

ErichDonGubler marked this pull request as ready for review July 23, 2025 19:19

ErichDonGubler commented Jul 24, 2025


ErichDonGubler force-pushed the escape-utf-idents branch from 0108909 to 09a15fd Compare July 25, 2025 03:32

andyleiserson approved these changes Jul 25, 2025


cwfitzgerald assigned andyleiserson Jul 30, 2025

jimblandy requested changes Jul 31, 2025


ErichDonGubler added 2 commits August 5, 2025 20:37

style(CHANGELOG): strip trailing whitespace

9d0ac89

ErichDonGubler force-pushed the escape-utf-idents branch from 09a15fd to 63dcc20 Compare August 6, 2025 00:48
