@ehuss ehuss commented Oct 12, 2025

This changes underscore from being a punctuation character to a keyword. This is intended to help better align with proc-macros, which treat it as an [`Ident`](https://doc.rust-lang.org/proc_macro/struct.Ident.html).

Note one unusual rule is inline assembly `ParamName` which is `IDENTIFIER_OR_KEYWORD`. From what I can tell, it does accept `_`, but the fmt template does not. Templates are not specified in great detail in the std docs, and don't touch on this fact.

Closes #1236
Closes #2020

cc @petrochenkov @mattheww

@rustbot rustbot added the S-waiting-on-review Status: The marked PR is awaiting review from a maintainer label Oct 12, 2025
RAW_IDENTIFIER -> `r#` IDENTIFIER_OR_KEYWORD _except `crate`, `self`, `super`, `Self`, `_`_
NON_KEYWORD_IDENTIFIER -> IDENTIFIER_OR_KEYWORD _except a [strict][lex.keywords.strict] or [reserved][lex.keywords.reserved] keyword_

Contributor

`RESERVED_RAW_IDENTIFIER` a few lines below may be redundant now.

Contributor Author

Hm, I spent some time thinking about this and how to correctly express that these keywords are rejected. The current except clause didn't express that in the way I intended. I pushed up a commit that, instead of removing the reserved rule, moves the except part into the reserved rule.

Or, to put it another way, `r#crate` is a token; it's just rejected as an error. The previous lexical grammar wasn't really conveying that.

* `expr`: an [Expression]
* `expr_2021`: an [Expression] except [UnderscoreExpression] and [ConstBlockExpression] (see [macro.decl.meta.edition2024])
- * `ident`: an [IDENTIFIER_OR_KEYWORD], [RAW_IDENTIFIER], or [`$crate`]
+ * `ident`: an [IDENTIFIER_OR_KEYWORD] except `_`, [RAW_IDENTIFIER], or [`$crate`]

Contributor

Pre-existing, but this doesn't seem correct: `ident` accepts raw identifiers and expanded `$crate` (and unexpanded `$crate` is two tokens and not `IDENTIFIER_OR_KEYWORD`).

Contributor Author

Yea, we have a few issues related to this (like #588 and #587). I was also uneasy adding this in the first place.

I don't know the right way to document that the expanded `$crate` can be accepted by `ident`. Perhaps this just needs to be more explicit about what it means?
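
For readers following this thread, the `ident` behavior under discussion can be observed with a small `macro_rules!` test. This is only an illustrative sketch: the `classify` macro and its arm labels are hypothetical names, not part of the PR. It shows that the `ident` fragment specifier accepts ordinary identifiers, keywords, and raw identifiers, but not a lone `_`, which falls through to the next arm.

```rust
// Hypothetical helper macro, used only to observe which fragment matches.
macro_rules! classify {
    // Tried first: plain identifiers, keywords, and raw identifiers.
    ($i:ident) => { "ident" };
    // A lone underscore is not accepted by `ident`, so it lands here.
    (_) => { "underscore" };
    // Anything else (for example punctuation).
    ($($t:tt)*) => { "other" };
}

fn main() {
    println!("foo      -> {}", classify!(foo));
    println!("fn       -> {}", classify!(fn));
    println!("r#match  -> {}", classify!(r#match));
    println!("_        -> {}", classify!(_));
    println!("+        -> {}", classify!(+));
}
```

The expanded `$crate` case raised above cannot be written directly at a call site like this, so it is not covered by the sketch.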

This reworks the reserved raw identifier and lifetimes to hopefully more
clearly express what they mean.

The "except" clauses in the raw identifier were intended to mean a set
subtraction, not an explicit "and it is an error if it is specified".
Using set subtraction isn't correct because that would mean `r#crate`
would be interpreted as 3 tokens (since RAW_IDENTIFIER did not match it,
but IDENTIFIER_OR_KEYWORD PUNCTUATION IDENTIFIER_OR_KEYWORD would).

I also reordered Token, since the intent is that the first production
in an alternation that matches wins. The idea here is to make the
reserved tokens high priority, so that they clearly match first and
cause an error. (I did not exhaustively analyze the rest of the rules
to see if they follow that behavior, that is for another day.)

It could be said it would be nice to document the rationale for the
restrictions (rust-lang#2042), but
that is a bigger ask.
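
The difference between set subtraction and a high-priority reserved rule, as described in the commit message above, can be sketched with a toy lexer. This is not rustc's lexer or the Reference's grammar: the `Tok` enum, the `lex` function, and the `reserved_rule` flag are all hypothetical, and identifiers are restricted to ASCII for brevity. With the reserved rule disabled, `r#crate` is not matched by the raw identifier rule and falls apart into three tokens; with it enabled, the reserved rule matches first and yields a single token that a later stage can reject.

```rust
#[derive(Debug, PartialEq)]
enum Tok {
    Ident(String),            // stands in for IDENTIFIER_OR_KEYWORD
    RawIdent(String),         // stands in for RAW_IDENTIFIER
    ReservedRawIdent(String), // stands in for RESERVED_RAW_IDENTIFIER (an error token)
    Punct(char),
}

// Names that may not follow `r#`, taken from the except list in the diff.
const RESERVED: [&str; 5] = ["crate", "self", "super", "Self", "_"];

// Length of an ASCII identifier-or-keyword at the start of `s`, or 0.
fn ident_len(s: &str) -> usize {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return 0,
    }
    1 + chars
        .take_while(|&c| c.is_ascii_alphanumeric() || c == '_')
        .count()
}

// `reserved_rule == false` models plain set subtraction;
// `reserved_rule == true` models a reserved rule that is tried first.
fn lex(src: &str, reserved_rule: bool) -> Vec<Tok> {
    let mut out = Vec::new();
    let mut rest = src;
    while let Some(c) = rest.chars().next() {
        if c.is_whitespace() {
            rest = &rest[c.len_utf8()..];
            continue;
        }
        if let Some(tail) = rest.strip_prefix("r#") {
            let n = ident_len(tail);
            if n > 0 {
                let name = &tail[..n];
                let reserved = RESERVED.contains(&name);
                if reserved && reserved_rule {
                    // Highest priority: one token, to be reported as an error.
                    out.push(Tok::ReservedRawIdent(name.to_string()));
                    rest = &tail[n..];
                    continue;
                }
                if !reserved {
                    out.push(Tok::RawIdent(name.to_string()));
                    rest = &tail[n..];
                    continue;
                }
                // Reserved name without a reserved rule: no rule consumes
                // `r#name` as a unit, so fall through to the rules below.
            }
        }
        let n = ident_len(rest);
        if n > 0 {
            out.push(Tok::Ident(rest[..n].to_string()));
            rest = &rest[n..];
        } else {
            out.push(Tok::Punct(c));
            rest = &rest[c.len_utf8()..];
        }
    }
    out
}

fn main() {
    println!("{:?}", lex("r#crate", false)); // three tokens
    println!("{:?}", lex("r#crate", true)); // one reserved token
    println!("{:?}", lex("r#match", true)); // ordinary raw identifier
}
```

In this sketch the ordering inside `lex` plays the role of the reordered `Token` alternation: the reserved check runs before the ordinary raw identifier and identifier rules.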

Development

Successfully merging this pull request may close these issues.

Further cases where lone underscore isn't covered
Grammar for "identifier or keyword" seems wrong

3 participants