Skip to content

Conversation

@stephentoub
Copy link
Member

Fixes #63502
Fixes #63512

For every regex, we currently emit a private type next to the partial member we're implementing, and that private type contains all the code for the regex, including all the helpers it needs. There are multiple issues with this:

  • We're emitting a lot of duplicated code into an assembly that has multiple/many uses of [RegexGenerator(...)]. System.Private.Xml has ~10 uses of it now and it has ~1000 lines of unnecessarily duplicative code being generated.
  • We have to global::-qualify all uses of types we rely on in the implementation, because C# binding rules would otherwise bind to things declared inside the user's partial type we're generating into. This is particularly bad for extension methods, which we then can't use as extensions, and we make heavy use of MemoryExtensions throughout the generated code for all of our interactions with spans.
  • We could avoid the code duplication by putting all the helpers into a shared utilities type, but then that type would need to be internal and would effectively be shipping surface area we don't want to ship.
  • Code inside that user's type can see the private type we're emitting, and EditorBrowsable doesn't currently help (EditorBrowsable is useless for source generators to hide stuff roslyn#60080).

With this PR, we instead move all of the types into a separate namespace and parent class, which then enables a) having usings inside the namespace declaration so that we can avoid fully-qualifying everything and b) separating out all of the helper methods into a class that's then private to the parent wrapper class, shielding it from the rest of the assembly.

Hopefully C# 11 will see a solution to dotnet/csharplang#5529, at which point we can use that to hide all of the separated out generated code other than the partial method definitions (with the changes in this PR, doing so should involve minimal tweaks to the code generation). Until then, we still have a similar problem as we do today for the generated Regex-derived types being visible to other code in the assembly, but at least the shared helpers aren't.

cc: @joperezr, @chsienki

@ghost
Copy link

ghost commented Mar 10, 2022

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details

Fixes #63502
Fixes #63512

For every regex, we currently emit a private type next to the partial member we're implementing, and that private type contains all the code for the regex, including all the helpers it needs. There are multiple issues with this:

  • We're emitting a lot of duplicated code into an assembly that has multiple/many uses of [RegexGenerator(...)]. System.Private.Xml has ~10 uses of it now and it has ~1000 lines of unnecessarily duplicative code being generated.
  • We have to global::-qualify all uses of types we rely on in the implementation, because C# binding rules would otherwise bind to things declared inside the user's partial type we're generating into. This is particularly bad for extension methods, which we then can't use as extensions, and we make heavy use of MemoryExtensions throughout the generated code for all of our interactions with spans.
  • We could avoid the code duplication by putting all the helpers into a shared utilities type, but then that type would need to be internal and would effectively be shipping surface area we don't want to ship.
  • Code inside that user's type can see the private type we're emitting, and EditorBrowsable doesn't currently help (EditorBrowsable is useless for source generators to hide stuff roslyn#60080).

With this PR, we instead move all of the types into a separate namespace and parent class, which then enables a) having usings inside the namespace declaration so that we can avoid fully-qualifying everything and b) separating out all of the helper methods into a class that's then private to the parent wrapper class, shielding it from the rest of the assembly.

Hopefully C# 11 will see a solution to dotnet/csharplang#5529, at which point we can use that to hide all of the separated out generated code other than the partial method definitions (with the changes in this PR, doing so should involve minimal tweaks to the code generation). Until then, we still have a similar problem as we do today for the generated Regex-derived types being visible to other code in the assembly, but at least the shared helpers aren't.

cc: @joperezr, @chsienki

Author: stephentoub
Assignees: -
Labels:

area-System.Text.RegularExpressions

Milestone: 7.0.0

@ghost ghost assigned stephentoub Mar 10, 2022
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a new pragma warnings in the headers though? I believe it is common for people to add stylecop rules that require using statements to be outside of a namespace.

Copy link
Member Author

@stephentoub stephentoub Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such rules by default shouldn't apply to generated code. But even if they did, I don't want us to add pragmas for warnings that aren't in the .NET SDK; otherwise we open the doors to an unending laundry list of arbitrary 3rd-party analyzers we'd need to start considering.

Note there is a hidden diagnostic issued for this by the C# compiler (CS8019), but pragmas appear to be ineffective at disabling it. @sharwell may know why.

Copy link
Contributor

@sharwell sharwell Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CS8019

This is a hidden diagnostic used for fading out unnecessary usings in the IDE. It is intentionally reported even in generated files, and will not impact builds. IDE0005 is the complementary rule that is only reported in non-generated files, and might be raised to warning/error in builds.

Do we need a new pragma warnings in the headers though?

As long as you use a hint name ending in .g.cs or have // <auto-generated/> as the first line of the file, it will be treated as generated code and code style analyzers will not run in the file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

It is intentionally reported even in generated files

I was surprised that they were still issued even with:

#pragma warning disable CS8019

at the top of the file. Is that expected?

As long as you use a hint name ending in .g.cs or have // as the first line of the file

We do both.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Smal NIT here, should we take the namespace and type declaration out of this method? My question is basically that if you have one partial type which has two source generated methods, it would be nice if the generated code only has one namespace and type declaration which contains both of the generated methods that fetch the instance.

Copy link
Member Author

@stephentoub stephentoub Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds reasonable but will require some additional changes, and this was preexisting these changes. We can do it subsequently. Though it will add additional expense to figure out all the groupings... I'm not sure if it's really worth it to avoid the extra partial definitions.

Copy link
Member

@joperezr joperezr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments but this looks great, thanks @stephentoub

@stephentoub stephentoub merged commit 3fc949b into dotnet:main Mar 10, 2022
@stephentoub stephentoub deleted the sourcegenstrategy branch March 10, 2022 23:42
@ghost ghost locked as resolved and limited conversation to collaborators Apr 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Avoid global:: qualifying all names in the Regex source generator Move Regex source generator generated helper methods to a shared class

5 participants