Improve debug symbol names to avoid ambiguity and work better with MSVC's debugger #85269
| Some changes occurred to the CTFE / Miri engine cc @rust-lang/miri | 
| Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @matthewjasper (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. | 
        
          
src/test/debuginfo/function-names.rs (Outdated)
          
        
@wesleywiser generator functions are still not appearing, even with #84822
      
        
| Many of these bits sound very much like the v0 mangling (currently not on by default). I'm wondering if we should rather make sure that works well for this use case and invest in it, rather than creating yet another variant... | 
| 
 v0 mangling may help on Linux, but on Windows the debugger uses the names stored in the debug info rather than demangling the names, so I'd still like to make the changes to improve that experience. | 
| That makes sense - definitely we should improve the experience - but I'm wondering if it also makes sense to try and arrive at a consistent format, for example making sure the v0 format (perhaps when targeting msvc, ideally always) uses the replacements and suggestions you've laid out here, and then generate the debug names using it on windows. Cc @nagisa @eddyb (not sure who else might be involved in our debuginfo / symbol names) | 
| cc me | 
| @Mark-Simulacrum I think the v0 mangling scheme is only marginally related to debuginfo. DWARF (and also CodeView, I think) store a plain string version of type and function names. Can you explain in more detail what you think about the connection of v0 and debuginfo? | 
It's great to have so many new CDB tests, thanks @dpaoliello!
I think it would be good if all "synthetic" names (like array, tuple, etc) would use a consistent naming scheme. Right now some of them have two leading underscores (e.g. __impl) and some don't; and some of them contain the special character $ and some don't. Always having a special character that is not allowed in a Rust identifier might be a good way to avoid name clashes with regular types.
      
        
| 
 I'm happy to adjust the names if we can come up with a standard. MSVC's debugger permits identifiers to be  @michaelwoerister Thoughts? | 
| 
 This looks fine. Another option is to put them in a common "namespace":  | 
| 
 I'd prefer the simple names over namespaces to keep the symbol names short | 
| It looks to me like we have two options: 
 To me it looks like there are already many constructs (like  The main downside I see is that NATVIS does not seem to provide a way of "demangling" these type names in order to display something nicer and closer to Rust syntax. But given that the alternative would only be better in a few cases, I don't think that is an actual problem. The increased uniformity of such an encoding might even make it easier to "manually" decode these types, even though they look quite foreign. I put together a table comparing the various encodings that should help us make further decisions (let me know if I got something wrong): 
 I'm curious to hear what you all think about this  | 
| 
 I wasn't aware we had the ability to store plain string versions -- if so, then that concern is moot for sure. In regards to the latest question/comment, I also tend to favor the proposal to go with non-Rust-like syntax if we can't match it closely anyway. That also gives us the ability to just have a table in the documentation (e.g., the rustc book) which tells people what things mean, and there's less worry about having "close but not really" meanings. This is basically what you already said :) | 
| I also think it makes more sense to go with the second option and encode type names in a C++ friendly way. | 
| 
 Looks good to me. For functions, MSVC produces the format  So for your example we'd have:  We can possibly drop the calling convention for native Rust functions, and for non-Windows platforms (e.g., Clang doesn't include the calling convention - https://godbolt.org/z/1rGerjf1a). | 
| 
 Function types are an interesting case. How well does natvis matching work with types like  We could also try to encode them like the other types, as  | 
| 
 Using the same format as C++ should mean that NatVis, and other native debugging tools, will understand that it is a function pointer (and be able to extract the return type and params). The important thing is to make sure that the function signature is unique, not that it necessarily encodes all the information about the function: since a function can't be overloaded by safe vs unsafe, or by the calling convention (even in C/C++), that information doesn't need to be in the debug symbol. Personally, I'd rather stick to the  | 
| 
 Yes, I agree, calling convention and unsafety aren't strictly necessary here. 
 Do those APIs go through the stringified type name rather than CodeView records? 
 I'm not opposed to going that route, especially if it has tangible advantages for tooling. However, if it does not make a difference for tooling (and we leave off calling convention and unsafety) then I think something like  | 
| 
 Unfortunately, using the  | 
a2dc033 to 410e9dc
  
    | The test failed because the debuginfo type name for  @bors r=michaelwoerister | 
| 📌 Commit c1601dc has been approved by  | 
| ☀️ Test successful - checks-actions | 
| Thanks for your patience and all the hard work you put into this, @dpaoliello! | 
| This change led to moderate performance regressions in many debug build benchmarks which is unsurprising. As part of the performance triage process, I'm marking this as a performance regression. Given the existence of #86431, I will also mark this as having a justified performance regression as hopefully that issue will resolve the performance regressions introduced here. @rustbot label +perf-regression +perf-regression-triaged | 
…-type-names-fix, r=oli-obk,wesleywiser Handle non-integer const generic parameters in debuginfo type names. This PR fixes an ICE introduced by rust-lang#85269 which started emitting const generic arguments for debuginfo names but did not cover the case where such an argument could not be evaluated to a flat string of bits. The fix implemented in this PR is very basic: If `try_eval_bits()` fails for the constant in question, we fall back to generating a stable hash of the constant and emit that instead. This way we get a (virtually) unique name and side step the problem of generating a string representation of a potentially complex value. The downside is that the generated name will be rather opaque. E.g. the regression test adds a function `const_generic_fn_non_int<()>` which is then rendered as `const_generic_fn_non_int<{CONST#fe3cfa0214ac55c7}>`. I think it's an open question how to deal with this more gracefully. I'd be interested in ideas on how to do this better. r? `@wesleywiser` cc `@dpaoliello` (do you see any problems with this approach?) cc `@Mark-Simulacrum` & `@nagisa` (who I've seen comment on debuginfo issues recently -- anyone else?) Fixes rust-lang#86893
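The fallback described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the rustc implementation: `ConstArg`, `try_bits` and `const_arg_name` are hypothetical stand-ins, and `DefaultHasher` from the standard library is swapped in for rustc's stable hashing, so the hash values will not match the ones rustc emits and are not guaranteed to stay the same across Rust releases.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a const generic argument.
#[derive(Hash)]
enum ConstArg {
    /// A value that can be flattened to a string of bits (e.g. an integer).
    Bits(u128),
    /// Any other value, represented here only by an opaque description.
    Opaque(String),
}

impl ConstArg {
    /// Plays the role that `try_eval_bits()` has in the commit message.
    fn try_bits(&self) -> Option<u128> {
        match self {
            ConstArg::Bits(bits) => Some(*bits),
            ConstArg::Opaque(_) => None,
        }
    }
}

/// Renders the argument for a debuginfo name: the plain value if it can be
/// evaluated to bits, otherwise a `{CONST#<hash>}` placeholder.
fn const_arg_name(arg: &ConstArg) -> String {
    match arg.try_bits() {
        Some(bits) => bits.to_string(),
        None => {
            // Assumption: any deterministic hash is good enough for a sketch.
            let mut hasher = DefaultHasher::new();
            arg.hash(&mut hasher);
            format!("{{CONST#{:016x}}}", hasher.finish())
        }
    }
}
```

With this sketch, `const_arg_name(&ConstArg::Bits(3))` yields `3`, while an `Opaque` argument yields a name of the same shape as the `{CONST#fe3cfa0214ac55c7}` example above, with a different hash.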
Some checks are temporarily disabled for MSVC LLDB. Pretty-printers for pointer types of string slices have not worked since Rust 1.55 because of the changes in debug info generation introduced in rust-lang/rust#85269. Since 1.55, rustc generates `ptr_const$<...>` and `ptr_mut$<...>` type names instead of `const str *` and `mut str *` when targeting MSVC. So the pretty-printer should be updated and the corresponding `lldbg-check`s should be added.
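For tooling that has to cope with the new names, recognizing them is mostly a matter of string matching. A minimal sketch, assuming only the two prefixes quoted above (`PtrKind` and `parse_msvc_ptr` are hypothetical names, not part of any pretty-printer):

```rust
/// The two pointer kinds named in the commit message above.
#[derive(Debug, PartialEq)]
enum PtrKind {
    Const,
    Mut,
}

/// Splits `ptr_const$<T>` or `ptr_mut$<T>` into the pointer kind and the
/// pointee type name `T`. Returns `None` for any other type name.
fn parse_msvc_ptr(name: &str) -> Option<(PtrKind, &str)> {
    let (kind, rest) = if let Some(rest) = name.strip_prefix("ptr_const$<") {
        (PtrKind::Const, rest)
    } else if let Some(rest) = name.strip_prefix("ptr_mut$<") {
        (PtrKind::Mut, rest)
    } else {
        return None;
    };
    // Drop the closing `>`; a space may precede it when the pointee is
    // itself generic, so trim that as well.
    let pointee = rest.strip_suffix('>')?;
    Some((kind, pointee.trim_end()))
}
```

Here `parse_msvc_ptr("ptr_const$<str>")` gives `Some((PtrKind::Const, "str"))`, and the old-style `const str *` gives `None`.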
There are several cases where names of types and functions in the debug info are either ambiguous, or not helpful, such as including ambiguous placeholders (e.g., `{{impl}}`, `{{closure}}` or `dyn _'`) or dropping qualifications (e.g., for dynamic types).

Instead, each debug symbol name should be unique and useful:
- … `DefPathDataName` (closures and generators), and unify their formatting when used as a path-qualifier vs item being qualified.
- … `qualified` argument when emitting ref and pointer types.

Additionally, when targeting MSVC, its debugger treats many command arguments as C++ expressions, even when the argument is defined to be a symbol name. As such, names in the debug info need to be more C++-like to be parsed correctly:
- … (`#`, `[`, `"`, `+`).
- … `<` or `{` as this is treated as an operator.
- `>>` is always treated as a right-shift, even when parsing generic arguments (so add a space to avoid this).
- … `array$<type, size>` type.
- … `$` in all synthetic types as this is a legal character for C++, but not Rust (thus we avoid collisions with user types).
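Two of the MSVC rules above (never emit `>>`, and name arrays as a synthetic `array$<...>` type) are easy to illustrate. This is only a sketch: the helper names are made up for this example, and the exact separator between generic arguments is an assumption.

```rust
/// Closes a generic argument list. If the name already ends in `>`, a space
/// is added first so the debugger never sees `>>` (which it would parse as
/// a right-shift).
fn push_close_angle_bracket(out: &mut String) {
    if out.ends_with('>') {
        out.push(' ');
    }
    out.push('>');
}

/// Builds a generic name such as `Outer<Inner<u8> >`.
fn generic_name(base: &str, args: &[&str]) -> String {
    let mut out = String::from(base);
    out.push('<');
    // Assumption: arguments are separated by a bare comma.
    out.push_str(&args.join(","));
    push_close_angle_bracket(&mut out);
    out
}

/// Builds the synthetic array name. The `$` cannot appear in a Rust
/// identifier, so the name cannot collide with a user-defined type.
fn array_name(elem: &str, len: usize) -> String {
    let len = len.to_string();
    generic_name("array$", &[elem, len.as_str()])
}
```

For example, `generic_name("Outer", &["Inner<u8>"])` produces `Outer<Inner<u8> >`, and `array_name("u8", 3)` produces `array$<u8,3>`.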