- 
                Notifications
    You must be signed in to change notification settings 
- Fork 13.9k
Cleanup markdown span handling, take 2 #80550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: Joshua Nelson <[email protected]>
Co-authored-by: Joshua Nelson <[email protected]>
Co-authored-by: Joshua Nelson <[email protected]>
      
        
              This comment has been minimized.
        
        
      
    
  This comment has been minimized.
| r? @jyn514 (rust-highfive has picked a reviewer for you, use r? to override) | 
| 
 I think we should consider asking pulldown if they're willing to do this; I think it would be useful and in scope. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decide if locate() can be simplified by assuming s is always in md
I would expect this to cause unsoundness if wrong, so I'd prefer not to do that. I don't think current pulldown API guarentees it will be borrowed from the original string, it could be borrowed from Box::leak or something.
Overall this is great though ❤️
        
          
                src/librustdoc/html/markdown.rs
              
                Outdated
          
        
      There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why only Borrowed variants have the wrong span? I would expect instead for it to be wrong for all spans, but we'd only be able to use locate on Borrowed variants because otherwise it points to a different allocation.
That brings up a couple other questions I have:
- Why does locateonly work for things within the same allocation? Can we improve it to work regardless?
- When would pulldown give us a BoxedorInlinedspan?
I think this PR would be good to land regardless, but the comment seems off, we should at least fix that before landing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get the easy ones out of the way first:
- 
The comment tries to hint at the current problem: we get a Borrowedstring that contains the destination of a reference-style link, and a range that points at the link usage itself.
- 
I'm not sure about Inlined, but it seems like it's a small string optimization for theBoxedvariant. It exists to avoid allocating oncloneif the boxed string is short enough.
As far as I can see, Pulldown provides a Borrowed if the string directly comes from the source. If any transformation is necessary (e.g. unescaping used when scanning ref defs or such), Pulldown builds and hands out a Boxed string.
The issue is not that Borrowed strings can have an incorrect span, it's that for those, we can actually restore the span we need. For any other variant, it's reasonable (I think) to assume that the string contained in the variant is some sort of derivative of the original source, and can not be located by either pointer arithmetic or string search. (Just as I write this, if I'm correct this makes rfind-ing on the Boxed variant kinda pointless, because the string will not be an exact match anyway. Otherwise, Pulldown would have handed out a Borrowed).
Hope this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is not that Borrowed strings can have an incorrect span, it's that for those, we can actually restore the span we need.
Right, that's what I expected. Can you say that in the comment? Right now it makes it sound like the range will only be different for Borrowed, which isn't true.
Can you take a look at 1 to see if it's possible to get rid of the unsafe? (and hopefully make this work for Boxed variants too, but after your later comments that might not be possible.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not completely certain what you're referring to as "1" (ah I realized you meant your own numbered point). Getting rid of the unsafe is as easy as casting the pointer to usize but that's just masking the problem. Otherwise, I would not change locate if there's a chance for a pulldown update that allows us to remove it entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, r=me with the comment fixed then, and hopefully we can get rid of locate in a pulldown update soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found where my expectation break down: LinkReplacer can give us Borrow variants that borrow from something other than md. I made a note of this.
| @bors delegate=bugadani | 
| ✌️ @bugadani can now approve this pull request | 
9cbac74    to
    b67e8c1      
    Compare
  
    b67e8c1    to
    f073543      
    Compare
  
    | @bors r=jyn514 | 
| 📌 Commit f073543 has been approved by  | 
| ☀️ Test successful - checks-actions | 
This PR includes the cleanups made in #80244 except for the removal of
locate().While the biggest conceptual part in #80244 was the removal of
locate(), it introduced a diagnostic regression.Additional cleanup:
RefCellto avoid building two separate vectors for the linksWork to do:
locate()can be simplified by assumingsis always inmdcc @jyn514 This is the best I can do without patching Pulldown to provide multiple ranges for reference-style links. Also, since
locateis probably more efficient thanrfind(at least it's constant time), I decided to not check the link type and just cover every &str as it was before.