Work with MIR-libstd #171
| Interesting... How is travis still passing? | 
| I assume travis doesn't execute the xargo stuff, nor does it even hit the other code paths I changed? | 
        
          
README.md (Outdated)
| Notice that you will have to re-run the last step of the preparations above when | ||
| your toolchain changes (e.g., when you update the nightly). Also, xargo doesn't | ||
| currently work with nightlies newer than 2017-04-23. | 
How did you get this working then? miri won't compile on such an old nightly.
I used the old miri from back then.
I'm currently trying to fix the creation of the rust-src distribution component such that xargo works again.
<3 awesome
| 
 Oh right. You didn't actually remove the "dead" parts. That's cool that it'll work on both systems for now. @eddby I see no reason not to merge this, even if it's not very convenient to use right now. It'll continue working as intended, but make future PRs easier since there won't be any merge issues with this PR. Or do you see this differently? | 
| I assume you meant @eddyb ;) | 
| I submitted rust-lang/rust#42214 to rustc to fix using xargo against the latest nightlies. | 
        
          
src/terminator/mod.rs (Outdated)
      | // Intercept some methods (even if we can find MIR for them) | ||
| if let ty::InstanceDef::Item(def_id) = instance.def { | ||
| match &self.tcx.item_path_str(def_id)[..] { | ||
| "std::sys::imp::fast_thread_local::register_dtor" => { | 
Why do it through the path instead of intercepting the actual FFI?
There is no FFI here. The trouble is that std::sys::imp::fast_thread_local::register_dtor accesses a global static variable with "extern weak" linkage. This is passed to ConstantExtractor::global_item, where it fails to find a MIR for this global. I tried to make it so that miri pretends this symbol is always NULL, but could not figure out how to make that work. Isn't global_item for looking up global variables? Why does that end up pushing a stack frame?
Because it evaluates the static the first time it is accessed. There is no miri support for weak linkage, in fact I'd be surprised if many people knew it turns the static into a pointer behind the scenes.
> Because it evaluates the static the first time it is accessed.
Ah, I see, so self.ecx.globals stores the already initialized globals. global_item just has the side effect of making sure the global is in that table?
> There is no miri support for weak linkage
There's not much miri can do, is there? Just like with C ABI calls, I guess.
> it turns the static into a pointer behind the scenes.
"it" = rustc or miri?
Oh I was misremembering, this is easier than I thought (it also proves that the semantics are tricky)! Turns out the static has to contain a pointer which can be observed as null (unless overridden). So you were on the right track, just need to bypass static initializer evaluation.
| All right, I added thread-local storage to the memory subsystem and hooked the pthread calls into it. This is enough to make  Just like in my original PR, when the program is done, miri complains about memory leaks.
    let dtor_ty = self.operand_ty(&arg_operands[1]);
    let dtor = self.value_to_primval(args[1], dtor_ty)?;
    trace!("TLS dtor: {:?}", dtor);
which fails with "primitive read failed for type: std::option::Option<unsafe extern "C" fn(*mut libc::c_void)>". |
| I implemented support for destructors. Now things fail with Also, there are some FIXME in the latest commit that should be fixed before this gets merged. | 
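The thread-local storage bookkeeping discussed in the comments above can be pictured as a small key table. This is only a sketch under stated assumptions, not miri's actual code: `TlsStore`, `TlsEntry`, `create_key`, `store`, `load`, and `take_dtor_work` are hypothetical names, plain `u64` addresses stand in for the interpreter's pointer type, and the behaviour modelled is the POSIX rule that a key starts out as NULL and that a destructor runs only for a non-NULL value, which is reset to NULL first.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an interpreter pointer; 0 plays the role of NULL.
type Addr = u64;
/// Hypothetical identifier of a destructor function.
type DtorId = u32;

struct TlsEntry {
    data: Addr,
    dtor: Option<DtorId>,
}

#[derive(Default)]
struct TlsStore {
    next_key: u64,
    entries: BTreeMap<u64, TlsEntry>,
}

impl TlsStore {
    /// Models pthread_key_create: a fresh key starts out holding NULL.
    fn create_key(&mut self, dtor: Option<DtorId>) -> u64 {
        let key = self.next_key;
        self.next_key += 1;
        self.entries.insert(key, TlsEntry { data: 0, dtor });
        key
    }

    /// Models pthread_setspecific; returns false for an unknown key.
    fn store(&mut self, key: u64, data: Addr) -> bool {
        match self.entries.get_mut(&key) {
            Some(entry) => {
                entry.data = data;
                true
            }
            None => false,
        }
    }

    /// Models pthread_getspecific; None means the key does not exist.
    fn load(&self, key: u64) -> Option<Addr> {
        self.entries.get(&key).map(|entry| entry.data)
    }

    /// Finds one key with a destructor and a non-NULL value, resets the
    /// value to NULL, and returns the destructor with the old value.
    fn take_dtor_work(&mut self) -> Option<(DtorId, Addr)> {
        for entry in self.entries.values_mut() {
            if let Some(dtor) = entry.dtor {
                if entry.data != 0 {
                    let data = std::mem::replace(&mut entry.data, 0);
                    return Some((dtor, data));
                }
            }
        }
        None
    }
}

fn main() {
    let mut tls = TlsStore::default();
    let key = tls.create_key(Some(7));
    tls.store(key, 0x1000);
    // At "thread exit", keep running destructors until none is pending.
    while let Some((dtor, data)) = tls.take_dtor_work() {
        println!("would run dtor {} on {:#x}", dtor, data);
    }
    println!("value after dtors: {:?}", tls.load(key));
}
```

Handing out one pending destructor at a time lets the caller push a stack frame for it and come back later, which also covers destructors that store a new value into the same key.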
| I'm pretty sure miri gives up on calling C ABI functions defined in Rust right now - it should be easy to enable as long as the call ABI and the definition ABI are the same (compatibility across ABIs would require an interpretation of those ABIs). | 
|  | ||
| pub fn is_null_ptr(&self) -> bool { | ||
| // FIXME: Is this the right way? | ||
| return *self == Pointer::from_int(0) | 
Yes, that's correct. It's basically what happens when miri interprets the Mir of is_null
I'd do a match so that it works with Bytes(0) as well.
This is a Pointer, not a Primval ;)
So, no match then? All right, I'll remove this FIXME.
        
          
src/step.rs (Outdated)
      | trace!("Running TLS dtor {:?} on {:?}", instance, ptr); | ||
| // TODO: Potientiually, this has to support all the other possible instances? See eval_fn_call in terminator/mod.rs | ||
| let mir = self.load_mir(instance.def)?; | ||
| // FIXME: Are these the right dummy values? | 
We could let the span be the span of the global, other than that it's fine
Yeah there is no need for any values to be dummy.
Well, I kind of considered the return pointer a dummy value too (as the function should not return anything), and I have no idea what the StackPopCleanup thing does, so I referred to them as dummies as well.
Are you saying that I should get the span from the instance, and then nothing is actually a dummy value? Not sure how I would get that span though.
mir.span should suffice for now
Ah, thanks :) I've been digging in rustc to find a way to go from a DefId to a Span, and found nothing. This works :)
def_id -> span is tcx.def_span(def_id)
Ah, good to know, thanks :)
        
          
src/terminator/mod.rs (Outdated)
| // Extract the function type out of the signature (that seems easier than constructing it ourselves...) | ||
| // FIXME: Or should we instead construct the type we expect it to have? | ||
| let dtor_fn_ty = match self.operand_ty(&arg_operands[1]) { | 
nit: match on .sty to reduce boilerplate
Fixed.
What about the FIXME? Right now I am extracting the type from the fn that was called. That's slightly odd though: The type is Option<fn ...>, but if I try to load from memory at that type, it fails. I have to load at the fn type. I suppose this is related to the representation optimization on Option<Pointer> -- I thought that would be done "below" MIR level, but it seems like that's not the case and MIR can already observe this optimization?
Anyway, so right now I am extracting the fn out of the substitution attached to the Option. That doesn't seem very nice, but mk_fn_ptr looks rather ugly to call (I'd have to somehow create a PolyFnSig).
I think this is fine, we do similar hackery in other locations. This is a c-function, so we don't have any "compile-time" type info available. You can read an Option<Pointer>, but it would require quite some boilerplate.
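The representation optimization mentioned in this thread can be observed from ordinary Rust. The sketch below is an illustration and not code from the PR: `Dtor`, `noop_dtor`, and `raw_bits` are names made up here, and the signature is copied from the error message quoted earlier. It shows that `Option` around a function pointer has the same size as the function pointer itself, with `None` stored as the null value, which is why a load at the fn type can succeed where a load at the `Option` type needs extra handling.

```rust
use std::mem::size_of;
use std::os::raw::c_void;

/// Destructor signature taken from the error message quoted in this thread.
type Dtor = unsafe extern "C" fn(*mut c_void);

/// Hypothetical destructor used only to have a non-null function pointer.
unsafe extern "C" fn noop_dtor(_ptr: *mut c_void) {}

/// Reinterprets the in-memory bits of an `Option<Dtor>` as an integer.
fn raw_bits(dtor: Option<Dtor>) -> usize {
    // Option<fn> has the same size as the fn pointer, and None is
    // represented as the null pointer, so this transmute is well-defined.
    unsafe { std::mem::transmute::<Option<Dtor>, usize>(dtor) }
}

fn main() {
    assert_eq!(size_of::<Option<Dtor>>(), size_of::<Dtor>());
    assert_eq!(raw_bits(None), 0);
    assert_ne!(raw_bits(Some(noop_dtor as Dtor)), 0);
    println!("Option<Dtor> occupies {} bytes", size_of::<Option<Dtor>>());
}
```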
        
          
src/terminator/mod.rs (Outdated)
| // Figure out how large a pthread TLS key actually is. This is libc::pthread_key_t. | ||
| let key_size = match self.operand_ty(&arg_operands[0]) { | ||
| &TyS { sty: TypeVariants::TyRawPtr(TypeAndMut { ty, .. }), .. } => { | 
same here
| As a matter of general process, do you want me to just keep adding new commits to this PR (like it is handled in the rustc repo), or should I squash commits like the one I am writing now (applying your feedback) into previous commits to have a cleaner history? | 
| 
 Yea, that's great. No need to squash | 
| All right, fixed the things that came up. Next I will look at what goes wrong with this  | 
| Pushed things to make the call do  | 
| I changed the initialization process so that if there is a "start" lang item with MIR, that one is executed instead of directly running main. As a consequence of this, the  There are some open questions: 
 I would never have gotten so far without the patient support of @eddyb who kept answering my questions about weird internals of miri and/or the compiler -- thanks a lot :) | 
| With my latest commits, the entire test suite now passes with MIR-libstd :-) (of course, it also still works with the default, non-MIR-libstd) I will now stop adding new things to this PR, and instead await your review and your feedback regarding my open questions. | 
Noticed some more spelling errors.
        
          
src/terminator/mod.rs (Outdated)
      | let f_instance = self.memory.get_fn(f.alloc_id)?; | ||
| self.write_primval(dest, PrimVal::Bytes(0), dest_ty)?; | ||
|  | ||
| // Now we make a functon call. TODO: Consider making this re-usable? EvalContext::step does sth. similar for the TLS dtors, | 
*function
        
          
src/terminator/mod.rs (Outdated)
      | self.write_primval(dest, PrimVal::Bytes(0), dest_ty)?; | ||
|  | ||
| // Now we make a functon call. TODO: Consider making this re-usable? EvalContext::step does sth. similar for the TLS dtors, | ||
| // and of coruse eval_main. | 
*course
        
          
src/bin/miri.rs (Outdated)
      | let entry_def_id = tcx.hir.local_def_id(entry_node_id); | ||
| miri::eval_main(tcx, entry_def_id, limits); | ||
| let start_wrapper = tcx.lang_items.start_fn() | ||
| .and_then(|start_fn| if tcx.is_mir_available(start_fn) { Some(start_fn) } else { None }); | 
This can be indented better, probably by putting the if on its own line.
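One way the suggested layout could look is sketched below. It is self-contained and therefore uses stand-ins: `DefId` and `Ctx` are hypothetical replacements for rustc's `DefId` and the parts of `TyCtxt` used in the diff, and only the shape of the `and_then` closure is the point.

```rust
/// Hypothetical stand-in for rustc's DefId.
#[derive(Clone, Copy, Debug, PartialEq)]
struct DefId(u32);

/// Hypothetical stand-in for the parts of the type context used here.
struct Ctx {
    /// The "start" lang item, if the crate graph defines one.
    start_fn: Option<DefId>,
    /// Items for which MIR is available.
    with_mir: Vec<DefId>,
}

impl Ctx {
    fn is_mir_available(&self, id: DefId) -> bool {
        self.with_mir.contains(&id)
    }

    /// Returns the start wrapper only when its MIR can be loaded.
    fn start_wrapper(&self) -> Option<DefId> {
        self.start_fn.and_then(|start_fn| {
            if self.is_mir_available(start_fn) {
                Some(start_fn)
            } else {
                None
            }
        })
    }
}

fn main() {
    let ctx = Ctx { start_fn: Some(DefId(1)), with_mir: vec![DefId(1)] };
    println!("start wrapper: {:?}", ctx.start_wrapper());
}
```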
        
          
src/eval_context.rs (Outdated)
      | ecx.frame_mut().return_lvalue = Lvalue::from_ptr(ret_ptr); | ||
|  | ||
| loop { | ||
| if !ecx.step()? { | 
This is essentially a while loop
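The equivalence can be checked on a toy interpreter. `Machine` and its `step` method are hypothetical stand-ins for the evaluation context in the diff; the sketch only assumes that `step` returns `Ok(true)` while there is more work to do.

```rust
/// Hypothetical interpreter that runs a fixed number of steps.
struct Machine {
    remaining: u32,
    executed: u32,
}

impl Machine {
    /// Returns Ok(true) while there is more work, like `step` in the diff.
    fn step(&mut self) -> Result<bool, String> {
        if self.remaining == 0 {
            return Ok(false);
        }
        self.remaining -= 1;
        self.executed += 1;
        Ok(true)
    }
}

/// The shape used in the diff under review.
fn run_with_loop(m: &mut Machine) -> Result<(), String> {
    loop {
        if !m.step()? {
            break;
        }
    }
    Ok(())
}

/// The same control flow written as a while loop.
fn run_with_while(m: &mut Machine) -> Result<(), String> {
    while m.step()? {}
    Ok(())
}

fn main() -> Result<(), String> {
    let mut a = Machine { remaining: 3, executed: 0 };
    let mut b = Machine { remaining: 3, executed: 0 };
    run_with_loop(&mut a)?;
    run_with_while(&mut b)?;
    println!("loop ran {} steps, while ran {} steps", a.executed, b.executed);
    Ok(())
}
```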
        
          
src/terminator/mod.rs (Outdated)
      | Ok(()) | ||
| } | ||
|  | ||
| /// Decides whether it is okay to call the method with signature `real_sig` using signature `sig` | 
Please add FIXME to use a proper platform-specific ABI description.
I don't know what exactly this means -- why should this be platform-dependent? Rust permits using non-capturing Fn as fn on all platforms, doesn't it?
You're doing things like argument compatibility (e.g. between all pointers) which can be done properly by getting the actual ABI descriptions (once they're lifted from trans).
Well, ideally we should also have platform-independent rules for this, or at least a set of rules guaranteed to be safe on all platforms. It should eventually be possible to use miri to run a piece of unsafe code and get a guarantee that the code behaves like this on all platforms (or, at least, on all platforms where the types have the same size and alignment).
That's almost literally impossible. The only thing we can do is track down all publicly documented ABIs, encode their rules, and apply the rules of all ABIs at once, comparing the result.
You could try to approximate a common core, but it only makes sense as the intersection of all supported ABIs. It's also a bit risky to specify due to forwards compatibility.
I feel like there should be some common core that Rust pretty much guarantees to be available... unsafe code that doesn't do low-level data representation stuff shouldn't really have to care about the platform, I feel.
But this is a whole new topic, so for now, I added your FIXME :)
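For readers unfamiliar with the "non-capturing Fn as fn" case raised in this thread, the language-level behaviour looks like the sketch below. It shows only the coercion as seen from user code; how miri compares the two signatures internally is not shown, and `apply` and `double_via_closure` are names invented for this illustration.

```rust
/// Calls a plain function pointer.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

/// Passes a non-capturing closure where a function pointer is expected.
fn double_via_closure(x: i32) -> i32 {
    // The closure captures nothing, so it coerces to `fn(i32) -> i32`.
    let double: fn(i32) -> i32 = |v| v * 2;
    apply(double, x)
}

fn main() {
    println!("{}", double_via_closure(21));
}
```

A closure that captured a local variable would be rejected at the `let double: fn(i32) -> i32` line, since only non-capturing closures have a function-pointer form.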
        
          
src/terminator/mod.rs (Outdated)
      | trace!("arg_operands: {:?}", arg_operands); | ||
| match sig.abi { | ||
| Abi::Rust => { | ||
| Abi::Rust | Abi::C => { | 
This whole match is actually about RustCall (the tuple hack) vs everything else.
So do you want me to change it accordingly? Everything not mentioned is currently unimplemented!, so I left it that way.
Yeah but they all behave the same way (as long as the call and definition have the same ABI), so there is no reason to leave all of those ABIs "unimplemented".
All right, fixed that.
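The restructuring agreed on here might be sketched as follows. This is not the code from the PR: the `Abi` enum is a cut-down hypothetical version of rustc's, and `ArgPassing`, `arg_passing`, and `call_supported` are invented names. The sketch assumes, as stated in the thread, that only RustCall needs special argument handling and that every other ABI behaves the same as long as caller and callee agree.

```rust
/// Cut-down, hypothetical version of the ABI enum.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Abi {
    Rust,
    C,
    System,
    RustCall,
}

/// How the arguments of a call are handed to the callee.
#[derive(Debug, PartialEq)]
enum ArgPassing {
    /// RustCall: the trailing tuple argument is spread into separate arguments.
    UntupleLast,
    /// Every other ABI: arguments are passed through one by one.
    PassThrough,
}

/// Only RustCall is special; no ABI is left unimplemented.
fn arg_passing(abi: Abi) -> ArgPassing {
    match abi {
        Abi::RustCall => ArgPassing::UntupleLast,
        _ => ArgPassing::PassThrough,
    }
}

/// Without an ABI description, a call is only accepted when both sides agree.
fn call_supported(caller: Abi, callee: Abi) -> bool {
    caller == callee
}

fn main() {
    for abi in [Abi::Rust, Abi::C, Abi::System, Abi::RustCall] {
        println!("{:?} -> {:?}", abi, arg_passing(abi));
    }
    println!("C calling Rust supported: {}", call_supported(Abi::C, Abi::Rust));
}
```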
        
          
src/eval_context.rs (Outdated)
      | } | ||
|  | ||
| // Allocate memory for the return value. We have to do this when a stack frame was already pushed as the type code below | ||
| // calls EvalContext::substs, which needs a frame to be allocated (?!?) | 
You can use type_layout_with_substs to prevent the stack frame from being relevant
But I don't have a subst, do I?
Right. but just use an empty subst. That's what you're doing with the first frame anyway
Done.
        
          
src/eval_context.rs (Outdated)
      | } | ||
|  | ||
| // Allocate memory for the return value. We have to do this when a stack frame was already pushed as the type code below | ||
| // calls EvalContext::substs, which needs a frame to be allocated (?!?) | 
You can use the 'with substs' variant of methods to skip the frame lookup
…t C ABI calls if MIR is missing
…ing main directly This fixes the memory leaks when running a simple "Hello World" with MIR-libstd
… permitted Also properly check the "non-capturing Fn to fn" case
Also enable some tests that were disabled for no apparant reason. (The comment in zst.rs was wrong, the test was disabled also for miri execution.) Delete env_args test as the args can actually be queried with MIR-libstd (currently, they are always empty)
| Rebased branch pushed. Also, I added a test for thread-local key without dtor. | 
| I don't understand this travis failure. The same test is working perfectly fine here. | 
| I added the promised test for string formatting. I would like to use something more like the following: However, then the test suite fails with non-MIR libstd (failing to load MIR for  | 
| It only fails on I686-windows-gnu. Just ignore the test for that platform for now | 
| 
 Okay, will do. I think this leaves the error shown when  | 
| (Oh, I also changed the README to reflect that the latest rust-src does have all the files it takes for xargo to build libstd with MIR. Right now we still need to fix some permissions, which is silly; I am working on fixing that in rustup. Also, a PR for miri which lets travis run the test suite against a MIR-libstd is in the works, but it obviously depends on this PR so I won't submit it before this one gets merged.) |
| Lgtm. I'm fine with the Travis failure for now. | 
| Cool, and thanks again to @eddyb for all his help :) | 
This makes println!("String literal") mostly work when miri has a libstd with full MIR at its hands. It still complains about things not being deallocated when miri terminates; that may be related to thread-local dtors not running. println!("{}", foo) still doesn't work because that code does some crazy casts which are currently rejected by miri.
Also I figured out how to use xargo to build a MIR-libstd, so I documented that. Now there are "just" some bugs to be fixed that currently break xargo's libstd support...