From d90c45c85a39fb29d3e6c8d6d455a02dd2e2b0fa Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 6 Jun 2016 14:46:22 -0400 Subject: [PATCH 1/4] deprecating CStr::as_ptr in favor of with_ptr --- text/0000-cstr-with-ptr.md | 126 +++++++++++++++++++++++++++++++++++++ 1 file changed, 126 insertions(+) create mode 100644 text/0000-cstr-with-ptr.md diff --git a/text/0000-cstr-with-ptr.md b/text/0000-cstr-with-ptr.md new file mode 100644 index 00000000000..8434079b77c --- /dev/null +++ b/text/0000-cstr-with-ptr.md @@ -0,0 +1,126 @@ +- Feature Name: `cstr_with_ptr` +- Start Date: 2016-06-06 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +# Summary +[summary]: #summary + +Deprecate `CStr::as_ptr`, which returns a raw pointer and thereby invites unsound code, in favor of +a new method, `CStr::with_ptr`, which passes the pointer to a user-provided closure. + +# Motivation +[motivation]: #motivation + +`CString` wraps a C string and provides easy, safe access to the raw pointer to the string, for +FFI interop. The safety is correct, as there's nothing unsound about having a raw pointer around +until you dereference it. However, raw pointers are not tracked by the borrow checker, and once you +have one it's as easy in Rust as it is in C to keep it around too long (namely, after the backing +`CString` is dropped) and have a dangling pointer, which leads to unsound code. + +There are [hundreds of projects](https://users.rust-lang.org/t/you-should-stop-telling-people-that-safe-rust-is-always-safe/6094/7) +including a call like `CString::new(...).unwrap().as_ptr()`, which is evidence that this UB +mistake is widespread. + +By changing the API from a method _returning_ a raw pointer to one that runs a user-provided +closure, we make it easier to write sound code than to write unsound code. Sure, you can still +have the pointer escape the closure (see [Drawbacks](#drawbacks)), but it's a clear choice. The +API change tips the scales towards soundness. + +# Detailed design +[design]: #detailed-design + +1. Deprecate `CStr::as_ptr`, with the following message: "too easy to misuse because it returns a +raw pointer; use `with_ptr` instead". The method is stable, so it will not be removed in any 1.x +release. + +2. Add the following method to `CStr`: + + ```rust + /// Calls the provided closure, passing the inner pointer to this C string. + /// + /// The pointer will be valid for as long as `self` is and points to a contiguous region of + /// memory terminated with a 0 byte to represent the end of the string. + #[allow(deprecated)] // for the as_ptr call + fn with_ptr(&self, f: F) { + f(self.as_ptr()); + } + ``` + + For example usage see [this playpen](https://play.rust-lang.org/?gist=b6b1495ebee03fea679e95acb6b51ed6). + +3. Modify the `CStr` and `CString` examples that use `as_ptr` to use `with_ptr` instead. + + `CString::new` has the following example code: + + ```rust + use std::ffi::CString; + use std::os::raw::c_char; + + extern { fn puts(s: *const c_char); } + + fn main() { + let to_print = CString::new("Hello!").unwrap(); + unsafe { + puts(to_print.as_ptr()); + } + } + ``` + + Under this proposal, it would change to: + + ```rust + use std::ffi::CString; + use std::os::raw::c_char; + + extern { fn puts(s: *const c_char); } + + fn main() { + let to_print = CString::new("Hello!").unwrap(); + to_print.with_ptr(|p| unsafe { + puts(p); + }); + } + ``` + + There are nearly identical examples on `CString` and `CStr` themselves which would change in the + same way. + +# Drawbacks +[drawbacks]: #drawbacks + +- It deprecates another stable method, which contributes to perception of API churn. +- It adds surface area to the API of `CStr`. +- It's still rather easy to circumvent the help and write unsound code by "leaking" the pointer out +of the closure: + + ```rust + let mut ptr = ptr::null(); + { + let s = CString::new("foo").unwrap(); + s.with_ptr(|p| { ptr = p; }); + } + /* ptr is now dangling */ + ``` + +# Alternatives +[alternatives]: #alternatives + +- Do nothing. It remains very easy to write unsound code using `CString` and `CStr::as_ptr`. +- Move the `temporary_cstring_as_ptr` lint, which warns on the most common way to write said unsound +code (calling `as_ptr` on a temporary `CString`) from Clippy to rustc. + +# Unresolved questions +[unresolved]: #unresolved-questions + +- Should `with_ptr` allow the closure to return a value? It could be + + ```rust + fn with_ptr O, O>(&self, f: F) -> O { + f(self.as_ptr()) + } + ``` + + which might be convenient but would make it easier to "leak" the pointer (as easy as + `let ptr = s.with_ptr(|p| p);`). + From 6b51f5c3d02f1c0c1494fa29632609745f052cef Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Mon, 6 Jun 2016 17:40:45 -0400 Subject: [PATCH 2/4] add `unsafe fn` alternative --- text/0000-cstr-with-ptr.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-cstr-with-ptr.md b/text/0000-cstr-with-ptr.md index 8434079b77c..eeeb58b30e2 100644 --- a/text/0000-cstr-with-ptr.md +++ b/text/0000-cstr-with-ptr.md @@ -109,6 +109,8 @@ of the closure: - Do nothing. It remains very easy to write unsound code using `CString` and `CStr::as_ptr`. - Move the `temporary_cstring_as_ptr` lint, which warns on the most common way to write said unsound code (calling `as_ptr` on a temporary `CString`) from Clippy to rustc. +- Deprecate `as_ptr` and introduce a new function that does the same thing (we would have to bikeshed +the name), but mark it `unsafe` due to the potential for unsoundness in combination with unsafe code. # Unresolved questions [unresolved]: #unresolved-questions From f3be1f17e8566b538fdb7d1fd17af92d45027c5b Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Thu, 9 Jun 2016 12:06:03 -0400 Subject: [PATCH 3/4] add documentation alternatives --- text/0000-cstr-with-ptr.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-cstr-with-ptr.md b/text/0000-cstr-with-ptr.md index eeeb58b30e2..12bc09d13b7 100644 --- a/text/0000-cstr-with-ptr.md +++ b/text/0000-cstr-with-ptr.md @@ -107,6 +107,7 @@ of the closure: [alternatives]: #alternatives - Do nothing. It remains very easy to write unsound code using `CString` and `CStr::as_ptr`. +- Just make the warnings louder and the font larger in the docs for `CStr::as_ptr`. Continue to assume that people read this documentation despite the fact that the documentation examples show correct usage, but incorrect usage abounds in the ecosystem. - Move the `temporary_cstring_as_ptr` lint, which warns on the most common way to write said unsound code (calling `as_ptr` on a temporary `CString`) from Clippy to rustc. - Deprecate `as_ptr` and introduce a new function that does the same thing (we would have to bikeshed From 84043ca9aafc13fdf1defd57b3ef003ad2d4ae6d Mon Sep 17 00:00:00 2001 From: Alex Burka Date: Thu, 9 Jun 2016 12:08:54 -0400 Subject: [PATCH 4/4] add unresolved question about temporaries --- text/0000-cstr-with-ptr.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-cstr-with-ptr.md b/text/0000-cstr-with-ptr.md index 12bc09d13b7..7c5971d9d4d 100644 --- a/text/0000-cstr-with-ptr.md +++ b/text/0000-cstr-with-ptr.md @@ -127,3 +127,5 @@ the name), but mark it `unsafe` due to the potential for unsoundness in combinat which might be convenient but would make it easier to "leak" the pointer (as easy as `let ptr = s.with_ptr(|p| p);`). +- Does `f(CString::new(...).unwrap().as_ptr())` actually invoke undefined behavior, if `f` doesn't store the pointer? The author's reading of the Rust reference implies that the `CString` temporary is kept alive for the entire expression, so it's fine. However, some commenters in the RFC thread have opined that the behavior of this code is unspecified at best. +