-
Notifications
You must be signed in to change notification settings - Fork 29
[Coding Guideline]: Ensure reads of union fields produce valid values #300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Coding Guideline]: Ensure reads of union fields produce valid values #300
Conversation
✅ Deploy Preview for scrc-coding-guidelines ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
9dc2bcd to
ea2dcf2
Compare
|
@PLeVasseur @felix91gr I did a final cleanup and this rule LGTM. Please review & approve |
iglesias
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of comments. Otherwise it was all clear. One is a question about a line I wasn't sure about and the other just a typo fix suggestion.
src/coding-guidelines/types-and-traits/gui_UnionFieldValidity.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_UnionFieldValidity.rst.inc
Outdated
Show resolved
Hide resolved
Alright, it goes into the queue. |
src/coding-guidelines/types-and-traits/gui_UnionFieldValidity.rst.inc
Outdated
Show resolved
Hide resolved
Hey @felix91gr -- did you take a look at this one yet? I'd like it to be reviewed before we queue it up (I'll try to make time to review some of @rcseacord's latest ones tomorrow, this included, if it's not been reviewed by you yet) |
@PLeVasseur ah, sorry! I forgot we have a Merge Queue as well. I meant it as my own task queue, my bad. I haven't reviewed it yet |
PLeVasseur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey thanks for pulling this together @manhatsu and @rcseacord
Given that adding the bibliography bit touches quite a few other things, I'd suggest backing that out of this PR if you'd like the guideline to be reviewed on its merits and merged sooner than later.
I've opened issue #306 to track the work towards the work for a bibliography.
src/coding-guidelines/types-and-traits/gui_UnionFieldValidity.rst.inc
Outdated
Show resolved
Hide resolved
0431525 to
696e633
Compare
|
There is a duplication of URL in these two documents: |
Yeah, you're right. I didn't have this implemented correctly. Duplicate URLs should be fine, what I wanted to have was for the same URL to force consistent referencing to be used. I've pushed #335 which should fix this. Please take a look at the following and rebase: |
added new guideline on unions
repaired bibliography
delete unused citations
fixing up bibliography
interim save
cleaned up
…rst.inc Co-authored-by: Fernando José <fernando.iglesiasg@gmail.com>
696e633 to
99551c2
Compare
|
I rebased this on |
PLeVasseur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat a procedural review to get this in-line with current best practices as outlined in CONTRIBUTING.md.
I haven't reviewed the contents of the guideline yet.
Please update and I can then take a look through content.
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
src/coding-guidelines/types-and-traits/gui_0cuTYG8RVYjg.rst.inc
Outdated
Show resolved
Hide resolved
|
@guidelines-bot /claim |
|
✅ @PLeVasseur has claimed this review (previously: @PLeVasseur, @felix91gr). |
PLeVasseur
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, have now reviewed the contents.
Overall this looks really good.
Please take a look at the comments and suggestions I left.
| .. rust-example:: | ||
| union IntOrBool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this? The goal is to show this in a situation / light that mimics code we might see used.
- We make the union
#[repr(C)]as most of the time we'll be doing this to interact over FFI - We put
// SAFETY:comments above usages of unsafe; still kept in there that we're therefore compliant - We use
assert_eq!()instead ofprintln!()so that we can test for equivalence
#[repr(C)]
#[derive(Copy, Clone)]
union IntOrBoolData {
i: u8,
b: bool,
}
/// Tracks which field of the union is currently active.
#[derive(Clone, Copy, PartialEq, Eq)]
enum ActiveField {
Int,
Bool,
}
/// A union wrapper that tracks the active field at runtime.
pub struct IntOrBool {
data: IntOrBoolData,
active: ActiveField,
}
impl IntOrBool {
pub fn from_int(value: u8) -> Self {
Self {
data: IntOrBoolData { i: value },
active: ActiveField::Int,
}
}
pub fn from_bool(value: bool) -> Self {
Self {
data: IntOrBoolData { b: value },
active: ActiveField::Bool,
}
}
pub fn set_int(&mut self, value: u8) {
self.data.i = value;
self.active = ActiveField::Int;
}
pub fn set_bool(&mut self, value: bool) {
self.data.b = value;
self.active = ActiveField::Bool;
}
/// Returns the integer value if that field is active.
pub fn as_int(&self) -> Option<u8> {
match self.active {
// SAFETY: We only read `i` when we know it was last written as `i`, thus compliant
ActiveField::Int => Some(unsafe { self.data.i }),
ActiveField::Bool => None,
}
}
/// Returns the boolean value if that field is active.
pub fn as_bool(&self) -> Option<bool> {
match self.active {
// SAFETY: We only read `b` when we know it was last written as `b`, thus compliant
ActiveField::Bool => Some(unsafe { self.data.b }),
ActiveField::Int => None,
}
}
}
fn main() {
let mut value = IntOrBool::from_bool(true);
assert_eq!(value.as_bool(), Some(true));
assert_eq!(value.as_int(), None);
value.set_int(42);
assert_eq!(value.as_bool(), None);
assert_eq!(value.as_int(), Some(42));
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this program have to do something so it's not all optimized away? Maybe return value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it doesn't matter. It's hard to make this code do something meaningful. A good optimizer is just going to return a constant value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved by d70df8d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this program have to do something so it's not all optimized away?
Assertions are never optimized away, by design. So whatever this program asserts, it will always be tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, there are also debug assertions that are removed from the release build.
| .. rust-example:: | ||
| union IntOrBool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this?
- We make the union
#[repr(C)]as most of the time we'll be doing this to interact over FFI - We put
// SAFETY:comments above usages of unsafe; still kept in there that we're therefore compliant - We use
assert_eq!()instead ofprintln!()so that we can test for equivalence
#[repr(C)]
union IntOrBool {
i: u8,
b: bool,
}
fn main() {
let u = IntOrBool { b: true };
// SAFETY: Read the same field that was written, thus compliant
assert_eq!(unsafe { u.b }, true); // compliant
}| .. rust-example:: | ||
| union IntBytes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this?
- We make the union
#[repr(C)]as most of the time we'll be doing this to interact over FFI - We put
// SAFETY:comments above usages of unsafe; still kept in there that we're therefore compliant - We use
assert_eq!()instead ofprintln!()so that we can test for equivalence
#[repr(C)]
#[derive(Copy, Clone)]
union IntBytes {
i: u32,
bytes: [u8; 4],
}
fn main() {
let u = IntBytes { i: 0x12345678 };
// SAFETY: All bit patterns are valid for [u8; 4]
// Note: byte order depends on target endianness
assert_eq!(unsafe { u.bytes }, 0x12345678_u32.to_ne_bytes()); // compliant
let u2 = IntBytes {
bytes: [0x11, 0x22, 0x33, 0x44],
};
// SAFETY: All bit patterns are valid for 'u32'
assert_eq!(unsafe { u2.i }, u32::from_ne_bytes([0x11, 0x22, 0x33, 0x44])); // compliant
}| .. rust-example:: | ||
| union IntOrBool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this?
- We make the union
#[repr(C)]as most of the time we'll be doing this to interact over FFI - We put
// SAFETY:comments above usages of unsafe; still kept in there that we're therefore compliant - We use
assert_eq!()instead ofprintln!()so that we can test for equivalence
#[repr(C)]
union IntOrBool {
i: u8,
b: bool,
}
fn try_read_bool(u: &IntOrBool) -> Option<bool> {
// SAFETY: Reading as `u8` is always valid because all bit patterns
// are valid for `u8`, regardless of which field was last written.
let raw = unsafe { u.i };
// Validate before interpreting as `bool` (only 0 and 1 are valid)
match raw {
0 => Some(false),
1 => Some(true),
_ => None,
}
}
fn main() {
let u1 = IntOrBool { i: 1 };
let u2 = IntOrBool { i: 3 };
assert_eq!(try_read_bool(&u1), Some(true));
assert_eq!(try_read_bool(&u2), None);
}| println!("u1 as bool: {:?}", try_read_bool(&u1)); // Some(true) | ||
| println!("u2 as bool: {:?}", try_read_bool(&u2)); // None | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example might be a bit long, but tries to show that it's possible to:
- have FFI-facing code which is inherently unsafe
- have Rust codebase-facing safe versions which disallow incorrect typed reads
- enforcing this at compile-time is possible
I'm open to cutting some of this down. I think the 3rd point is the most important, if we wanted to cut it back to showcasing only that.
| .. compliant_example:: | |
| :id: compl_ex_4Z8tmqYLLjtw | |
| :status: draft | |
| Complex example showing: | |
| * use of compile-time check for valid type using generics | |
| * way to fence between FFI-facing code and rest of safe Rust codebase | |
| .. rust-example:: | |
| use std::marker::PhantomData; | |
| use std::mem::size_of; | |
| /// Marker types representing the active field. | |
| pub struct AsInt; | |
| pub struct AsBool; | |
| /// A union type which can be used to interact across FFI boundary. | |
| #[repr(C)] | |
| #[derive(Copy, Clone)] | |
| pub union IntOrBoolData { | |
| pub i: u8, | |
| pub b: bool, | |
| } | |
| /// Tag sent alongside the union from C code. | |
| #[repr(u8)] | |
| #[derive(Copy, Clone, PartialEq, Eq)] | |
| pub enum IntOrBoolTag { | |
| Int = 0, | |
| Bool = 1, | |
| } | |
| /// C-compatible tagged union as it might arrive from FFI. | |
| #[repr(C)] | |
| #[derive(Copy, Clone)] | |
| pub struct CIntOrBool { | |
| pub tag: IntOrBoolTag, | |
| pub data: IntOrBoolData, | |
| } | |
| // ============================================================================ | |
| // Safe wrapper types for use in the rest of the Rust codebase | |
| // ============================================================================ | |
| /// A union wrapper where the type parameter statically tracks the active field. | |
| /// This is zero-cost: same size as the raw union. | |
| #[repr(C)] | |
| pub struct IntOrBool<T> { | |
| data: IntOrBoolData, | |
| _marker: PhantomData<T>, | |
| } | |
| impl IntOrBool<AsInt> { | |
| pub fn from_int(value: u8) -> Self { | |
| Self { | |
| data: IntOrBoolData { i: value }, | |
| _marker: PhantomData, | |
| } | |
| } | |
| pub fn get(&self) -> u8 { | |
| // SAFETY: Type parameter `AsInt` guarantees the integer field is active | |
| unsafe { self.data.i } | |
| } | |
| /// Convert to boolean representation. | |
| /// Only valid when the integer value is 0 or 1. | |
| pub fn try_into_bool(self) -> Option<IntOrBool<AsBool>> { | |
| match self.get() { | |
| 0 | 1 => Some(IntOrBool { | |
| data: IntOrBoolData { b: self.get() == 1 }, | |
| _marker: PhantomData, | |
| }), | |
| _ => None, | |
| } | |
| } | |
| } | |
| impl IntOrBool<AsBool> { | |
| pub fn from_bool(value: bool) -> Self { | |
| Self { | |
| data: IntOrBoolData { b: value }, | |
| _marker: PhantomData, | |
| } | |
| } | |
| pub fn get(&self) -> bool { | |
| // SAFETY: Type parameter `AsBool` guarantees the boolean field is active | |
| unsafe { self.data.b } | |
| } | |
| /// Convert to integer representation. Always valid since bool is a subset of u8. | |
| pub fn into_int(self) -> IntOrBool<AsInt> { | |
| IntOrBool { | |
| data: self.data, | |
| _marker: PhantomData, | |
| } | |
| } | |
| } | |
| // ============================================================================ | |
| // FFI boundary: convert from C representation to safe Rust types | |
| // ============================================================================ | |
| /// Result of converting a C tagged union to a safe Rust type. | |
| /// The caller must handle both variants, ensuring type safety. | |
| pub enum SafeIntOrBool { | |
| Int(IntOrBool<AsInt>), | |
| Bool(IntOrBool<AsBool>), | |
| } | |
| impl CIntOrBool { | |
| /// Convert from C representation to safe Rust type at the FFI boundary. | |
| /// After this point, all code uses the type-safe wrappers. | |
| pub fn into_safe(self) -> SafeIntOrBool { | |
| match self.tag { | |
| IntOrBoolTag::Int => { | |
| // SAFETY: Tag guarantees integer field is active | |
| let value = unsafe { self.data.i }; | |
| SafeIntOrBool::Int(IntOrBool::from_int(value)) | |
| } | |
| IntOrBoolTag::Bool => { | |
| // SAFETY: Tag guarantees boolean field is active | |
| let value = unsafe { self.data.b }; | |
| SafeIntOrBool::Bool(IntOrBool::from_bool(value)) | |
| } | |
| } | |
| } | |
| } | |
| // ============================================================================ | |
| // FFI boundary: convert from safe Rust types back to C representation | |
| // ============================================================================ | |
| impl From<IntOrBool<AsInt>> for CIntOrBool { | |
| fn from(val: IntOrBool<AsInt>) -> Self { | |
| CIntOrBool { | |
| tag: IntOrBoolTag::Int, | |
| data: IntOrBoolData { i: val.get() }, | |
| } | |
| } | |
| } | |
| impl From<IntOrBool<AsBool>> for CIntOrBool { | |
| fn from(val: IntOrBool<AsBool>) -> Self { | |
| CIntOrBool { | |
| tag: IntOrBoolTag::Bool, | |
| data: IntOrBoolData { b: val.get() }, | |
| } | |
| } | |
| } | |
| // ============================================================================ | |
| // Example: application code that uses the safe types | |
| // ============================================================================ | |
| /// Process a boolean value. This function can ONLY receive IntOrBool<AsBool>, | |
| /// so there's no possibility of reading invalid bool bytes. | |
| fn process_bool(val: IntOrBool<AsBool>) -> &'static str { | |
| if val.get() { "yes" } else { "no" } | |
| } | |
| /// Process an integer value. | |
| fn process_int(val: IntOrBool<AsInt>) -> u8 { | |
| val.get().saturating_mul(2) | |
| } | |
| // Simulated FFI functions that would normally be defined in C. | |
| // In real code, these would be `extern "C"` declarations linked to a C library. | |
| /// Simulated C function that "receives" data from C. | |
| extern "C" fn receive_from_ffi() -> CIntOrBool { | |
| CIntOrBool { | |
| tag: IntOrBoolTag::Bool, | |
| data: IntOrBoolData { b: true }, | |
| } | |
| } | |
| /// Simulated C function that "sends" data to C. | |
| extern "C" fn send_to_ffi(data: CIntOrBool) { | |
| // In real code, this would be implemented in C | |
| match data.tag { | |
| IntOrBoolTag::Int => { | |
| let i = unsafe { data.data.i }; | |
| assert_eq!(i, 84); | |
| } | |
| IntOrBoolTag::Bool => { | |
| let b = unsafe { data.data.b }; | |
| assert!(b); | |
| } | |
| } | |
| } | |
| fn main() { | |
| // Prove zero-cost: PhantomData adds no size | |
| assert_eq!(size_of::<IntOrBoolData>(), size_of::<IntOrBool<AsInt>>()); | |
| assert_eq!(size_of::<IntOrBoolData>(), size_of::<IntOrBool<AsBool>>()); | |
| assert_eq!(size_of::<IntOrBoolData>(), 1); // Just one byte | |
| // === FFI boundary: receive from C === | |
| let from_c = receive_from_ffi(); | |
| let safe_value = from_c.into_safe(); | |
| // === Application code: fully type-safe, no unsafe === | |
| match safe_value { | |
| SafeIntOrBool::Bool(b) => { | |
| // Can only call process_bool with IntOrBool<AsBool> | |
| assert_eq!(process_bool(b), "yes"); | |
| } | |
| SafeIntOrBool::Int(i) => { | |
| // Can only call process_int with IntOrBool<AsInt> | |
| let _ = process_int(i); | |
| } | |
| } | |
| // === Type-safe conversions within Rust === | |
| let int_val = IntOrBool::from_int(1); | |
| // Cannot pass IntOrBool<AsInt> to process_bool - won't compile: | |
| // process_bool(int_val); // Error: expected IntOrBool<AsBool>, found IntOrBool<AsInt> | |
| // Must explicitly convert, which validates the value | |
| if let Some(bool_val) = int_val.try_into_bool() { | |
| assert_eq!(process_bool(bool_val), "yes"); | |
| } | |
| // Invalid conversion is caught at the conversion point | |
| let int_val = IntOrBool::from_int(42); | |
| assert!(int_val.try_into_bool().is_none()); // 42 is not a valid bool | |
| // === FFI boundary: send back to C === | |
| let int_val = IntOrBool::from_int(42); | |
| let doubled = IntOrBool::from_int(process_int(int_val)); | |
| send_to_ffi(doubled.into()); | |
| } |
| * last written through that field, or | ||
| * written through a field whose bytes are valid when reinterpreted as the target field's type | ||
|
|
||
| If the active field is uncertain, use explicit validity checks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should make another required rule which enforces that explicit validity checks are performed before any reads if we don't force that here.
It's important to enforce that anyone writing code be encouraged to write a deviation report.
updated example based on comment from @PLeVasseur
Added new guideline on unions by @rcseacord.
This is the same content as #270 but moved to the feature branch.