Skip to content

Conversation

@manhatsu
Copy link
Contributor

Added new guideline on unions by @rcseacord.

This is the same content as #270 but moved to the feature branch.

@netlify
Copy link

netlify bot commented Dec 16, 2025

Deploy Preview for scrc-coding-guidelines ready!

Name Link
🔨 Latest commit d70df8d
🔍 Latest deploy log https://app.netlify.com/projects/scrc-coding-guidelines/deploys/694c185ca0136200082c6728
😎 Deploy Preview https://deploy-preview-300--scrc-coding-guidelines.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@rcseacord
Copy link
Collaborator

@PLeVasseur @felix91gr I did a final cleanup and this rule LGTM. Please review & approve

@rcseacord rcseacord self-assigned this Dec 16, 2025
Copy link
Contributor

@iglesias iglesias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments. Otherwise it was all clear. One is a question about a line I wasn't sure about and the other just a typo fix suggestion.

@felix91gr
Copy link
Collaborator

@PLeVasseur @felix91gr I did a final cleanup and this rule LGTM. Please review & approve

Alright, it goes into the queue.

@PLeVasseur
Copy link
Collaborator

PLeVasseur commented Dec 17, 2025

@PLeVasseur @felix91gr I did a final cleanup and this rule LGTM. Please review & approve

Alright, it goes into the queue.

Hey @felix91gr -- did you take a look at this one yet? I'd like it to be reviewed before we queue it up

(I'll try to make time to review some of @rcseacord's latest ones tomorrow, this included, if it's not been reviewed by you yet)

@felix91gr
Copy link
Collaborator

I'd like it to be reviewed before we queue it up

@PLeVasseur ah, sorry! I forgot we have a Merge Queue as well. I meant it as my own task queue, my bad. I haven't reviewed it yet

Copy link
Collaborator

@PLeVasseur PLeVasseur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey thanks for pulling this together @manhatsu and @rcseacord

Given that adding the bibliography bit touches quite a few other things, I'd suggest backing that out of this PR if you'd like the guideline to be reviewed on its merits and merged sooner than later.

I've opened issue #306 to track the work towards the work for a bibliography.

@manhatsu manhatsu force-pushed the doc/union-field-validation branch from 0431525 to 696e633 Compare December 19, 2025 02:27
@manhatsu
Copy link
Contributor Author

There is a duplication of URL in these two documents:
expressions/gui_Bib7x9KmPq2nL.rst.inc and types-and-traits/gui_0cuTYG8RVYjg.rst.inc.
bibliography.md says I should consider this, but how can I actually solve this?
I think it's fine if both have the source entry...

@manhatsu manhatsu requested a review from PLeVasseur December 19, 2025 02:41
@PLeVasseur
Copy link
Collaborator

There is a duplication of URL in these two documents: expressions/gui_Bib7x9KmPq2nL.rst.inc and types-and-traits/gui_0cuTYG8RVYjg.rst.inc. bibliography.md says I should consider this, but how can I actually solve this? I think it's fine if both have the source entry...

Yeah, you're right. I didn't have this implemented correctly. Duplicate URLs should be fine, what I wanted to have was for the same URL to force consistent referencing to be used.

I've pushed #335 which should fix this.

Please take a look at the following and rebase:
https://github.com/rustfoundation/safety-critical-rust-coding-guidelines/blob/main/docs/bibliography.md

@PLeVasseur PLeVasseur force-pushed the doc/union-field-validation branch from 696e633 to 99551c2 Compare December 23, 2025 22:35
@PLeVasseur
Copy link
Collaborator

I rebased this on main. It's building and working now.

Copy link
Collaborator

@PLeVasseur PLeVasseur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat a procedural review to get this in-line with current best practices as outlined in CONTRIBUTING.md.

I haven't reviewed the contents of the guideline yet.

Please update and I can then take a look through content.

@manhatsu manhatsu requested a review from PLeVasseur December 24, 2025 00:16
@PLeVasseur
Copy link
Collaborator

@guidelines-bot /claim

@github-actions github-actions bot requested review from PLeVasseur and removed request for PLeVasseur and felix91gr December 24, 2025 14:07
@github-actions
Copy link
Contributor

@PLeVasseur has claimed this review (previously: @PLeVasseur, @felix91gr).

Copy link
Collaborator

@PLeVasseur PLeVasseur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, have now reviewed the contents.

Overall this looks really good.

Please take a look at the comments and suggestions I left.

.. rust-example::
union IntOrBool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? The goal is to show this in a situation / light that mimics code we might see used.

  1. We make the union #[repr(C)] as most of the time we'll be doing this to interact over FFI
  2. We put // SAFETY: comments above usages of unsafe; still kept in there that we're therefore compliant
  3. We use assert_eq!() instead of println!() so that we can test for equivalence
#[repr(C)]
#[derive(Copy, Clone)]
union IntOrBoolData {
    i: u8,
    b: bool,
}

/// Tracks which field of the union is currently active.
#[derive(Clone, Copy, PartialEq, Eq)]
enum ActiveField {
    Int,
    Bool,
}

/// A union wrapper that tracks the active field at runtime.
pub struct IntOrBool {
    data: IntOrBoolData,
    active: ActiveField,
}

impl IntOrBool {
    pub fn from_int(value: u8) -> Self {
        Self {
            data: IntOrBoolData { i: value },
            active: ActiveField::Int,
        }
    }

    pub fn from_bool(value: bool) -> Self {
        Self {
            data: IntOrBoolData { b: value },
            active: ActiveField::Bool,
        }
    }

    pub fn set_int(&mut self, value: u8) {
        self.data.i = value;
        self.active = ActiveField::Int;
    }

    pub fn set_bool(&mut self, value: bool) {
        self.data.b = value;
        self.active = ActiveField::Bool;
    }

    /// Returns the integer value if that field is active.
    pub fn as_int(&self) -> Option<u8> {
        match self.active {
            // SAFETY: We only read `i` when we know it was last written as `i`, thus compliant
            ActiveField::Int => Some(unsafe { self.data.i }),
            ActiveField::Bool => None,
        }
    }

    /// Returns the boolean value if that field is active.
    pub fn as_bool(&self) -> Option<bool> {
        match self.active {
            // SAFETY: We only read `b` when we know it was last written as `b`, thus compliant
            ActiveField::Bool => Some(unsafe { self.data.b }),
            ActiveField::Int => None,
        }
    }
}

fn main() {
    let mut value = IntOrBool::from_bool(true);
    assert_eq!(value.as_bool(), Some(true));
    assert_eq!(value.as_int(), None);

    value.set_int(42);
    assert_eq!(value.as_bool(), None);
    assert_eq!(value.as_int(), Some(42));
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this program have to do something so it's not all optimized away? Maybe return value?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it doesn't matter. It's hard to make this code do something meaningful. A good optimizer is just going to return a constant value.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved by d70df8d

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this program have to do something so it's not all optimized away?

Assertions are never optimized away, by design. So whatever this program asserts, it will always be tested.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, there are also debug assertions that are removed from the release build.

.. rust-example::
union IntOrBool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

  1. We make the union #[repr(C)] as most of the time we'll be doing this to interact over FFI
  2. We put // SAFETY: comments above usages of unsafe; still kept in there that we're therefore compliant
  3. We use assert_eq!() instead of println!() so that we can test for equivalence
#[repr(C)]
union IntOrBool {
    i: u8,
    b: bool,
}
fn main() {
    let u = IntOrBool { b: true };
    
    // SAFETY: Read the same field that was written, thus compliant
    assert_eq!(unsafe { u.b }, true); // compliant
}

.. rust-example::
union IntBytes {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

  1. We make the union #[repr(C)] as most of the time we'll be doing this to interact over FFI
  2. We put // SAFETY: comments above usages of unsafe; still kept in there that we're therefore compliant
  3. We use assert_eq!() instead of println!() so that we can test for equivalence
#[repr(C)]
#[derive(Copy, Clone)]
union IntBytes {
    i: u32,
    bytes: [u8; 4],
}

fn main() {
    let u = IntBytes { i: 0x12345678 };

    // SAFETY: All bit patterns are valid for [u8; 4]
    // Note: byte order depends on target endianness
    assert_eq!(unsafe { u.bytes }, 0x12345678_u32.to_ne_bytes()); // compliant

    let u2 = IntBytes {
        bytes: [0x11, 0x22, 0x33, 0x44],
    };

    // SAFETY: All bit patterns are valid for 'u32'
    assert_eq!(unsafe { u2.i }, u32::from_ne_bytes([0x11, 0x22, 0x33, 0x44])); // compliant
}

.. rust-example::
union IntOrBool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

  1. We make the union #[repr(C)] as most of the time we'll be doing this to interact over FFI
  2. We put // SAFETY: comments above usages of unsafe; still kept in there that we're therefore compliant
  3. We use assert_eq!() instead of println!() so that we can test for equivalence
#[repr(C)]
union IntOrBool {
    i: u8,
    b: bool,
}

fn try_read_bool(u: &IntOrBool) -> Option<bool> {
    // SAFETY: Reading as `u8` is always valid because all bit patterns
    // are valid for `u8`, regardless of which field was last written.
    let raw = unsafe { u.i };

    // Validate before interpreting as `bool` (only 0 and 1 are valid)
    match raw {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    let u1 = IntOrBool { i: 1 };
    let u2 = IntOrBool { i: 3 };

    assert_eq!(try_read_bool(&u1), Some(true));
    assert_eq!(try_read_bool(&u2), None);
}

println!("u1 as bool: {:?}", try_read_bool(&u1)); // Some(true)
println!("u2 as bool: {:?}", try_read_bool(&u2)); // None
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example might be a bit long, but tries to show that it's possible to:

  1. have FFI-facing code which is inherently unsafe
  2. have Rust codebase-facing safe versions which disallow incorrect typed reads
  3. enforcing this at compile-time is possible

I'm open to cutting some of this down. I think the 3rd point is the most important, if we wanted to cut it back to showcasing only that.

Suggested change
.. compliant_example::
:id: compl_ex_4Z8tmqYLLjtw
:status: draft
Complex example showing:
* use of compile-time check for valid type using generics
* way to fence between FFI-facing code and rest of safe Rust codebase
.. rust-example::
use std::marker::PhantomData;
use std::mem::size_of;
/// Marker types representing the active field.
pub struct AsInt;
pub struct AsBool;
/// A union type which can be used to interact across FFI boundary.
#[repr(C)]
#[derive(Copy, Clone)]
pub union IntOrBoolData {
pub i: u8,
pub b: bool,
}
/// Tag sent alongside the union from C code.
#[repr(u8)]
#[derive(Copy, Clone, PartialEq, Eq)]
pub enum IntOrBoolTag {
Int = 0,
Bool = 1,
}
/// C-compatible tagged union as it might arrive from FFI.
#[repr(C)]
#[derive(Copy, Clone)]
pub struct CIntOrBool {
pub tag: IntOrBoolTag,
pub data: IntOrBoolData,
}
// ============================================================================
// Safe wrapper types for use in the rest of the Rust codebase
// ============================================================================
/// A union wrapper where the type parameter statically tracks the active field.
/// This is zero-cost: same size as the raw union.
#[repr(C)]
pub struct IntOrBool<T> {
data: IntOrBoolData,
_marker: PhantomData<T>,
}
impl IntOrBool<AsInt> {
pub fn from_int(value: u8) -> Self {
Self {
data: IntOrBoolData { i: value },
_marker: PhantomData,
}
}
pub fn get(&self) -> u8 {
// SAFETY: Type parameter `AsInt` guarantees the integer field is active
unsafe { self.data.i }
}
/// Convert to boolean representation.
/// Only valid when the integer value is 0 or 1.
pub fn try_into_bool(self) -> Option<IntOrBool<AsBool>> {
match self.get() {
0 | 1 => Some(IntOrBool {
data: IntOrBoolData { b: self.get() == 1 },
_marker: PhantomData,
}),
_ => None,
}
}
}
impl IntOrBool<AsBool> {
pub fn from_bool(value: bool) -> Self {
Self {
data: IntOrBoolData { b: value },
_marker: PhantomData,
}
}
pub fn get(&self) -> bool {
// SAFETY: Type parameter `AsBool` guarantees the boolean field is active
unsafe { self.data.b }
}
/// Convert to integer representation. Always valid since bool is a subset of u8.
pub fn into_int(self) -> IntOrBool<AsInt> {
IntOrBool {
data: self.data,
_marker: PhantomData,
}
}
}
// ============================================================================
// FFI boundary: convert from C representation to safe Rust types
// ============================================================================
/// Result of converting a C tagged union to a safe Rust type.
/// The caller must handle both variants, ensuring type safety.
pub enum SafeIntOrBool {
Int(IntOrBool<AsInt>),
Bool(IntOrBool<AsBool>),
}
impl CIntOrBool {
/// Convert from C representation to safe Rust type at the FFI boundary.
/// After this point, all code uses the type-safe wrappers.
pub fn into_safe(self) -> SafeIntOrBool {
match self.tag {
IntOrBoolTag::Int => {
// SAFETY: Tag guarantees integer field is active
let value = unsafe { self.data.i };
SafeIntOrBool::Int(IntOrBool::from_int(value))
}
IntOrBoolTag::Bool => {
// SAFETY: Tag guarantees boolean field is active
let value = unsafe { self.data.b };
SafeIntOrBool::Bool(IntOrBool::from_bool(value))
}
}
}
}
// ============================================================================
// FFI boundary: convert from safe Rust types back to C representation
// ============================================================================
impl From<IntOrBool<AsInt>> for CIntOrBool {
fn from(val: IntOrBool<AsInt>) -> Self {
CIntOrBool {
tag: IntOrBoolTag::Int,
data: IntOrBoolData { i: val.get() },
}
}
}
impl From<IntOrBool<AsBool>> for CIntOrBool {
fn from(val: IntOrBool<AsBool>) -> Self {
CIntOrBool {
tag: IntOrBoolTag::Bool,
data: IntOrBoolData { b: val.get() },
}
}
}
// ============================================================================
// Example: application code that uses the safe types
// ============================================================================
/// Process a boolean value. This function can ONLY receive IntOrBool<AsBool>,
/// so there's no possibility of reading invalid bool bytes.
fn process_bool(val: IntOrBool<AsBool>) -> &'static str {
if val.get() { "yes" } else { "no" }
}
/// Process an integer value.
fn process_int(val: IntOrBool<AsInt>) -> u8 {
val.get().saturating_mul(2)
}
// Simulated FFI functions that would normally be defined in C.
// In real code, these would be `extern "C"` declarations linked to a C library.
/// Simulated C function that "receives" data from C.
extern "C" fn receive_from_ffi() -> CIntOrBool {
CIntOrBool {
tag: IntOrBoolTag::Bool,
data: IntOrBoolData { b: true },
}
}
/// Simulated C function that "sends" data to C.
extern "C" fn send_to_ffi(data: CIntOrBool) {
// In real code, this would be implemented in C
match data.tag {
IntOrBoolTag::Int => {
let i = unsafe { data.data.i };
assert_eq!(i, 84);
}
IntOrBoolTag::Bool => {
let b = unsafe { data.data.b };
assert!(b);
}
}
}
fn main() {
// Prove zero-cost: PhantomData adds no size
assert_eq!(size_of::<IntOrBoolData>(), size_of::<IntOrBool<AsInt>>());
assert_eq!(size_of::<IntOrBoolData>(), size_of::<IntOrBool<AsBool>>());
assert_eq!(size_of::<IntOrBoolData>(), 1); // Just one byte
// === FFI boundary: receive from C ===
let from_c = receive_from_ffi();
let safe_value = from_c.into_safe();
// === Application code: fully type-safe, no unsafe ===
match safe_value {
SafeIntOrBool::Bool(b) => {
// Can only call process_bool with IntOrBool<AsBool>
assert_eq!(process_bool(b), "yes");
}
SafeIntOrBool::Int(i) => {
// Can only call process_int with IntOrBool<AsInt>
let _ = process_int(i);
}
}
// === Type-safe conversions within Rust ===
let int_val = IntOrBool::from_int(1);
// Cannot pass IntOrBool<AsInt> to process_bool - won't compile:
// process_bool(int_val); // Error: expected IntOrBool<AsBool>, found IntOrBool<AsInt>
// Must explicitly convert, which validates the value
if let Some(bool_val) = int_val.try_into_bool() {
assert_eq!(process_bool(bool_val), "yes");
}
// Invalid conversion is caught at the conversion point
let int_val = IntOrBool::from_int(42);
assert!(int_val.try_into_bool().is_none()); // 42 is not a valid bool
// === FFI boundary: send back to C ===
let int_val = IntOrBool::from_int(42);
let doubled = IntOrBool::from_int(process_int(int_val));
send_to_ffi(doubled.into());
}

* last written through that field, or
* written through a field whose bytes are valid when reinterpreted as the target field's type

If the active field is uncertain, use explicit validity checks.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make another required rule which enforces that explicit validity checks are performed before any reads if we don't force that here.

It's important to enforce that anyone writing code be encouraged to write a deviation report.

updated example based on comment from @PLeVasseur
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants