When Ferrous Metals Corrode, pt. IV

Intro

This part summarizes the fifth chapter of "Programming Rust, 2nd Edition", "References". References came up in earlier parts; this chapter adds detail on shared and mutable references and on lifetimes.

References to Values

We have encountered references previously. There are two types:

Shared

Read-only references: &T. There can be any number of them to the same var at the same time, and they are Copy.

Mutable

Read/write references: &mut T. There can only ever be one active at a time, and mutable refs are not Copy.

Only ever having one writer to a memory location helps with memory safety, e.g. by ruling out torn writes and memory corruption.
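To make the two kinds of reference concrete, here is a minimal sketch. The helper names sum and push_twice are made up for illustration and do not come from the book.

```rust
// Read through a shared reference; the caller keeps ownership.
fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

// Write through a mutable reference; requires exclusive access.
fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);
    v.push(x);
}

fn main() {
    let mut v = vec![1, 2, 3];

    // Any number of shared references may coexist, and &T is Copy.
    let a = &v;
    let b = a; // copies the reference, not the vector
    assert_eq!(sum(a) + sum(b), 12);

    // The shared refs are not used past this point, so a mutable
    // borrow is allowed now.
    push_twice(&mut v, 4);
    assert_eq!(v, [1, 2, 3, 4, 4]);
}
```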

Conveniently, the dot operator (v.field) automatically dereferences if necessary. Iterating over a shared reference to a collection produces shared refs to its elements, e.g. to the keys and values of a map.

References may be nested, i.e. you can have references to references. The dot operator will automatically deref any level of references.
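A small sketch of the implicit dereferencing described above. The Point struct and the x_plus_y helper are assumptions for illustration only; taking a &&&Point parameter is deliberately contrived.

```rust
struct Point {
    x: i32,
    y: i32,
}

// Takes a reference to a reference to a reference, purely to show
// that the dot operator follows all three levels.
fn x_plus_y(rrr: &&&Point) -> i32 {
    rrr.x + rrr.y
}

fn main() {
    let p = Point { x: 1000, y: 729 };
    let r = &p;
    let rr = &r;
    let rrr = &rr;
    assert_eq!(x_plus_y(rrr), 1729);

    // Iterating over a shared reference to a Vec yields shared refs
    // to the elements.
    let v = vec![10, 20];
    for item in &v {
        let n: &i32 = item;
        assert!(*n >= 10);
    }
}
```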

Comparing references auto-derefs as well: &x == &y holds if the values of x and y are equal. There's the std::ptr::eq function if we actually want to compare memory addresses.
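The difference between comparing values and comparing addresses can be sketched like this. The wrapper names same_value and same_address are made up; std::ptr::eq is the standard library function mentioned above.

```rust
// == on references compares the values they point to.
fn same_value(a: &i32, b: &i32) -> bool {
    a == b
}

// std::ptr::eq compares the addresses instead.
fn same_address(a: &i32, b: &i32) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let x = 10;
    let y = 10;

    assert!(same_value(&x, &y)); // equal values
    assert!(!same_address(&x, &y)); // but two distinct variables
    assert!(same_address(&x, &x));
}
```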

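References can never be null; there is no equivalent of C's NULL (only raw pointers, which can only be dereferenced in unsafe code, may be null). Use Option<&T> if you need a reference that might be absent, i.e. None.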
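As a sketch of the Option<&T> pattern, the following hypothetical find_even helper returns a reference to the first even element of a slice, or None if there is none.

```rust
use std::mem::size_of;

// Returns a reference into the slice, or None if no element is even.
fn find_even(v: &[i32]) -> Option<&i32> {
    v.iter().find(|n| **n % 2 == 0)
}

fn main() {
    let v = [1, 4, 5];

    // The caller is forced to handle the "no reference" case.
    match find_even(&v) {
        Some(n) => assert_eq!(*n, 4),
        None => unreachable!("v contains an even number"),
    }
    assert_eq!(find_even(&[1, 3]), None);

    // Since references are never null, None can use the null bit
    // pattern: Option<&T> is no larger than &T.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());
}
```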

Interestingly, references can be borrowed from arbitrary expressions, e.g. the result of a function call. Rust will create an anonymous variable to hold the value and borrow the ref from that.
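A short sketch of borrowing from an expression, modelled on the factorial example used in the book (the function body here is my own):

```rust
fn factorial(n: u64) -> u64 {
    (1..=n).product()
}

fn main() {
    // Rust stores the call's result in an anonymous variable that
    // lives as long as r does.
    let r = &factorial(6);
    assert_eq!(*r, 720);

    // Arithmetic operators see through one level of reference;
    // &1009 borrows from an anonymous value as well.
    assert_eq!(r + &1009, 1729);
}
```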

References can be single addresses or fat pointers, e.g. references to slices, which carry an address and a length.
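The fat pointer layout shows up in the size of the reference type. This is a sketch of what current compilers do rather than something to build on; ref_sizes is a made-up helper.

```rust
use std::mem::size_of;

// Sizes in bytes of a thin reference and of a slice reference.
fn ref_sizes() -> (usize, usize) {
    (size_of::<&i32>(), size_of::<&[i32]>())
}

fn main() {
    let (thin, fat) = ref_sizes();
    let word = size_of::<usize>();

    assert_eq!(thin, word); // just an address
    assert_eq!(fat, 2 * word); // address plus length

    // The length travels with the reference, not with the data.
    let v = [1, 2, 3, 4];
    let s: &[i32] = &v[1..3];
    assert_eq!(s.len(), 2);
}
```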

Reference Safety

In order to keep references safe, Rust imposes some rules, e.g. you can't borrow a ref to a local var and take it out of the var's scope. Rust assigns each ref a lifetime during compilation; this is part of its type. If you have a var x and a var r holding a ref to x, there are three lifetimes in play:

  1. Lifetime of x

  2. Lifetime of r

  3. Lifetime of the reference type

The compiler enforces several rules to make this safe:

  1. The lifetime of &x mustn't outlive x itself.

  2. Also, if you store &x in r, the ref type's lifetime must cover the whole lifetime of r.

The first rule constrains the max. lifetime, while the second constrains the min. lifetime.
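The two rules can be sketched with nested scopes. The read_inner function is hypothetical; the rejected variant is left as a comment because it does not compile.

```rust
// The reference r lives entirely inside the lifetime of x, so both
// rules are satisfied.
fn read_inner() -> i32 {
    let x = 1;
    let copy;
    {
        let r = &x; // x outlives r
        copy = *r;
    }
    copy
}

fn main() {
    assert_eq!(read_inner(), 1);

    // Swapping the scopes breaks rule 1: the reference would have to
    // outlive x to stay valid for all of r.
    //
    // let r;
    // {
    //     let x = 1;
    //     r = &x; // error[E0597]: `x` does not live long enough
    // }
    // assert_eq!(*r, 1);
}
```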

Lifetimes for function params can also be spelt out explicitly. In this example func f takes any lifetime:

fn f<'a>(p: &'a i32) { ... }

Here we promise that f works for any lifetime 'a at all, even the smallest possible one, which just encloses the call to f. Since we promise this, the compiler will check that f's use of p stays within that constraint.

In the following example we specify that p must have the 'static lifetime, which means the referent must live for the whole program execution:

fn f(p: &'static i32) { ... }
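A runnable sketch of the 'static case. The names ANSWER and read_static are assumptions for illustration; the rejected call is left as a comment.

```rust
// A static lives for the whole program, so &ANSWER is &'static i32.
static ANSWER: i32 = 42;

// Only accepts references that are valid for the whole program.
fn read_static(p: &'static i32) -> i32 {
    *p
}

fn main() {
    assert_eq!(read_static(&ANSWER), 42);

    // A reference to a local does not qualify:
    //
    // let x = 1;
    // read_static(&x); // error[E0597]: `x` does not live long enough
}
```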

When a func takes and returns a single ref, Rust will assume they must have the same lifetime.
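For the single-ref case, the elided and the spelt-out signatures mean the same thing. A minimal sketch with made-up function names:

```rust
// Elided form: one reference in, one reference out.
// Panics if the slice is empty.
fn first_elem(v: &[i32]) -> &i32 {
    &v[0]
}

// What Rust assumes for the signature above: the returned reference
// has the same lifetime as the argument.
fn first_elem_explicit<'a>(v: &'a [i32]) -> &'a i32 {
    &v[0]
}

fn main() {
    let v = vec![7, 8, 9];
    assert_eq!(*first_elem(&v), 7);
    assert_eq!(*first_elem_explicit(&v), 7);
}
```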

Lifetimes must be explicitly specified if a ref type appears inside another type's definition, e.g. in a struct:

struct S1 {
    r: &'static i32
}

struct S2<'a> {
    r: &'a i32
}

Careful, if you need independent lifetimes you need to spell those out as well:

struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32
}

If both refs used the same lifetime 'a, Rust would have to find one single lifetime that fits both .x and .y, which ties the two together.
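Here is a sketch of why independent lifetimes matter, loosely following the book's example. The pick_x helper is an assumption added for illustration.

```rust
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32,
}

// The returned reference carries 'a only, so it is not tied to the
// lifetime of .y or to the borrow of s itself.
fn pick_x<'a, 'b>(s: &S<'a, 'b>) -> &'a i32 {
    s.x
}

fn main() {
    let x = 10;
    let r;
    {
        let y = 20;
        let s = S { x: &x, y: &y };
        assert_eq!(*s.y, 20);
        r = pick_x(&s);
    } // y and s end here, x lives on

    // Fine: r only depends on x. With a single lifetime for both
    // fields this would be rejected, since y does not live this long.
    assert_eq!(*r, 10);
}
```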

Sharing Versus Mutation

To prevent dangling pointers, Rust will catch the case where a var is moved while a ref to it is still in use, and will complain.
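A sketch of the move check. The consume function is hypothetical; the rejected line is left as a comment.

```rust
// Takes ownership of the vector.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    let v = vec![4, 8, 19];
    let r = &v;
    assert_eq!(r[0], 4); // last use of r

    // The borrow has ended, so moving v is fine.
    let n = consume(v);
    assert_eq!(n, 3);

    // Using r after the move would be rejected:
    // assert_eq!(r[0], 4);
    // error[E0505]: cannot move out of `v` because it is borrowed
}
```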

Another safety mechanism is the set of rules around shared and mut refs. If you have a ref to a value, the following holds depending on the type of ref:

Shared access

The var is read-only. Across the lifetime of the shared ref, nothing may change the variable.

Mutable access

Exclusive access to the var. Across the lifetime of a mut ref, no other usable path to the var exists.

This also prevents bugs such as those seen with C's memcpy or strcpy when memory regions overlap. And of course this is great for concurrent programming, because it prevents data races at compile time.
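The overlap problem can be sketched with an extend function in the spirit of the book's example. The details here are my own, and the rejected call is left as a comment.

```rust
// Appends the elements of slice to vec.
fn extend(vec: &mut Vec<f64>, slice: &[f64]) {
    for elt in slice {
        vec.push(*elt);
    }
}

fn main() {
    let mut wave = Vec::new();
    let head = vec![0.0, 1.0];
    let tail = [0.0, -1.0];

    extend(&mut wave, &head);
    extend(&mut wave, &tail);
    assert_eq!(wave, vec![0.0, 1.0, 0.0, -1.0]);

    // Source and destination overlap here. Pushing may reallocate
    // the buffer the slice points into, so Rust rejects the call:
    //
    // extend(&mut wave, &wave);
    // error[E0502]: cannot borrow `wave` as immutable because it is
    // also borrowed as mutable

    // Copying the source first keeps the two borrows separate.
    let snapshot = wave.clone();
    extend(&mut wave, &snapshot);
    assert_eq!(wave.len(), 8);
}
```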

Coda: Taking Arms Against a Sea of Objects

Rust's rules around data ownership and move semantics make it harder to write arbitrary unstructured object graphs, and that's a good thing.