When Ferrous Metals Corrode, pt. XII

Intro

For this post I'm looking at Chapter 13. Utility Traits in the Programming Rust book.

I don't expect radically new stuff here, but given the importance of traits I do hope for some practical insight into what idiomatic Rust should look like.

The book defines three broad categories of utility traits:

Language extension traits

Things like traits for operator overloading but also Deref or Drop, serving as language extension points

Marker traits

Traits for capturing constraints that can't be captured otherwise, like Copy

Public vocabulary traits

These are not special in any way and could in principle be user-defined. Rather they serve to form a common vocabulary around common problems, e.g. Default

Drop

This is the interface for destructors, i.e. it lets you define behaviour when a value loses its owner. This happens when it goes out of scope, when a vector gets truncated, etc.

The trait std::ops::Drop is defined as follows:

trait Drop {
    fn drop(&mut self);
}

When a value is dropped, its .drop() method is called before the contained fields are dropped. That is, when drop runs the value is still fully initialized; only afterwards are the value's fields dropped in turn.

A reason to implement Drop is to handle resources Rust doesn't know about, for instance OS resources. For example, the standard library has a FileDesc type with an fd field of type c_int, and its impl of the Drop trait closes the fd (in unsafe code):

struct FileDesc {
    fd: c_int,
}


impl Drop for FileDesc {
    fn drop(&mut self) {
        let _ = unsafe { libc::close(self.fd) };
    }
}
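To watch the timing described above, here is a minimal sketch. The `Noisy` type and the `drop_order` function are made up for illustration and are not from the book or the standard library; each value records its own destruction in a shared log.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical type that records its own destruction in a shared log.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        // `self` is still fully initialized here, so its fields are usable.
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Creates two values in nested scopes and returns the order of events.
fn drop_order() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Noisy { name: "outer", log: Rc::clone(&log) };
        {
            let _inner = Noisy { name: "inner", log: Rc::clone(&log) };
        } // `_inner` goes out of scope and is dropped here
        log.borrow_mut().push("between".to_string());
    } // `_outer` is dropped here
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", drop_order());
}
```

The inner value is dropped at the end of its own block, before the "between" entry is logged, and the outer value only once its enclosing block ends.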

Sized

This is for types with a known fixed size: think char, u64, etc., but also Vec<T> (which owns a heap-allocated buffer but is itself a fixed-size triple of pointer, capacity and length).

std::marker::Sized is a marker trait: it has no methods or associated items and is only used as a bound. User-defined types can't implement Sized themselves; Rust handles this automatically.

Unsized types are for instance str or array slices [T], but also dyn types (targets of trait objects).

Unsized values can't be stored in variables; they need to be handled via refs or Boxes (both of which are Sized).

When using generics type vars are by default assumed to be Sized – writing struct S<T> { ... } really defaults to struct S<T: Sized> { ... }.

If you do not want to require Sized, use questionably Sized: struct S<T: ?Sized> { ... } to opt out of the Sized bound.
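A small sketch of what opting out buys you. The names `describe` and `Labeled` are hypothetical, chosen just for this example.

```rust
use std::fmt::Display;

/// Without `?Sized`, `T` could never be `str`, because `str` is unsized.
fn describe<T: Display + ?Sized>(value: &T) -> String {
    format!("<{}>", value)
}

/// A hypothetical wrapper whose last field may be unsized.
struct Labeled<T: ?Sized> {
    label: u8,
    value: T,
}

fn main() {
    // T = str: only accepted thanks to the `?Sized` bound.
    println!("{}", describe("hello"));
    // T = i32: sized types keep working as before.
    println!("{}", describe(&42));

    // A sized `Labeled<[i32; 3]>` can be viewed through a reference as the
    // unsized `Labeled<[i32]>`.
    let sized = Labeled { label: 1, value: [10, 20, 30] };
    let unsized_ref: &Labeled<[i32]> = &sized;
    println!("{} {}", unsized_ref.label, unsized_ref.value.len());
}
```

Note that the unsized value is always reached through a reference; the `?Sized` bound only removes the requirement on the type variable.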

Clone

This trait is for types that can make copies of themselves:

trait Clone: Sized {
    fn clone(&self) -> Self;
    fn clone_from(&mut self, source: &Self) {
        *self = source.clone()
    }
}

Because Clone extends Sized it follows that Self must be Sized.

Use #[derive(Clone)] to get a default impl of cloning.

Many built-in types are Clone, besides primitive types also e.g. Vecs and HashMap.

Some types are non-Clone as you'd expect, e.g. Mutex or File. Note that Clone must be infallible; that's why a File, which can be duplicated at the OS level, is not Clone. Instead it offers a try_clone() method that returns a Result, which may be Ok or Err.

Copy

Copy is another marker trait; it extends Clone. It has special meaning to the language (think ownership: assigning a Copy value copies it instead of moving it), and user-defined types may only impl the trait if a simple shallow, bit-for-bit copy is all it takes to copy them.

Drop and Copy are mutually exclusive: the language assumes that special Drop methods would also need special copy methods, and those are verboten.

Often types will just have default impls for both Copy and Clone: #[derive(Copy, Clone)]
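To make the difference concrete, here is a sketch with two made-up types: `Point` is plain data and can be Copy, while `Label` owns a heap buffer and can only be Clone.

```rust
/// Plain data: a shallow bitwise copy is a complete copy.
#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Owns a heap buffer via `String`, so it can be `Clone` but not `Copy`.
#[derive(Debug, Clone, PartialEq)]
struct Label {
    text: String,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // copies; `p` stays usable
    assert_eq!(p, q);
    println!("{} {}", p.x, q.y);

    let a = Label { text: "origin".to_string() };
    let b = a.clone(); // explicit deep copy
    let c = a; // moves; `a` can no longer be used after this line
    assert_eq!(b, c);
    println!("{}", c.text);
}
```

Adding `Copy` to the derive list of `Label` would be rejected by the compiler, because its `String` field is not Copy.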

Deref and DerefMut

These traits control dereferencing with the * (explicit deref) and . (possible implicit deref) operations (with shared and mut refs respectively).

The traits are defined as below. The idea being that the type implementing Deref is some sort of container that will return an element upon dereferencing.

trait Deref {
    type Target: ?Sized;
    fn deref(&self) -> &Self::Target;
}

trait DerefMut: Deref {
    fn deref_mut(&mut self) -> &mut Self::Target;
}

An example. Given a type Selector that contains a vector of elements and a moving index that points to one of them:

struct Selector<T> {
    /// Elements available in this `Selector`.
    elements: Vec<T>,

    /// The index of the "current" element in `elements`. A `Selector`
    /// behaves like a pointer to the current element.
    current: usize
}

Selector could implement dereferencing thusly, with example usage:

use std::ops::{Deref, DerefMut};

impl<T> Deref for Selector<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.elements[self.current]
    }
}

impl<T> DerefMut for Selector<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.elements[self.current]
    }
}


let mut s = Selector { elements: vec!['x', 'y'], current: 1 };

// Because `Selector` implements `Deref`, we can use the `*` operator to
// refer to its current element.
assert_eq!(*s, 'y');

// Change the 'y' to a 'w', by assigning to the `Selector`'s referent.
*s = 'w';

Selector is generic over T, and the Target of the dereferencing is T: dereferencing returns the element pointed at by the 'current' index.

Caveat: Rust applies deref coercions to resolve type mismatches, but not to satisfy trait bounds on type variables.

Types work ok: here Rust sees a &Selector<&str> being passed to show_it, which expects a &str, finds Selector's Deref impl and inserts the deref calls needed to get from one to the other, roughly show_it(s.deref()).

let s = Selector { elements: vec!["good", "bad", "ugly"],
                   current: 2 };
fn show_it(thing: &str) { println!("{}", thing); }
show_it(&s);

However, deref coercion isn't applied to satisfy a trait bound:

use std::fmt::Display;
fn show_it_generic<T: Display>(thing: T) { println!("{}", thing); }
show_it_generic(&s);

While &str implements Display, &Selector<&str> doesn't, and the deref coercion we saw above isn't applied just to satisfy a trait bound.

Workarounds are either to cast with as or explicitly deref when using this:

show_it_generic(&s as &str);
// or
show_it_generic(&*s);

Default

We can give types a default value with this trait, think 0 for ints, empty strings and similar. Rust's collection types all implement this trait. Another common use case is large structs (param bags and such) whose fields usually keep the same values.

Example, the String type:

impl Default for String {
    fn default() -> String {
        String::new()
    }
}

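If your type T implements Default, the standard lib automatically provides Default for wrapper types such as Rc<T>, Box<T>, Cell<T> and RefCell<T>. Structs don't automatically get a Default, but can have one derived with #[derive(Default)] as long as all their fields implement Default.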
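For the param-bag use case, a derived Default combines nicely with struct update syntax. A small sketch, with `RenderOptions` being a made-up type:

```rust
/// Hypothetical parameter bag; every field type implements `Default`.
#[derive(Debug, Default, PartialEq)]
struct RenderOptions {
    width: u32,
    height: u32,
    title: String,
    fullscreen: bool,
}

fn main() {
    // All fields take their type's default: 0, 0, "", false.
    let plain = RenderOptions::default();
    println!("{:?}", plain);

    // Override a few fields and take the defaults for the rest.
    let custom = RenderOptions {
        width: 800,
        height: 600,
        ..Default::default()
    };
    println!("{:?}", custom);
}
```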

AsRef and AsMut

Types that implement AsRef<T> promise that they can efficiently provide a reference to a T (AsMut is the same, but for mut refs).

A use case that makes a lot of sense to me is creating funs that are flexible in the params they accept. E.g. the open() function is defined as:

fn open<P: AsRef<Path>>(path: P) -> Result<File>

So, it will accept any P that can provide a ref to a Path object. That can be a Path object, but also String, because String has this:

impl AsRef<Path> for String {
    #[inline]
    fn as_ref(&self) -> &Path {
        Path::new(self)
    }
}

Since String implements AsRef<Path>, open will also accept Strings as input. The effect is a bit similar to method overloading, however the adaptation here happens not in the called fun but on the parameter. One thing to be careful about is that an adapting type now depends on the adaptee: the target type mustn't change its API, otherwise the adapter might not be able to fulfill its promise.

The standard library has a blanket implementation such that if T implements AsRef<U> then also &T will implement AsRef<U> (i.e. it will automatically deref).
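The same pattern works for your own functions. Below is a sketch of a helper (`extension_of` is a made-up name) that takes anything that can lend out a `&Path`:

```rust
use std::path::{Path, PathBuf};

/// Returns the extension of `path`, if it has one and it is valid UTF-8.
fn extension_of<P: AsRef<Path>>(path: P) -> Option<String> {
    let path: &Path = path.as_ref();
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_string())
}

fn main() {
    // &str, String, &String and PathBuf all implement AsRef<Path>.
    println!("{:?}", extension_of("notes.txt"));
    let owned = String::from("archive.tar.gz");
    println!("{:?}", extension_of(&owned));
    println!("{:?}", extension_of(owned));
    println!("{:?}", extension_of(PathBuf::from("Makefile")));
}
```

Passing `&owned` works because of the blanket impl for references mentioned above.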

Borrow and BorrowMut

Borrow and BorrowMut are similar to AsRef/AsMut above, but come with additional restrictions: a type should only implement Borrow<T> if a value and the &T borrowed from it hash and compare the same way. This matters when a value is used as a key in a hash map, which is why HashMap lookups are expressed in terms of Borrow. Every &mut T type also implements Borrow<T>, so if you already have a mut ref, you can get a shared ref for collection lookup as well.
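The classic place to see Borrow at work is a map keyed by String that is queried with a plain &str. A minimal sketch, where `age_of` is a made-up helper:

```rust
use std::collections::HashMap;

/// Looks up `name` without allocating a `String` for the key.
fn age_of(ages: &HashMap<String, u32>, name: &str) -> Option<u32> {
    // `HashMap<K, V>::get` accepts a `&Q` for any `Q` with `K: Borrow<Q>`.
    // Here K = String and Q = str, which hash and compare identically.
    ages.get(name).copied()
}

fn main() {
    let mut ages = HashMap::new();
    ages.insert("Ada".to_string(), 36);
    ages.insert("Grace".to_string(), 45);

    println!("{:?}", age_of(&ages, "Ada"));
    println!("{:?}", age_of(&ages, "Linus"));
}
```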

From and Into

These are also traits to adapt a value from one type to another. However here we're not passing around refs but the full value.

Use Into to make funs more flexible in which args they accept. For example, let's say we have a ping fun; it could declare that it is happy with any arg that we can get an ipv4 addr out of:

use std::net::Ipv4Addr;
fn ping<A>(address: A) -> std::io::Result<bool>
    where A: Into<Ipv4Addr>
{
    let ipv4_address = address.into();
    ...
}
// ex. usage with [u8; 4]
println!("{:?}", ping([66, 146, 219, 98]));

With this bound, ping can accept not only the canonical Ipv4Addr but also a [u8; 4], as that happens to implement Into<Ipv4Addr>.

The From trait is for getting generic constructors, e.g. let addr1 = Ipv4Addr::from([66, 146, 219, 98]);

If a type implements From, the std lib will auto-implement the corresponding Into trait.
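Here is how that looks for user-defined types. `Celsius`, `Fahrenheit` and `is_freezing` are made up for this sketch; only the From impl is written by hand, the matching Into comes for free.

```rust
/// Hypothetical newtypes for temperatures.
#[derive(Debug, PartialEq)]
struct Celsius(f64);

#[derive(Debug, PartialEq)]
struct Fahrenheit(f64);

impl From<Fahrenheit> for Celsius {
    fn from(f: Fahrenheit) -> Self {
        Celsius((f.0 - 32.0) * 5.0 / 9.0)
    }
}

/// Accepts anything that can be converted into a `Celsius`.
fn is_freezing<T: Into<Celsius>>(temp: T) -> bool {
    let celsius: Celsius = temp.into();
    celsius.0 <= 0.0
}

fn main() {
    // From as a generic constructor.
    println!("{:?}", Celsius::from(Fahrenheit(212.0)));

    // Into provided by the std lib: Fahrenheit works because of the impl
    // above, Celsius because every type converts into itself.
    println!("{}", is_freezing(Fahrenheit(14.0)));
    println!("{}", is_freezing(Celsius(20.0)));
}
```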

Because From/Into take ownership of the values the trait impl is free to re-use any underlying resources (e.g.: Into<Vec<u8>> for String reuses the String heap buffer). On the other hand, per contract From/Into don't have to be efficient, i.e. they may also copy, allocate, …

The std lib also has an interesting blanket From impl to put values that impl Error into a box:

impl<'a, E: Error + Send + Sync + 'a> From<E> 
  for Box<dyn Error + Send + Sync + 'a> {
    fn from(err: E) -> Box<dyn Error + Send + Sync + 'a> {
        Box::new(err)
    }
}

The ? operator uses this automatically to convert various types of Errors into a generic error type like we looked at in the chapter about error handling:

type GenericError = Box<dyn std::error::Error + Send + Sync + 'static>;
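A sketch of the ? operator doing this conversion. The function `parse_sum` is made up for illustration; the two error sources have different types, and both end up boxed.

```rust
use std::error::Error;

type GenericError = Box<dyn Error + Send + Sync + 'static>;

/// Parses two decimal numbers and adds them.
fn parse_sum(a: &str, b: &str) -> Result<i64, GenericError> {
    // A ParseIntError is boxed via the blanket From impl.
    let x: i64 = a.trim().parse()?;
    let y: i64 = b.trim().parse()?;
    // `checked_add` returns None on overflow; the &str message is converted
    // by a separate From<&str> impl for boxed errors.
    let sum = x.checked_add(y).ok_or("sum overflows i64")?;
    Ok(sum)
}

fn main() {
    println!("{:?}", parse_sum("2", " 40 "));
    println!("{:?}", parse_sum("two", "40"));
    println!("{:?}", parse_sum("9223372036854775807", "1"));
}
```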

From/Into require that their conversion cannot fail ("infallible").

TryFrom and TryInto

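These are similar to From/Into above, but may fail. If a type implements TryFrom, the std lib provides TryInto for free. Their conversion methods return a Result.

Example: convert an i64 to an i32, falling back to i32::MAX if the value doesn't fit. Being out of range is the only error that can occur here, but note that this one-liner maps values that are too negative to i32::MAX as well: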

let smaller: i32 = huge.try_into().unwrap_or(i32::MAX);
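If negative values should saturate at the other end instead, the fallback can look at the sign. A sketch, with `saturate_to_i32` being a made-up helper name:

```rust
// In the prelude since the 2021 edition; imported here for older editions.
use std::convert::TryFrom;

/// Converts to i32, clamping out-of-range values to the nearest bound.
fn saturate_to_i32(huge: i64) -> i32 {
    i32::try_from(huge).unwrap_or_else(|_| {
        if huge >= 0 {
            i32::MAX
        } else {
            i32::MIN
        }
    })
}

fn main() {
    println!("{}", saturate_to_i32(7));
    println!("{}", saturate_to_i32(5_000_000_000));
    println!("{}", saturate_to_i32(-5_000_000_000));
}
```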

ToOwned

If you call clone() on a ref you get an owned copy of the target value (provided the value's type implements Clone). For a &str or &[i32] that doesn't work: you'd want a String or a Vec<i32>, but Clone can only return Self, and str and [i32] are unsized. The ToOwned trait is a bit more flexible:

trait ToOwned {
    type Owned: Borrow<Self>;
    fn to_owned(&self) -> Self::Owned;
}

This says that to_owned() may return any type from which a &Self can be Borrowed. The str type implements ToOwned<Owned=String>, so with a &str you get back a String, which is probably the most useful thing to do here.
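A short sketch of the trait in generic code. `owned_copy` is a made-up function; its return type is whatever the borrowed type names as its Owned counterpart.

```rust
/// Returns an owned version of any borrowed value, sized or not.
fn owned_copy<B: ToOwned + ?Sized>(value: &B) -> B::Owned {
    value.to_owned()
}

fn main() {
    // B = str, so B::Owned = String.
    let s: String = owned_copy("hello");
    // B = [i32], so B::Owned = Vec<i32>.
    let v: Vec<i32> = owned_copy(&[1, 2, 3][..]);
    // B = i32: for Clone types, Owned is the type itself.
    let n: i32 = owned_copy(&5);
    println!("{} {:?} {}", s, v, n);
}
```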

Borrow and ToOwned at Work: The Humble Cow

The std lib has a Cow (clone on write) type: an enum that is either a borrowed or an owned value. It's generic over a B and either borrows a shared ref to a B, or owns a value of B's ToOwned::Owned type (from which one can borrow a &B). It also has a to_mut() method to get a mutable reference. If the Cow's value was borrowed, to_mut() first replaces it with an owned copy and then borrows a mut ref from that; hence clone on write.

Example code from the std lib – depending on the input, the abs fun may mutate values or not (if they're already positive):

use std::borrow::Cow;

fn abs_all(input: &mut Cow<[i32]>) {
    for i in 0..input.len() {
        let v = input[i];
        if v < 0 {
            // Clones into a vector if not already owned.
            input.to_mut()[i] = -v;
        }
    }
}

// No clone occurs because `input` doesn't need to be mutated.
let slice = [0, 1, 2];
let mut input = Cow::from(&slice[..]);
abs_all(&mut input);

// Clone occurs because `input` needs to be mutated.
let slice = [-1, 0, 1];
let mut input = Cow::from(&slice[..]);
abs_all(&mut input);

// No clone occurs because `input` is already owned.
let mut input = Cow::from(vec![-1, 0, 1]);
abs_all(&mut input);
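Cow is also handy as a return type when a function only sometimes needs to allocate. A sketch, with `expand_tabs` being a made-up example function:

```rust
use std::borrow::Cow;

/// Replaces tabs with four spaces, allocating only if a tab is present.
fn expand_tabs<'a>(input: &'a str) -> Cow<'a, str> {
    if input.contains('\t') {
        // Something changes, so build and own a new String.
        Cow::Owned(input.replace('\t', "    "))
    } else {
        // Nothing to change: hand back the borrowed input as-is.
        Cow::Borrowed(input)
    }
}

fn main() {
    // Either way the result derefs to &str, so callers rarely need to care
    // which variant they got.
    println!("{}", expand_tabs("no tabs here"));
    println!("{}", expand_tabs("a\tb"));
}
```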

Coda

Ok, this was a long chapter. Much in Rust seems to revolve around how we describe our data so we can manage it well – and the utility traits here have an important role in this context.