References

Pointers can be categorized into two types:

  1. Owning
  2. Nonowning

With owning pointers (Box<T>s, Vecs, Strings, etc.), when the owner is dropped, the referent goes with it.

Nonowning pointers, on the other hand, have no effect on their referents' lifetimes.

Terminology: Nonowning pointer types are called references.

References must never outlive their referents. Rust refers to creating a reference to some value as borrowing the value: what gets borrowed must eventually be returned to the owner.

References let you access values without affecting their ownership.

There are two kinds of references:

  1. Shared &T
  2. Mutable &mut T

A shared reference lets you read but not modify its referent. There is no limit to the number of shared references that can refer to the same value. Shared references are Copy.

A mutable reference lets you read and modify its referent. However, while a value is the referent of a mutable reference, you may not have any other references of any sort to that value active at the same time. Mutable references are not Copy.

The distinction between the two can be thought of as a multiple-readers versus single-writer rule.

When one or more shared references to a value exist, not even its owner can modify it. The value is locked.

When a mutable reference to a value exists, the value may be accessed only through that reference; not even its owner may use it.
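Example: A minimal sketch of the multiple-readers versus single-writer rule. The function names total and push_twice are made up for illustration and do not come from any library.

```rust
/// Sums a slice through a shared reference; the caller keeps ownership.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Appends `n` twice through a mutable reference.
fn push_twice(values: &mut Vec<i32>, n: i32) {
    values.push(n);
    values.push(n);
}

fn main() {
    let mut v = vec![1, 2, 3];

    // Any number of shared references may coexist.
    let a = &v;
    let b = &v;
    assert_eq!(total(a) + total(b), 12);

    // `a` and `b` are not used again, so a mutable borrow is now allowed.
    push_twice(&mut v, 4);
    assert_eq!(v, [1, 2, 3, 4, 4]);

    // Rejected if uncommented (error[E0502]): `v` would be read directly
    // while the mutable reference `m` is still in use.
    // let m = &mut v;
    // println!("{} {}", v.len(), m.len());
}
```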

Concept: When a value is passed to a function in a way that moves ownership of the value to the function, we say that it's passed by value. When a function is passed a reference to a value, we say that it's passed by reference.
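Example: A small sketch contrasting the two calling styles. The functions consume and inspect are hypothetical names chosen for this illustration.

```rust
/// Passed by value: ownership of the String moves into the function.
fn consume(s: String) -> usize {
    s.len()
} // `s` is dropped here

/// Passed by reference: the function only borrows the string data.
fn inspect(s: &str) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hello");

    // Borrowing leaves `s` usable afterwards (`&String` coerces to `&str`).
    assert_eq!(inspect(&s), 5);
    assert_eq!(inspect(&s), 5);

    // Passing by value moves `s`; it cannot be used after this call.
    assert_eq!(consume(s), 5);
    // inspect(&s); // error[E0382]: borrow of moved value: `s`
}
```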

References as Values

Rust References vs. C++ References

Implicit behavior: Since references are so widely used in Rust, the . operator implicitly dereferences its left operand, if needed.

Shorthand: Because of this implicit behavior, for a reference named some_ref of type &T, where T has a field named x, the following two expressions are equivalent:

  • some_ref.x
  • (*some_ref).x

Implicit behavior: The . operator will also implicitly borrow a reference to its left operand, if needed for a method call.

Shorthand: Because of this implicit behavior, given a mutable value named v of type Vec<u64>, the following two calls to Vec's sort method are equivalent:

  • v.sort()
  • (&mut v).sort()
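Example: Both implicit behaviors of the . operator can be checked side by side. This is only a sketch; the Point struct and the four helper functions are assumptions made for the example.

```rust
struct Point {
    x: i32,
    y: i32,
}

/// Field access with the implicit dereference.
fn sum_implicit(p: &Point) -> i32 {
    p.x + p.y
}

/// The same field access with the dereference written out.
fn sum_explicit(p: &Point) -> i32 {
    (*p).x + (*p).y
}

/// Method call with the implicit `&mut` borrow.
fn sorted_implicit(mut v: Vec<u64>) -> Vec<u64> {
    v.sort();
    v
}

/// The same method call with the borrow written out.
fn sorted_explicit(mut v: Vec<u64>) -> Vec<u64> {
    (&mut v).sort();
    v
}

fn main() {
    let p = Point { x: 3, y: 4 };
    assert_eq!(sum_implicit(&p), sum_explicit(&p));

    let v = vec![3, 1, 2];
    assert_eq!(sorted_implicit(v.clone()), sorted_explicit(v));
}
```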

Assigning References

Assigning to a Rust reference makes it point at a new value:


#![allow(unused)]
fn main() {
let x = 10;
let y = 20;
let mut r = &x;
println!("r equals {}", *r);
r = &y; // assign to r
println!("r equals {}", *r);
}

References to References

Rust allows references to references, and the . operator follows as many references as it needs to find the target value:


#![allow(unused)]
fn main() {
struct Point { x: usize, y: usize }
let point = Point { x: 1000, y: 750 };
let r: &Point = &point;
let rr: &&Point = &r;
let rrr: &&&Point = &rr;
println!("x equals {}", rrr.x);
}

Comparing References

Much like the . operator, Rust's comparison operators will also "see through" any number of references, as long as both operands have the same type.

If you actually want to know whether two references point to the same address in memory, use std::ptr::eq, which compares the references as addresses.


#![allow(unused)]
fn main() {
let x = 10;
let y = 10;
let rx = &x;
let ry = &y;
let rrx = &rx;
let rry = &ry;
println!("rrx and rry are equal? {}", rrx == rry);
println!("addresses are equal? {}", std::ptr::eq(rrx, rry));
}

References are Never Null

In Rust, if you need a value that is either a reference to something or not, use the type Option<&T>.

At the machine level, Rust represents None as a null pointer, and Some(r), where r is a &T value, as a nonzero address.
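Example: A brief sketch showing that Option<&T> costs no extra space compared with &T, and how it stands in for a nullable pointer. The function first_even is a hypothetical helper written for this note.

```rust
use std::mem::size_of;

/// Returns a reference to the first even element, or None if there is none.
fn first_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|n| **n % 2 == 0)
}

fn main() {
    // None reuses the null pointer value, so no extra tag is stored.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());

    let values = [1, 4, 6];
    match first_even(&values) {
        Some(n) => println!("first even: {}", n),
        None => println!("no even element"),
    }
}
```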

Borrowing References to Arbitrary Expressions

References to Slices and Trait Objects

Term: A fat pointer is a two-word (2 * usize) value that carries the address of its referent, along with some further information necessary to put the value to use.

There are two kinds of fat pointers:

  1. Slice references
  2. Trait objects

A reference to a slice is a fat pointer:

  • 1st word: The starting address of the slice
  • 2nd word: The slice's length

Term: A trait object is a fat pointer referencing a value that implements a certain trait. A trait object carries:

  • 1st word: A value's address
  • 2nd word: A pointer to the trait's implementation appropriate to the pointed-to value for invoking the trait's methods
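Example: The two-word layout can be observed with std::mem::size_of. This is a sketch that assumes a typical target where a plain reference is one word wide; the helper words is hypothetical.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Size of a type, measured in machine words (multiples of usize).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // A reference to a sized value is a single word: just the address.
    assert_eq!(words::<&i32>(), 1);
    // A slice reference carries the address plus the length.
    assert_eq!(words::<&[i32]>(), 2);
    // A trait object carries the address plus a pointer to the trait implementation.
    assert_eq!(words::<&dyn Debug>(), 2);

    let numbers = [10, 20, 30, 40];
    let slice: &[i32] = &numbers[1..3];
    assert_eq!(slice.len(), 2); // the length travels with the reference

    let shown: &dyn Debug = &numbers[0];
    println!("{:?}", shown); // dispatched through the trait object
}
```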

Reference Safety

The following sections pertain to Rust's reference rules and how it foils any attempt to break them.

Borrowing a Local Variable

Rust tries to assign each reference type in your program a lifetime that meets the constraints imposed by how it's used.

Term: A lifetime is some stretch of a program for which a reference could be safe to use; e.g., a lexical block, a statement, an expression, the scope of some variable, etc.

Lifetimes are figments of Rust's imagination; they only exist as part of the compilation process and have no runtime representation.
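Example: A minimal sketch of a borrow of a local variable that Rust accepts, followed by a commented-out variant that it rejects. The function name read_local is an assumption for illustration.

```rust
/// Borrows a local variable and copies the value out before the local is dropped.
fn read_local() -> i32 {
    let x = 1;
    let r = &x; // the borrow's lifetime must fit inside the scope of `x`
    *r          // i32 is Copy, so the value (not the reference) is returned
}

fn main() {
    assert_eq!(read_local(), 1);

    // Rejected if uncommented: `r` would outlive its referent `x`.
    // let r;
    // {
    //     let x = 1;
    //     r = &x;
    // }
    // println!("{}", r); // error[E0597]: `x` does not live long enough
}
```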

Receiving References as Parameters

Term: A static is Rust's equivalent of a global variable (global as in lifetime, not visibility). It's a value that's created when the program starts and lasts until the program terminates.

Some rules for statics (there are more):

  • Every static must be initialized at the time of declaration
  • Mutable statics are not thread-safe and may only be accessed within an unsafe {} block
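Example: A sketch of both rules, assuming single-threaded use only. The names WORTH_POINTING_AT, STASH, stash, and stashed are chosen for this illustration.

```rust
/// An immutable static: initialized where it is declared, alive for the whole program.
static WORTH_POINTING_AT: i32 = 1000;

/// A mutable static holding a reference; every access needs an unsafe block.
static mut STASH: &i32 = &128;

/// Stores a reference in the static. Only references that live for the
/// entire program ('static) are accepted.
fn stash(p: &'static i32) {
    // Sound only because this sketch never touches STASH from several threads.
    unsafe {
        STASH = p;
    }
}

/// Reads the value currently pointed to by the static.
fn stashed() -> i32 {
    unsafe { *STASH }
}

fn main() {
    assert_eq!(stashed(), 128);
    stash(&WORTH_POINTING_AT);
    assert_eq!(stashed(), 1000);
}
```

A reference to a local variable would not be accepted by stash, because a local does not live for the whole program.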

Syntax: The following code is a general syntax for specifying a function parameter's lifetime:

fn f<'a>(p: &'a i32) { ... }

Here, we'd say that the lifetime 'a is a lifetime parameter of f. We can read <'a> as "for any lifetime 'a", so the above definition declares f as a function that takes a reference to an i32 with any given lifetime 'a.

Passing References as Arguments

You only need to worry about lifetime parameters when defining functions and types; when using them, Rust infers the lifetimes for you.

Returning References

Implicit behavior: When a function takes a single reference as an argument and returns a single reference, Rust assumes that the two must have the same lifetime. This means that the following two signatures are equivalent:

fn smallest<'a>(v: &'a [i32]) -> &'a i32 { ... }

fn smallest(v: &[i32]) -> &i32 { ... }
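Example: One possible body for smallest, given as a sketch. It assumes the slice is non-empty and panics otherwise.

```rust
/// Returns a reference to the smallest element of `v`.
/// Assumes `v` is non-empty; indexing `v[0]` panics on an empty slice.
fn smallest(v: &[i32]) -> &i32 {
    let mut s = &v[0];
    for r in &v[1..] {
        if *r < *s {
            s = r;
        }
    }
    s
}

fn main() {
    let parabola = [9, 4, 1, 0, 1, 4, 9];
    // The returned reference borrows from `parabola`, so it may be used
    // for as long as `parabola` is alive.
    let s = smallest(&parabola);
    assert_eq!(*s, 0);
}
```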

Structs Containing References

Whenever a reference type appears inside another type's definition, you must write out its lifetime.

Given the above statement, we know that the following will fail to compile:


#![allow(unused)]
fn main() {
struct S {
    r: &i32
}

let x = 10;
let s = S { r: &x };
println!("{}", s.r);
}

The fix here is to provide the lifetime parameter of r in the definition of S:


#![allow(unused)]
fn main() {
struct S<'a> {
    r: &'a i32
}

let x = 10;
let s = S { r: &x };
println!("{}", s.r);
}

A type's lifetime parameters always reveal whether it contains references with interesting (that is, non-'static) lifetimes, and what those lifetimes can be.

Distinct Lifetime Parameters

When defining types or functions that hold or receive multiple references, you can give each reference its own distinct lifetime parameter, so that the references are not forced to share a single lifetime.

// Types
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32,
}

// Functions
fn f<'a, 'b>(
    x: &'a i32,
    y: &'b i32,
) -> &'a i32 {
    // Only x is returned, so the result carries the lifetime 'a.
    x
}
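Example: A sketch of a case where distinct parameters matter. With a single shared lifetime parameter on S, the code in main would be rejected, because the reference to y would have to live as long as r. The function keep_x is a hypothetical helper.

```rust
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32,
}

/// Returns only the `x` reference, so the result is tied to 'a and not to 'b.
fn keep_x<'a, 'b>(s: S<'a, 'b>) -> &'a i32 {
    s.x
}

fn main() {
    let x = 10;
    let r;
    {
        let y = 20;
        let s = S { x: &x, y: &y };
        assert_eq!(*s.y, 20);
        r = keep_x(s);
    } // `y` is dropped here, but `r` depends only on `x`
    assert_eq!(*r, 10);
}
```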

Omitting Lifetime Parameters

Shorthand: If your function doesn't return any references (or other types that require lifetime parameters), then you never need to write out lifetimes for the parameters.


#![allow(unused)]
fn main() {
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32
}

fn sum_r_xy(r: &i32, s: S) -> i32 { r + s.x + s.y }
}

The above is shorthand for:

fn sum_r_xy<'a, 'b, 'c>(r: &'a i32, s: S<'b, 'c>) -> i32 { ... }

Shorthand: If only a single lifetime appears among your function's parameters, then Rust assumes that any lifetimes in the return value must be that one.


#![allow(unused)]
fn main() {
fn first_third(point: &[i32; 3]) -> (&i32, &i32) {
    (&point[0], &point[2])
}
}

The above is shorthand for:

fn first_third<'a>(point: &'a [i32; 3]) -> (&'a i32, &'a i32) { ... }

Shorthand: If your function is a method on some type and takes its self parameter by reference, Rust assumes that self's lifetime is the one to give any references in the return value.


#![allow(unused)]
fn main() {
struct StringTable {
    elements: Vec<String>,
}

impl StringTable {
    fn find_by_prefix(&self, prefix: &str) -> Option<&String> {
        for i in 0 .. self.elements.len() {
            if self.elements[i].starts_with(prefix) {
                return Some(&self.elements[i]);
            }
        }
        None
    }
}
}

The above method's signature is shorthand for:

fn find_by_prefix<'a, 'b>(&'a self, prefix: &'b str) -> Option<&'a String>

Sharing vs. Mutation

Shared access

A value borrowed by shared references is read-only.

Across the lifetime of a shared reference, neither its referent, nor anything reachable from that referent, can be changed by anything.

Mutable access

A value borrowed by a mutable reference is reachable exclusively via that reference.

Across the lifetime of a mutable reference, there is no other usable path to its referent, or to any value reachable from there.

The only references whose lifetimes may overlap with a mutable reference are those you borrow from the mutable reference itself.
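Example: A sketch of references borrowed from a mutable reference itself. The function bump_first is a made-up helper and assumes a non-empty vector.

```rust
/// Increments the first element through a mutable reference and returns the new value.
/// Assumes `v` is non-empty; indexing `v[0]` panics otherwise.
fn bump_first(v: &mut Vec<i32>) -> i32 {
    let first = &mut v[0]; // mutable reborrow taken from `v` itself
    *first += 1;
    *first
}

fn main() {
    let mut v = vec![1, 2, 3];
    let m = &mut v;

    // Rejected if uncommented (error[E0502]): `v` has no usable path
    // while `m` is still in use below.
    // println!("{}", v.len());

    // A shared reference borrowed from `m` may overlap with `m`'s lifetime.
    let first: &i32 = &m[0];
    assert_eq!(*first, 1);

    // Once `first` is no longer used, `m` can mutate again.
    assert_eq!(bump_first(m), 2);
    m.push(4);

    // After the last use of `m`, the owner is accessible again.
    assert_eq!(v, [2, 2, 3, 4]);
}
```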