Attributes

Any item in a Rust program can be decorated with attributes, which are Rust's catch-all syntax for writing miscellaneous instructions and advice to the compiler.

Tip: To attach an attribute to a whole crate, add it at the top of the main.rs or lib.rs file, before any items, and write #! instead of #:


// src/lib.rs
#![allow(non_camel_case_types)]
pub struct weird_type_name {  }

Tip: To include a module only when testing, use #[cfg(test)].
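As a minimal sketch of the #[cfg(test)] tip: the add function and the tests module below are hypothetical names chosen for illustration. The module is compiled only when the crate is built for testing (for example with cargo test), so it adds nothing to a normal build.

```rust
/// Hypothetical function under test.
pub fn add(a: i32, b: i32) -> i32 {
    a + b
}

// This module only exists in test builds.
#[cfg(test)]
mod tests {
    use super::add;

    #[test]
    fn adds_two_numbers() {
        assert_eq!(add(2, 3), 5);
    }
}
```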

#! can also be used inside functions, structs, etc., but it's typically only used at the beginning of a file to attach an attribute to the whole module or crate.

Some attributes must use #! because they can only be applied to an entire module or crate. For example, #![feature] is used to turn on unstable features of the Rust language and libraries.

Conditional Compilation

Conditional compilation is configured using the #[cfg] attribute:


#[cfg(target_os = "macos")]
mod mac_stuff; // will only be included if the target is macOS
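The same attribute works on individual items, not just modules. In this sketch (platform_note and is_macos are made-up names), exactly one of the two platform_note definitions is compiled, and the cfg! macro evaluates the same predicate to a plain bool.

```rust
// Exactly one of these two definitions is compiled, depending on the target.
#[cfg(target_os = "macos")]
fn platform_note() -> &'static str {
    "built for macOS"
}

#[cfg(not(target_os = "macos"))]
fn platform_note() -> &'static str {
    "built for something other than macOS"
}

// `cfg!` turns the predicate into a compile-time bool. Unlike #[cfg],
// code in both branches of an `if cfg!(..)` must still type-check.
fn is_macos() -> bool {
    cfg!(target_os = "macos")
}
```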

Cargo

Behavior: By default, cargo build looks at the files in your src directory and figures out what to build. When it sees src/lib.rs, it knows that it needs to build a library.

Behavior: Cargo will automatically compile files inside src/bin when you run cargo build. The executables created from the files in src/bin can be run using cargo run --bin my_bin.

Useful Commands

| Command | Result |
| --- | --- |
| rustup update | Updates Rust |
| cargo new --bin <package_name> | Creates a new package |
| cargo package --list | List all files included in a package |
| cargo doc --no-deps --open | Create HTML documentation for your project; the output gets saved to target/doc |

Rust Tools

Automatic code formatting

$ rustup component add rustfmt

Installs both rustfmt, for formatting Rust code, and cargo-fmt, the Cargo subcommand that runs rustfmt over a whole package. (Older toolchains named the component rustfmt-preview.)

Run it with $ cargo fmt

Automatic code fixing and version migrations

Run rustfix with: $ cargo fix

Automatic code improvements

$ rustup component add clippy

Installs clippy. (Older toolchains named the component clippy-preview.)

Run it with: $ cargo clippy

Common Profile Settings

debug

Controls the -g option sent to rustc, which turns debug symbols on and off. Possible values: true | false

Recommended Crates

Binary Data, Compression, and Serialization

byteorder

Offers traits that add methods to all readers and writers for binary input and output.

flate2

Provides adapter methods for reading and writing gzipped data.

serde

Used for serialization; it converts back and forth between Rust structs and bytes.

Networking

mio

Support for asynchronous input and output to create high-performance servers. It provides a simple event loop and asynchronous methods for reading, writing, connecting, and accepting connections. (basically an asynchronous copy of the whole networking API)

tokio

Wraps the mio event loop in a futures-based API.

reqwest

Provides a beautiful API for HTTP clients.

iron

Higher-level server framework with support for things like middleware traits.

websocket

Implements the WebSocket protocol.

Translations

&T


Immutable reference to a value of type T.

&[T]


Reference to a slice containing data of type T.

impl<T> Queue<T>


For any type T, here are some methods available on Queue.

fn say_hello(out: &mut dyn Write)


This function's parameter is a mutable reference to any value that implements the Write trait (a trait object). Older code spells the type &mut Write, without dyn.

fn min<T: Ord>(value1: T, value2: T)


This function can be used with arguments of any type T that implements the Ord trait

fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>)


This function can be used with an argument that is a vector reference of any type T, as long as T implements the Debug, Hash, and Eq traits

impl<W: Write> WriteHtml for W


Here's an implementation of the WriteHtml trait for any type W that implements Write

trait Creature: Visible {


Every type that implements Creature must also implement the Visible trait. Creature is a subtrait of (extends) Visible.

trait Iterator {
    type Item;

Item is an associated type of the Iterator trait. Any type that implements Iterator must specify the Item type.

impl Iterator for Args {
    type Item = String;

The implementation of Iterator for Args has an associated Item type of String.

fn dump<I>(iter: I) where I: Iterator<Item=String>


The type parameter I must be an iterator over String values

trait Mul<RHS=Self> {


The type parameter RHS of this trait defaults to Self.


pub trait Rng {
    fn next_u32(&mut self) -> u32;
}
pub trait Rand: Sized {
    fn rand<R: Rng>(rng: &mut R) -> Self;
}

The Rand trait uses the Rng trait as a bound. Rand and Rng are buddy traits.

impl<T> Add for Complex<T> where T: Add<Output=T>


Overloads the + operator for values of Complex<T> types, where T must already implement the Add (+ operator) trait.
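Spelled out in full, such an impl might look like the sketch below. The Complex struct and its re/im fields are assumptions made for illustration; only the impl header comes from the notes.

```rust
use std::ops::Add;

// Hypothetical generic complex number type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex<T> {
    re: T,
    im: T,
}

impl<T> Add for Complex<T>
where
    T: Add<Output = T>,
{
    type Output = Complex<T>;

    // Adds component-wise, reusing T's own `+`.
    fn add(self, rhs: Self) -> Self::Output {
        Complex {
            re: self.re + rhs.re,
            im: self.im + rhs.im,
        }
    }
}
```

With this in place, Complex { re: 1, im: 2 } + Complex { re: 3, im: 4 } works for any T that supports +.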

trait PartialEq<Rhs: ?Sized = Self>


This is a trait signature whose Rhs type parameter does not have to be a sized type. That means Rhs may be an unsized type such as str or [T] (references like &str and &[T] are themselves sized). We'd say that Rhs is questionably sized.


impl<T, E, C> FromIterator<Result<T, E>> for Result<C, E>
    where C: FromIterator<T>
{ ... }

If you can collect items of type T into a collection of type C (where C implements the FromIterator<T> trait), then you can collect items of type Result<T, E> into a single result of type Result<C, E>.
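In practice this impl is what lets collect() stop at the first error. A small sketch, with parse_all as a hypothetical helper name:

```rust
use std::num::ParseIntError;

// Each parse yields a Result<i32, ParseIntError>; collecting into
// Result<Vec<i32>, _> gives Ok(all values) or the first Err encountered.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```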

Hidden Code

A lot of Rust code is ostensibly terse and clean-looking (especially relative to C++). This is great, but as a Rust newbie, the implicit assumptions made by the compiler can seem a bit blackbox-y, which can make things difficult to reason about.

Here, I'll try to jot down examples of the aforementioned implicit decisions made by the compiler by comparing idiomatic code with its fully expressed, verbose syntax.

Return Type Omission

A function declaration whose return type is omitted is shorthand for returning the unit type.

| The code | Under the hood |
| --- | --- |
| fn my_fn() { .. } | fn my_fn() -> () { .. } |

Automatic Dereferencing (The . Operator)

The . operator implicitly dereferences its left operand, if needed. e.g. for a reference variable named some_ref of type &T, where T has a field named x:

| The code | Under the hood |
| --- | --- |
| some_ref.x | (*some_ref).x |

Automatic Referencing (The . Operator)

The . operator also implicitly borrows a reference to its left operand, if needed for a method call.


let mut x = vec![1993, 1963, 1991];
let mut y = vec![1993, 1963, 1991];
x.sort();
(&mut y).sort();
assert_eq!(x, y);

| The code | Under the hood |
| --- | --- |
| v.sort() | (&mut v).sort() |

Reference Traversal (The . Operator)

. will follow as many references as it takes to reach its target.


struct Number { value: usize }
let n = Number { value: 999 };
let r: &Number = &n;
let rr: &&Number = &r;
let rrr: &&&Number = &rr;
assert_eq!(rrr.value, (*(*(*rrr))).value);

| The code | Under the hood |
| --- | --- |
| rrr.value | (*(*(*rrr))).value |

Reference Traversal (Comparison Operators)

Rust's comparison operators can also "see through" references, as long as both operands have the same type.


let x = 10;
let y = 10;
let rx = &x;
let ry = &y;
let rrx = &rx;
let rry = &ry;

assert!(rrx <= rry);
assert!(*(*rrx) <= *(*rry));

| The code | Under the hood |
| --- | --- |
| rrx <= rry | *(*rrx) <= *(*rry) |

Single Reference Parameter (Omitting Lifetime Parameters)

When a function takes a single reference as an argument, and returns a single reference, Rust assumes that the two must have the same lifetime.

| The code | Under the hood |
| --- | --- |
| fn smallest(v: &[i32]) -> &i32 | fn smallest<'a>(v: &'a [i32]) -> &'a i32 |

No Return Reference (Omitting Lifetime Parameters)

When a function doesn't return any references, Rust doesn't need explicit lifetimes.

| The code | Under the hood |
| --- | --- |
| fn sum_r_xy(r: &i32, s: S) -> i32 | fn sum_r_xy<'a, 'b, 'c>(r: &'a i32, s: S<'b, 'c>) -> i32 |

Single Lifetime (Omitting Lifetime Parameters)

If there's only a single lifetime that appears among a function's parameters, Rust assumes any lifetimes in the return value must be that one.

| The code | Under the hood |
| --- | --- |
| fn first_third(point: &[i32; 3]) -> (&i32, &i32) | fn first_third<'a>(point: &'a [i32; 3]) -> (&'a i32, &'a i32) |

Accepting self by Reference (Omitting Lifetime Parameters)

If a function is a method in an impl block and takes its self parameter by reference, Rust assumes that self's lifetime is the one to give the method's return value.

| The code | Under the hood |
| --- | --- |
| fn find_by_prefix(&self, prefix: &str) -> Option<&String> | fn find_by_prefix<'a, 'b>(&'a self, prefix: &'b str) -> Option<&'a String> |

Tuple struct Constructors

When defining a tuple-like struct, Rust implicitly defines a function that acts as the type's constructor.

| The code | Under the hood |
| --- | --- |
| struct Bounds(usize, usize) | fn Bounds(x: usize, y: usize) -> Bounds |

Self

Inside impl blocks, Rust automatically creates a type alias of the type for which the impl block is associated called Self.


impl<T> Queue<T> {
  pub fn new() -> Self { .. }
}

| The code | Under the hood |
| --- | --- |
| pub fn new() -> Self { .. } | pub fn new() -> Queue<T> { .. } |

Overloaded Operators

Overloaded operators (even the basic ones) are implicitly calling a method specified by their corresponding generic trait.

| The code | Under the hood |
| --- | --- |
| x * y | Mul::mul(x, y) |
| x += y | x.add_assign(y) |

The ? operator

Under the hood, the ? operator calls From::from on the error value to convert it into the error type that the enclosing function returns. When that type is a boxed trait object, a Box<dyn error::Error>, the function becomes polymorphic over errors: lots of different kinds of errors can be returned from the same function, because they all implement the error::Error trait.
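A short sketch of that conversion, assuming a hypothetical helper named parse_bytes. Two unrelated error types (Utf8Error and ParseIntError) flow out through the same Box<dyn Error> return type, with ? doing the From::from call each time.

```rust
use std::error::Error;

fn parse_bytes(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    // On failure: Utf8Error -> Box<dyn Error> via From::from
    let text = std::str::from_utf8(bytes)?;
    // On failure: ParseIntError -> Box<dyn Error> via From::from
    let number = text.trim().parse::<i32>()?;
    Ok(number)
}
```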

async Lifetimes

async functions follow different lifetime rules when one of their arguments is a reference or is otherwise non-'static.

This function:


async fn foo(x: &u8) -> u8 { *x }

Is this under the hood:


use std::future::Future;

fn foo<'a>(x: &'a u8) -> impl Future<Output = u8> + 'a {
  async move { *x }
}

Pinning

A lot happens under the hood when using async. Let's look at this incomplete code, where we run two futures in sequence:


let fut_one = /* ... */;
let fut_two = /* ... */;
async move {
  fut_one.await;
  fut_two.await;
}

Under the hood, Rust creates an anonymous type representing the async { } block and its combined possible states:


struct AnonAsyncFuture {
  fut_one: FutOne,
  fut_two: FutTwo,
  state: State,
}

enum State {
  AwaitingFutOne,
  AwaitingFutTwo,
  Done,
}

Then it implements Future for the anonymous type and provides a poll method:


impl Future for AnonAsyncFuture {
  type Output = ();

  fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
    loop {
      match self.state {
        State::AwaitingFutOne => match self.fut_one.poll(..) {
            Poll::Ready(()) => self.state = State::AwaitingFutTwo,
            Poll::Pending => return Poll::Pending,
        }
        State::AwaitingFutTwo => match self.fut_two.poll(..) {
            Poll::Ready(()) => self.state = State::Done,
            Poll::Pending => return Poll::Pending,
        }
        State::Done => return Poll::Ready(()),
      }
    }
  }
}

When poll is first called, it'll call fut_one's poll. If it's still pending, it'll return. Future calls to poll will pick up where the previous poll left off, based on state.
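The pseudocode above can be turned into something that compiles with a few simplifying assumptions. In this sketch the inner futures are boxed (so no unsafe pin projection is needed, unlike the inline fields the compiler really generates), YieldOnce is a made-up leaf future that is pending exactly once, and the driver is a busy-polling loop with a no-op waker rather than a real executor. All of these names are hypothetical.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Boxed futures are Unpin, which keeps this sketch free of `unsafe`.
type BoxedUnit = Pin<Box<dyn Future<Output = ()>>>;

enum State {
    AwaitingFutOne,
    AwaitingFutTwo,
    Done,
}

struct Sequence {
    fut_one: BoxedUnit,
    fut_two: BoxedUnit,
    state: State,
}

impl Future for Sequence {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        // Every field is Unpin, so a plain &mut Self is available.
        let this = self.get_mut();
        loop {
            match this.state {
                State::AwaitingFutOne => match this.fut_one.as_mut().poll(cx) {
                    Poll::Ready(()) => this.state = State::AwaitingFutTwo,
                    Poll::Pending => return Poll::Pending,
                },
                State::AwaitingFutTwo => match this.fut_two.as_mut().poll(cx) {
                    Poll::Ready(()) => this.state = State::Done,
                    Poll::Pending => return Poll::Pending,
                },
                State::Done => return Poll::Ready(()),
            }
        }
    }
}

/// Hypothetical leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            Poll::Pending
        }
    }
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a `Sequence` in a busy loop and reports how many polls it took.
fn polls_until_done() -> usize {
    let mut seq = Sequence {
        fut_one: Box::pin(YieldOnce(false)),
        fut_two: Box::pin(YieldOnce(false)),
        state: State::AwaitingFutOne,
    };
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if Pin::new(&mut seq).poll(&mut cx).is_ready() {
            return polls;
        }
    }
}
```

Each poll resumes from the stored state: the first call stops inside fut_one, the second finishes fut_one and stops inside fut_two, and the third runs through to Done.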

Rust Trivia

What are the 3 types used to represent a sequence of values, and what are their generic type annotations?


  1. Array [T; N]
  2. Vector Vec<T>
  3. Slice &[T]

How do you check the current size and capacity of any sequential-type value?


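.len() works on all three. .capacity() exists only on growable types such as Vec<T>; arrays and slices have a fixed length and no spare capacity.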

What makes a String and a &str unique? What are their effective underlying types?


  1. A String is just a Vec<u8> with the guarantee that the data is well-formed UTF-8.
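  2. A &str is just a slice &[u8] that is guaranteed to hold well-formed UTF-8. It can borrow from a String, a string literal, or any other UTF-8 buffer.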

Given let x = y;, under what condition would it be true that y did not become uninitialized?


If the type of y implements the Copy trait.

Given let s = "hello!";, what is the type of s?


s is a &'static str. The reference itself lives on the stack, but it points to preallocated, read-only memory embedded in the compiled binary.

How do you get the size of any data type?


std::mem::size_of::<T>();

What's a fat pointer?


A fat pointer is a pointer that carries extra metadata next to the address. The classic case is a reference to a slice (a region of an array or vector), which is a two-word value made up of:

  1. A pointer to the slice's first element
  2. The number of elements in the slice
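The two-word layout can be observed with size_of. This is only a sketch (pointer_sizes is a made-up helper), and it also includes a trait-object reference, which is a fat pointer too: a data pointer plus a vtable pointer.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Returns the sizes, in bytes, of a thin reference, a slice reference,
// and a trait-object reference.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),        // thin pointer: one word
        size_of::<&[u8]>(),      // fat pointer: data pointer + length
        size_of::<&dyn Debug>(), // fat pointer: data pointer + vtable pointer
    )
}
```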

What's the difference between Arc and Rc types?


Arc (atomic reference count) is safe to share between threads directly, whereas a Rc uses faster non-thread-safe code to update its reference count.

When defining a type (struct), when is it required that a field's lifetime be specified?


Lifetimes must be specified when a field is a reference type. e.g.

struct RefPoint<'a, 'b> {
  x: &'a f64,
  y: &'b f64,
}

What risk is sometimes posed by using reference count types?


If two Rc types point to each other, they will keep each other's ref count above zero and neither will be freed. This is called a reference cycle.

Given two reference variables x and y, how do I check to see if they point to the same value in memory?


std::ptr::eq(x, y)

What are the two ways for closures to get data from enclosing scopes?


  1. Moves
  2. Borrowing

What are the three variants of IntoIterator implementations?


  1. Shared reference
  2. Mutable reference
  3. By value

When should you use either Path or OsStr?


For absolute and relative paths, use Path. For an individual component of a path, use OsStr.

How are bool values stored in memory, and why?


bool values are stored as a whole byte so that pointers to them may be created.

What mechanism should you reach for to allow for shared ownership of a value?


Rc<T> or Arc<T> (if sharing across multiple threads)

What mechanism allows us to mutate the value inside of an Rc<T>? What about an Arc<T>?


For an Rc<T>, interior mutability can be facilitated by a RefCell<T>. For an Arc<T>, you'd reach for a Mutex<T>.
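Both pairings side by side, as a rough sketch with hypothetical function names. The single-threaded version checks borrows at runtime through RefCell; the multi-threaded version serializes access through a Mutex.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Rc<RefCell<T>>: shared ownership plus mutation on one thread.
fn shared_single_threaded() -> Vec<i32> {
    let shared = Rc::new(RefCell::new(vec![1, 2]));
    let other_owner = Rc::clone(&shared);
    other_owner.borrow_mut().push(3); // borrow rules are checked at runtime
    let result = shared.borrow().clone();
    result
}

// Arc<Mutex<T>>: shared ownership plus mutation across threads.
fn shared_across_threads() -> i32 {
    let total = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (1..=4)
        .map(|n| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                *total.lock().unwrap() += n;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}
```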

What special ability does a Pin'd object have?


Pinned (i.e. immovable) objects can have pointers to their own fields. e.g.


struct MyFuture {
  a: i32,
  ptr_to_a: *const i32, // I point to my own `a`
}

When would you use ArcWake (from the futures crate) trait?


Use ArcWake when you need an easy way to construct a Waker.

What is the actual return type of this function?


async fn get_five() -> u8 { 5 }

It returns a value of type impl Future<Output = u8>.

How does an async function behave in terms of lifetimes if one of its arguments is a reference or a non-'static value?


Unlike regular functions, async functions whose parameters are references or otherwise non-'static return a Future that is bounded by the lifetimes of those arguments. That is, the future returned from an async fn must be .awaited while its non-'static arguments are still valid.

What's the workaround for async functions' non-static lifetime rules?


An async function's Future return value can be made 'static by moving the non-'static (or reference) values into an async block:


use std::future::Future;

// Stand-in for any async function that takes a reference
async fn borrow_x(x: &u8) -> u8 { *x }

fn work_around() -> impl Future<Output = u8> {
  async {
    let x = 5;
    borrow_x(&x).await
  }
}

What's the formal fancy term to describe Rust's form of polymorphism?


Bounded parametric polymorphism.

What's the pattern used as a way to get around the orphan rule?


The newtype pattern, which involves creating a new type in a tuple struct.

I've implemented a newtype, Wrapper, that wraps a Vec<T>, but now I can't use the Vec<T>'s built-in methods! What can I do?


Implement the Deref trait for Wrapper (with Target = Vec<T>), which lets method calls on a Wrapper fall through to the inner Vec<T>. Add DerefMut as well if the mutating methods are needed.
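Here is one way the newtype and its Deref impls could fit together. The concrete element type (String) and the Display impl are assumptions added for illustration; Display on Vec<String> is exactly the kind of impl the orphan rule would otherwise forbid.

```rust
use std::fmt;
use std::ops::{Deref, DerefMut};

// Newtype: a local tuple struct around a foreign type.
struct Wrapper(Vec<String>);

impl Deref for Wrapper {
    type Target = Vec<String>;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl DerefMut for Wrapper {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.0
    }
}

// Allowed because Wrapper is defined in this crate.
impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}
```

Calls such as w.len() and w.push(..) now resolve through Deref and DerefMut to the inner vector.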

Something about how passing by value cedes all ownership of a value. Use drop as point of reference.

How do I write an impl T function that consumes T (it will no longer be usable by others) and converts it to U?


In the function's signature, take self by value, which consumes it. (Usually, impl functions receive &self.)
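A small sketch of such a consuming conversion, using two made-up types, Celsius and Fahrenheit:

```rust
struct Celsius(f64);
struct Fahrenheit(f64);

impl Celsius {
    // `self` rather than `&self`: the call moves the Celsius value in,
    // so the caller cannot use it afterwards.
    fn into_fahrenheit(self) -> Fahrenheit {
        Fahrenheit(self.0 * 9.0 / 5.0 + 32.0)
    }
}
```

After let f = c.into_fahrenheit();, any further use of c is a compile error, because ownership moved into the method.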

How do I get the address of a value (say, a String)?



let txt = String::from("hello world");
let txt_stack = &txt as *const _; // Address of the String value itself (pointer, length, capacity) on the stack
let txt_heap = &txt.as_bytes()[0] as *const _; // Address of the first byte in the heap buffer
dbg!((txt_stack, txt_heap));

TODO: Go here and add stuff about the use of phantom/types/data


Patterns

| Pattern type | Example | Notes |
| --- | --- | --- |
| Literal | 100; "name" | Matches an exact value; the name of a const is also allowed |
| Range | 0 ..= 100; 'a' ..= 'z' | Matches any value in range, including the end value (the older ... spelling is rejected as of the 2021 edition) |
| Wildcard | _ | Matches any value and ignores it |
| Variable | name; mut count | Like _ but moves or copies the value into a new local variable |
| ref variable | ref field; ref mut field | Borrows a reference to the matched value instead of moving or copying it |
| Binding with subpattern | val @ 0 ..= 99; ref circle @ Shape::Circle { .. } | Matches the pattern to the right of @, using the variable name to the left |
| Enum pattern | Some(value); None; Pet::Orca | |
| Tuple pattern | (key, value); (r, g, b) | |
| Struct pattern | Color(r, g, b); Point { x, y }; Card { suit: Clubs, range: n }; Account { id, name, .. } | |
| Reference | &value; &(k, v) | Matches only reference values |
| Multiple patterns | 'a' \| 'A' | In match only (not valid in let, etc.) |
| Guard expressions | x if x * x <= r2 | In match only (not valid in let, etc.) |

Operator Overloading

Unary Operators

| Trait | Operator | Equivalent |
| --- | --- | --- |
| std::ops::Neg | -x | x.neg() |
| std::ops::Not | !x | x.not() |

Arithmetic Operators

| Trait | Operator | Equivalent |
| --- | --- | --- |
| std::ops::Add | x + y | x.add(y) |
| std::ops::Sub | x - y | x.sub(y) |
| std::ops::Mul | x * y | x.mul(y) |
| std::ops::Div | x / y | x.div(y) |
| std::ops::Rem | x % y | x.rem(y) |
| std::ops::AddAssign | x += y | x.add_assign(y) |
| std::ops::SubAssign | x -= y | x.sub_assign(y) |
| std::ops::MulAssign | x *= y | x.mul_assign(y) |
| std::ops::DivAssign | x /= y | x.div_assign(y) |
| std::ops::RemAssign | x %= y | x.rem_assign(y) |

Bitwise Operators

| Trait | Operator | Equivalent |
| --- | --- | --- |
| std::ops::BitAnd | x & y | x.bitand(y) |
| std::ops::BitOr | x \| y | x.bitor(y) |
| std::ops::BitXor | x ^ y | x.bitxor(y) |
| std::ops::Shl | x << y | x.shl(y) |
| std::ops::Shr | x >> y | x.shr(y) |
| std::ops::BitAndAssign | x &= y | x.bitand_assign(y) |
| std::ops::BitOrAssign | x \|= y | x.bitor_assign(y) |
| std::ops::BitXorAssign | x ^= y | x.bitxor_assign(y) |
| std::ops::ShlAssign | x <<= y | x.shl_assign(y) |
| std::ops::ShrAssign | x >>= y | x.shr_assign(y) |

Comparison Operators

| Trait | Operator | Equivalent |
| --- | --- | --- |
| std::cmp::PartialEq | x == y | x.eq(&y) |
| std::cmp::PartialEq | x != y | x.ne(&y) |
| std::cmp::PartialOrd | x < y | x.lt(&y) |
| std::cmp::PartialOrd | x > y | x.gt(&y) |
| std::cmp::PartialOrd | x <= y | x.le(&y) |
| std::cmp::PartialOrd | x >= y | x.ge(&y) |

Indexing Operators

| Trait | Operator | Equivalent |
| --- | --- | --- |
| std::ops::Index | x[y] | *x.index(y) |
| std::ops::Index | &x[y] | x.index(y) |
| std::ops::IndexMut | &mut x[y] | x.index_mut(y) |

Utility Traits

| Trait | Description |
| --- | --- |
| Drop | Destructors. Cleanup code that Rust runs automatically whenever a value is dropped. |
| Sized | Marker trait for types with a fixed size known at compile time, as opposed to types (such as slices) that are dynamically sized. |
| Clone | Types that support cloning values. |
| Copy | Marker trait for types that can be cloned simply by making a byte-for-byte copy of the memory containing the value. |
| Deref, DerefMut | Traits for smart pointer types. |
| Default | Types that have a sensible "default value". |
| AsRef, AsMut | Conversion traits for borrowing one type of reference from another. |
| Borrow, BorrowMut | Conversion traits like AsRef and AsMut that additionally guarantee consistent hashing, ordering, and equality. |
| From, Into | Conversion traits for transforming one type of value into another. |
| ToOwned | Conversion trait for converting a reference to an owned value. |

Common Standard Library Iterators

Free Functions

| Expression | Notes |
| --- | --- |
| std::iter::empty() | Returns None immediately. |
| std::iter::once(5) | Produces the given value, and then ends. |
| std::iter::repeat("#9") | Produces the given value forever. |
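A quick sketch of the three free functions in use. The function name is made up, and take(3) is added only because repeat never ends on its own.

```rust
use std::iter;

fn free_function_iterators() -> (Vec<i32>, Vec<i32>, Vec<&'static str>) {
    // Yields nothing at all.
    let nothing: Vec<i32> = iter::empty().collect();
    // Yields exactly one item.
    let single: Vec<i32> = iter::once(5).collect();
    // Yields the same item forever, so it has to be bounded.
    let hashes: Vec<&str> = iter::repeat("#9").take(3).collect();
    (nothing, single, hashes)
}
```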

std::ops::Range

| Expression | Notes |
| --- | --- |
| 1..10 | Endpoints must be an integer type to be iterable. Range includes start value, and excludes end value. |

std::ops::RangeFrom

| Expression | Notes |
| --- | --- |
| 1.. | Unbounded iteration. Start must be an integer. May panic or overflow if the value reaches the limit of the type. |

Option<T>

| Expression | Notes |
| --- | --- |
| Some(10).iter() | Behaves like a vector whose length is either 0 (None) or 1 (Some(v)). |

Result<T, E>

| Expression | Notes |
| --- | --- |
| Ok("blah").iter() | Similar to Option, producing Ok values. |

Vec<T> and &[T]

| Expression | Notes |
| --- | --- |
| | TODO |

String and &str

| Expression | Notes |
| --- | --- |
| | TODO |

std::collections::{HashMap, BTreeMap}

| Expression | Notes |
| --- | --- |
| | TODO |

std::collections::{HashSet, BTreeSet}

| Expression | Notes |
| --- | --- |
| | TODO |

std::sync::mpsc::Receiver

| Expression | Notes |
| --- | --- |
| | TODO |

std::io::Read

| Expression | Notes |
| --- | --- |
| | TODO |

std::io::BufRead

| Expression | Notes |
| --- | --- |
| | TODO |

std::fs::ReadDir

| Expression | Notes |
| --- | --- |
| std::fs::read_dir(path) | Produces directory entries. |

std::net::TcpListener

| Expression | Notes |
| --- | --- |
| listener.incoming() | Produces incoming network connections. |

Filesystem Access Functions

The following are some of the functions in std::fs and their approximate Unix equivalents. All of these functions return io::Result values. All of these functions call out directly to the operating system.

Creating and deleting

| Unix | Function | Returns |
| --- | --- | --- |
| mkdir | create_dir(path) | Result<()> |
| mkdir -p | create_dir_all(path) | Result<()> |
| rmdir | remove_dir(path) | Result<()> |
| rm -r | remove_dir_all(path) | Result<()> |
| unlink | remove_file(path) | Result<()> |

Copying, moving, and linking

| Unix | Function | Returns |
| --- | --- | --- |
| cp -p | copy(src_path, dest_path) | Result<u64> |
| rename | rename(src_path, dest_path) | Result<()> |
| link | hard_link(src_path, dest_path) | Result<()> |

Inspecting

| Unix | Function | Returns |
| --- | --- | --- |
| realpath | canonicalize(path) | Result<PathBuf> |
| stat | metadata(path) | Result<Metadata> |
| lstat | symlink_metadata(path) | Result<Metadata> |
| ls | read_dir(path) | Result<ReadDir> |
| readlink | read_link(path) | Result<PathBuf> |

Permissions

| Unix | Function | Returns |
| --- | --- | --- |
| chmod | set_permissions(path, perm) | Result<()> |
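Several of these functions chained together, as a rough sketch. The scratch directory name and the fs_round_trip function are made up for illustration, and fs::write (not listed in the tables) is used only to create a file to work on.

```rust
use std::env;
use std::fs;
use std::io;
use std::process;

// Returns (bytes copied, final file length, whether the old name still exists).
fn fs_round_trip() -> io::Result<(u64, u64, bool)> {
    // Hypothetical scratch location under the system temp directory.
    let root = env::temp_dir().join(format!("fs_notes_demo_{}", process::id()));
    let dir = root.join("a").join("b");
    let src = dir.join("src.txt");
    let copy = dir.join("copy.txt");
    let moved = dir.join("moved.txt");

    fs::create_dir_all(&dir)?;                 // mkdir -p
    fs::write(&src, b"hello")?;                // create a 5-byte file
    let bytes_copied = fs::copy(&src, &copy)?; // cp
    fs::rename(&copy, &moved)?;                // rename
    let len = fs::metadata(&moved)?.len();     // stat
    let old_name_exists = copy.exists();
    fs::remove_file(&src)?;                    // unlink
    fs::remove_dir_all(&root)?;                // rm -r

    Ok((bytes_copied, len, old_name_exists))
}
```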

Random Musings and Stuff I Should Remember

Important Crates

syn - parsing library

Usually used for writing procedural macros, typically in conjunction with quote and proc-macro.

tokio

Plus all of these crates that fall under the umbrella of tokio:

  • hyper: A fast and correct HTTP/1.1 and HTTP/2 implementation for Rust.

  • tonic: A gRPC over HTTP/2 implementation focused on high performance, interoperability, and flexibility.

  • warp: A super-easy, composable, web server framework for warp speeds.

  • tower: A library of modular and reusable components for building robust networking clients and servers.

  • tracing (formerly tokio-trace): A framework for application-level tracing and async-aware diagnostics.

  • rdbc: A Rust database connectivity library for MySQL, Postgres and SQLite.

  • mio: A low-level, cross-platform abstraction over OS I/O APIs that powers tokio.

  • bytes: Utilities for working with bytes, including efficient byte buffers.

  • loom: A testing tool for concurrent Rust code

Other Thoughts

= does not mean assignment!

Stop thinking of let x = y; as "assign the value of y to x". Rather, think of it as "move the value in memory owned by y to x (thereby giving x ownership)". For types that implement Copy, y stays usable because the value is copied instead.

Simple Operations and How-Tos

How do I...

use a range as a match subpattern?


let age = 27;
match age {
    0             => println!("I'm not born yet I guess"),
    n @ 1  ..= 12 => println!("I'm a child of age {:?}", n),
    n @ 13 ..= 19 => println!("I'm a teen of age {:?}", n),
    n             => println!("I'm an old person of age {:?}", n),
}

convert a number to a string?

The std::string::ToString trait is automatically implemented for any type that implements the Display trait. This includes all of the primitive numeric types.


let i = 5;
let five = i.to_string();
assert_eq!(five, "5");

convert a string to a number?


let num = "10".parse::<i32>().unwrap();
assert_eq!(10, num);

combine two strings?


fn greet(name: &str) {
    function_with_str_arg(&format!("Hello, {}!", name));
}
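format! is one of several ways to combine strings. The sketch below (combine is a hypothetical name) builds the same result three ways, which differ mainly in what they borrow and what they consume.

```rust
fn combine(greeting: &str, name: &str) -> (String, String, String) {
    // format! borrows both inputs and allocates a new String.
    let formatted = format!("{}, {}!", greeting, name);

    // `+` consumes the String on its left and borrows the &str on its right.
    let added = greeting.to_string() + ", " + name + "!";

    // push_str and push append in place to an existing String.
    let mut pushed = String::from(greeting);
    pushed.push_str(", ");
    pushed.push_str(name);
    pushed.push('!');

    (formatted, added, pushed)
}
```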

create a buffered reader from a File or other unbuffered type that implements Read?

To create a buffered reader for a File, do this:


BufReader::new(reader)

If you need to also set the size of the buffer, use this instead:


BufReader::with_capacity(size, reader)
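Because BufReader wraps anything that implements Read, a helper can stay generic over the source. In this sketch, count_lines and the 8 KiB capacity are arbitrary choices for illustration.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Counts lines from any unbuffered reader by wrapping it in a BufReader.
fn count_lines<R: Read>(reader: R) -> io::Result<usize> {
    let buffered = BufReader::with_capacity(8 * 1024, reader);
    let mut count = 0;
    for line in buffered.lines() {
        line?; // each item is an io::Result<String>
        count += 1;
    }
    Ok(count)
}
```

The same function accepts a std::fs::File from File::open(path)? or an in-memory byte slice such as "a\nb\n".as_bytes().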

create a buffered writer from a File or other unbuffered type that implements Write?

To create a buffered writer for a File, do this:


BufWriter::new(file)

If you need to also set the size of the buffer, use this instead:


BufWriter::with_capacity(size, writer)

convert an iterator over Result<T> into an iterator over T?

Assume we're reading lines from a reader and want to collect the lines into a vector of strings. We can do so like this, which will create a value of type Vec<String> (or return early with the first error):


let lines = reader.lines().collect::<io::Result<Vec<String>>>()?;

define a generic function whose argument is any filename type?

All three string types implement a common trait, AsRef<Path>, which makes it easy to declare a generic function that accepts "any filename type":


use std::path::Path;
use std::io;

fn open_file<P>(path_arg: P)
    -> io::Result<()>
    where P: AsRef<Path>
{
    let path = path_arg.as_ref();
    // ...
}

list the contents of a directory?


for entry_result in path.read_dir()? {
    let entry = entry_result?;
    println!("{}", entry.file_name().to_string_lossy());
}

Using type parameter... as runtime code?

pub trait DeviceCommunicationManagerCreator: Send {
  fn new(sender: Sender<DeviceCommunicationEvent>) -> Self;
}

fn add_comm_manager<T>(&self) -> Result<(), ButtplugServerStartupError>
  where
    T: 'static + DeviceCommunicationManager + DeviceCommunicationManagerCreator,
  {
    let mgr = T::new(self.sender.clone());
    ...
  }

Notes taken from the OG of Rust learning resources.

Chapter 16: Fearless Concurrency

Using Threads to Run Code Simultaneously

Waiting for all threads to finish

Spawning a thread returns a JoinHandle, which is an owned value that exposes the join method. When called, join blocks the calling (current) thread until the handled thread completes.
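A minimal sketch of spawn plus join, with sum_in_background as a made-up name. The value returned by the closure comes back through join.

```rust
use std::thread;

fn sum_in_background(values: Vec<u32>) -> u32 {
    // `move` transfers ownership of `values` into the new thread.
    let handle: thread::JoinHandle<u32> =
        thread::spawn(move || values.iter().sum::<u32>());

    // join() blocks until the thread finishes and yields its return value;
    // it is Err only if the thread panicked.
    handle.join().expect("worker thread panicked")
}
```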

Using move Closures with threads

A move closure allows data to be used from one thread in another thread. It moves ownership of the data used to the thread's environment.

Using Message Passing to Transfer Data Between Threads

Rust's major abstraction for accomplishing message-sending concurrency is the channel.

A channel has two halves: a transmitter and a receiver. Rust's implementation allows for multiple producers and a single receiver/consumer, hence the name mpsc (multiple producer, single consumer).

A channel is said to be closed if either half (sender or receiver) of a channel is dropped.

In the below code, we'll spawn a new thread that says hello to the main thread:


use std::thread;
use std::sync::mpsc;
let (tx, rx) = mpsc::channel();

thread::spawn(move || {
  let value = String::from("hello");
  tx.send(value).unwrap();
});

let received = rx.recv().unwrap();
println!("Transmitter said {}", received);

Some things to note from the example:

  • tx.send() returns a Result because it's possible that the receiving end has already been dropped.
  • tx.send(value) steals ownership! value will no longer be usable.
  • Once every sender has been dropped (for example, because the sending thread has finished) and the queued values have been received, calling recv() in the main thread will return an Err result, which indicates that no more values will be coming down the channel.

Tip: Instead of recv, you can also use try_recv, which does not block the receiving thread.
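try_recv distinguishes "nothing yet" from "channel closed". This sketch (try_recv_states is a hypothetical name) walks through all three outcomes on a single thread so the order is deterministic.

```rust
use std::sync::mpsc::{self, TryRecvError};

type Attempt = Result<i32, TryRecvError>;

fn try_recv_states() -> (Attempt, Attempt, Attempt) {
    let (tx, rx) = mpsc::channel();

    let before_send = rx.try_recv(); // nothing queued yet
    tx.send(7).unwrap();
    let after_send = rx.try_recv(); // a value is waiting
    drop(tx); // the last sender is gone, so the channel closes
    let after_close = rx.try_recv();

    (before_send, after_send, after_close)
}
```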

Sending Multiple Values (Proof of Concurrency!)

Tweaking the example above, the sender will send multiple values:


use std::thread;
use std::sync::mpsc;
use std::time::Duration;
let (tx, rx) = mpsc::channel();

thread::spawn(move || {
  let vals = vec![
    String::from("hello"),
    String::from("from"),
    String::from("the"),
    String::from("thread"),
  ];

  for val in vals {
    tx.send(val).unwrap();
    thread::sleep(Duration::from_millis(100));
  }
});

// A Receiver implements IntoIterator, so a for loop can drain it
for recvd in rx {
  println!("Sender said {}", recvd);
}

Sending Multiple Values from Multiple Transmitters

Senders can be cloned such that we can listen to messages from multiple threads:


use std::thread;
use std::sync::mpsc;
use std::time::Duration;
let (tx1, rx) = mpsc::channel();
let tx2 = mpsc::Sender::clone(&tx1);

thread::spawn(move || {
  let vals = vec![
    String::from("Thread 1: hello"),
    String::from("Thread 1: from"),
    String::from("Thread 1: thread"),
    String::from("Thread 1: UNO"),
  ];

  for val in vals {
    tx1.send(val).unwrap();
    thread::sleep(Duration::from_millis(100));
  }
});

thread::spawn(move || {
  let vals = vec![
    String::from("Thread 2: hello"),
    String::from("Thread 2: from"),
    String::from("Thread 2: thread"),
    String::from("Thread 2: DOS"),
  ];

  for val in vals {
    tx2.send(val).unwrap();
    thread::sleep(Duration::from_millis(50));
  }
});

// A Receiver implements IntoIterator, so a for loop can drain it
for recvd in rx {
  println!("Sender said {}", recvd);
}

Shared-State Concurrency

Above we saw concurrency via message communication. Now we'll look at concurrency via shared memory.

Using Mutexes to Allow Access to Data from One Thread at a Time

Concept: A mutex is a mechanism that allows only one thread to access data at any given time. To access data, a thread has to signal that it wants the mutex's lock. When it's done, it has to give the lock back to allow other threads to access the data.

Warning: A mutex cannot protect you from deadlocks! A deadlock occurs when an operation needs to lock two resources and two threads have each acquired one of the locks, causing them to wait for each other forever.

In the below example, we'll use a super simple mutex and comment the different aspects of its use:


use std::sync::Mutex;
let m = Mutex::new(5);

// We'll wrap this in an inner scope so that the guard is dropped
// (releasing the lock) and others can acquire it
{
  let mut val = m
    // Get the lock. NOTE: This method blocks!
    .lock()
    // In Rust, the Result returned by lock contains a MutexGuard,
    // which wraps the actual data
    .unwrap();

  // The guard is a smart pointer; dereference it to reach the data
  *val += 1;
}

println!("m = {:?}", m);

Concept: A call to lock() will fail if another thread panicked while holding the lock. When a mutex is in such a state, it's said that the mutex is poisoned: every later call to lock() returns an Err, although the data can still be recovered from that error with into_inner().
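Poisoning can be reproduced on purpose. In this sketch (poison_and_recover is a made-up name), a worker thread panics while holding the guard; the main thread then sees the poisoned state and still recovers the value. It assumes the default unwinding panic strategy, and the panic message will appear on stderr.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn poison_and_recover() -> (bool, i32) {
    let shared = Arc::new(Mutex::new(1));

    let worker = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let mut guard = shared.lock().unwrap();
            *guard += 1;
            // The guard is dropped during unwinding, which poisons the mutex.
            panic!("worker failed while holding the lock");
        })
    };
    // The panic surfaces here as an Err; ignore it for the demonstration.
    let _ = worker.join();

    let poisoned = shared.is_poisoned();
    // lock() now returns Err(PoisonError); into_inner() still yields the guard.
    let value = match shared.lock() {
        Ok(guard) => *guard,
        Err(poison) => *poison.into_inner(),
    };
    (poisoned, value)
}
```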

Sharing a Mutex Between Multiple Threads

In this example, we share a number behind a mutex among 10 threads. Each thread will increment the number.

use std::sync::{Mutex, Arc};
use std::thread;

fn main() {
  let counter = Arc::new(Mutex::new(0));
  let mut handles = vec![];

  for i in 0..10 {
    let counter = Arc::clone(&counter);
    handles.push(thread::spawn(move || {
      println!("Handle {} is running!", i);
      let mut val = counter.lock().unwrap();
      *val += 1;
    }));
  }

  handles.into_iter().for_each(|h| h.join().unwrap());

  println!("Result: {}", counter.lock().unwrap());
}

Notes taken from the book "Microservices with Rust".

3 - Logging & Configuring Microservices

Almost the entire ecosystem of logging in Rust is based on the log crate.

Hint: For logging something that requires an otherwise-expensive operation, wrap it using the log_enabled!(<LogLevel>) macro.

Come back to this chapter: page 54

Come back to this chapter. Especially as a reference!

Go look at the workspace member /fun-with-futures.

5 - Understanding Asynchronous Operations with Futures

Warning: The book uses the term reactor, which is now referred to as an executor in the modern futures crates.

Pattern Reactor + Promises: A reactor allows a developer to run multiple activities in the same thread, while a promise represents a delayed result that will be available later. A reactor keeps a set of promises and continues to poll each one until it is completed and its result is returned.

The Basic Types of futures

  1. Future
  2. Stream
  3. Sink

Background Tasks and Thread Pools in Microservices

(skipping earlier sections of the chapter)

Actix

The main types & traits of actix:

  • System Type: Maintains the actor system. Must be created before any other actors are spawned. System is itself an actor.
  • Actor Trait: Anything that implements Actor can be spawned.
  • Arbiter Type: An Arbiter is an event loop controller. Can only have one per thread.
  • Context Type: Every Actor works in a Context, which, to the runtime, represents an actor's environment. Can be used to spawn other tasks.
  • Address Type: Every spawned Actor has an Address, which can be used to send things to and from a targeted actor.
  • Message Trait: Types that implement Message can be sent to an actor via the send method of its Address. Message has an associated type, Result, which is the type of value that will be returned after the message is processed.
  • Handler Trait: Implemented on Actors and enables/facilitates the actor's message-handling functionality.

11 - Involving Concurrency with Actors and the Actix Crate

Notes taken from the official Rust async book.

Why Async?

Pros

Asynchronous code allows concurrent operations on the same thread. Multi-threaded code requires a lot of overhead and resources, even with minimal implementations.

Cons

Threads are natively supported and managed by the operating system, whereas async code is a language-specific implementation. Using async code also involves more complexity.

async/.await Primer

async transforms a block of code into a state machine that implements the Future trait. A blocked Future will yield control of the thread.

Concept: Async code can only run via the use of an executor. Invoking an async function will do nothing if its Future is not given to an executor like block_on.

block_on is the simplest executor. Others have more complex behavior, like scheduling multiple futures. Note it does block the current thread.

use futures::executor::block_on;

async fn hello_async() {
  println!("Hello, async!");
}

fn main() {
  let future = hello_async(); // this will do nothing but return the Future
  block_on(future);
}
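To make "the simplest executor" more concrete, here is a rough sketch of what a block_on-style function might do, written against the standard library only. This is an assumption-laden illustration, not the futures crate's actual implementation; ThreadWaker, block_on_sketch, and add_async are made-up names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// A waker that unparks the thread which is blocked on the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Hypothetical stand-in for an executor like `block_on`: poll the future on
// the current thread and park the thread whenever the future is pending.
fn block_on_sketch<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut); // pin the future to this stack frame
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(), // sleep until `wake` unparks us
        }
    }
}

// Calling this only builds a future; nothing runs until it is polled.
async fn add_async(a: u32, b: u32) -> u32 {
    a + b
}
```

Passing add_async(2, 3) to block_on_sketch drives the future to completion and returns its output, which is the "does nothing until given to an executor" behavior described above.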

Under the Hood

The Future Trait

A Future represents an asynchronous computation that can produce a value, with the poll function being at the heart of its mechanics. The poll function drives the future as far towards completion as possible.

A simplified version might look like this:


#![allow(unused)]
fn main() {
trait SimpleFuture {
  type Output;
  fn poll(&mut self, wake: fn()) -> Poll<Self::Output>;
}

enum Poll<T> {
  Ready(T), // Returned when the SimpleFuture has completed
  Pending, // Otherwise this
}
}

If poll returns Pending, it arranges for the wake function to be called when the Future is ready to make more progress. When wake is called, the executor driving the Future will call poll again so that the Future can make more progress.

The purpose of the wake callback is to tell the executor when a future can make progress. Without it, the executor would have to be constantly polling.
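As a sketch of the simplified trait in action, the block below repeats the SimpleFuture and Poll definitions and adds a toy future plus a naive driver loop. Countdown, noop_wake, and run_to_completion are hypothetical names; a real future would hand wake to whatever event source it is waiting on rather than calling it immediately.

```rust
enum Poll<T> {
    Ready(T),
    Pending,
}

trait SimpleFuture {
    type Output;
    fn poll(&mut self, wake: fn()) -> Poll<Self::Output>;
}

// Toy future: reports Pending `remaining` times, then completes.
struct Countdown {
    remaining: u32,
}

impl SimpleFuture for Countdown {
    type Output = &'static str;

    fn poll(&mut self, wake: fn()) -> Poll<Self::Output> {
        if self.remaining == 0 {
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            wake(); // signal that polling again would make progress
            Poll::Pending
        }
    }
}

fn noop_wake() {}

// Naive driver: polls in a loop and counts how many polls were needed.
fn run_to_completion<F: SimpleFuture>(mut fut: F) -> (F::Output, u32) {
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = fut.poll(noop_wake) {
            return (out, polls);
        }
    }
}
```

This driver polls constantly, which is exactly the busy-waiting that a real wake callback lets an executor avoid.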

The real Future trait is slightly different:


#![allow(unused)]
fn main() {
trait Future {
  type Output;
  fn poll(
    self: Pin<&mut Self>, // Stuck here forever
    cx: &mut Context<'_>,
  ) -> Poll<Self::Output>;
}
}

There are two key differences:

  • The future is Pin'd.
  • The wake function pointer is now Context. Using just a function pointer as before means we couldn't tell an executor which Future called wake. Context fixes that by providing access to a Waker, which wakes a specific task.

Pinned objects can store pointers to their own fields.

Task Wakeups with Waker

It's the role of a Waker to tell an executor that its future is ready to make more progress, via the wake function that it provides.

When wake is called, the task executor knows to poll the future again at the next available opportunity.

Wakers implement Clone and can be copied around and stored.
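A small sketch of those properties, assuming the standard library's std::task::Wake trait as the way to build a Waker. CountingWaker and wake_twice are made-up names; the counter only exists to make the wake calls observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

// Instead of rescheduling a task, this waker just counts wake-ups.
struct CountingWaker {
    wakes: AtomicUsize,
}

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.wakes.fetch_add(1, Ordering::SeqCst);
    }
}

// Build a Waker, clone it (as a future would when storing it for later),
// and wake through both handles. Returns the number of wake-ups seen.
fn wake_twice() -> usize {
    let state = Arc::new(CountingWaker { wakes: AtomicUsize::new(0) });
    let waker = Waker::from(Arc::clone(&state));
    let stored = waker.clone(); // clones refer to the same underlying task

    waker.wake_by_ref(); // wake without consuming the waker
    stored.wake();       // wake and consume the clone

    state.wakes.load(Ordering::SeqCst)
}
```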

Build a Timer

To get started, we'll need these imports:


#![allow(unused)]
fn main() {
use std::{
  future::Future,
  pin::Pin,
  sync::{Arc, Mutex},
  task::{Context, Poll, Waker},
  thread,
  time::Duration,
};
}

We start by just defining the future type, which needs a way for the thread to communicate that the timer has elapsed and the future should complete, for which we'll use a shared Arc<Mutex<..>>.


#![allow(unused)]
fn main() {
pub struct TimerFuture {
  shared_state: Arc<Mutex<SharedState>>,
  // ^ the arc + mutex enables communication between thread and future
}

// This is the state shared by the waiting thread and future
struct SharedState {
  // Whether the sleep time has elapsed
  completed: bool,

  // This is the waker for the task that `TimerFuture` is running on.
  // The thread can use this after setting `completed = true` to tell
  // `TimerFuture`'s task to wake up and move forward.
  waker: Option<Waker>,
}
}

Now the implementation:


#![allow(unused)]
fn main() {
impl Future for TimerFuture {
  type Output = ();

  fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
    // Check the state to see if we've already completed
    let mut shared_state = self.shared_state.lock().unwrap();

    if shared_state.completed {
      Poll::Ready(())
    } else {
      // Set waker so that the thread can wake up the current task when the
      // timer has completed, ensuring that the future is polled again and sees
      // that completed is true.
      //
      // We have to set the waker of shared_state on each poll because TimerFuture
      // can move between tasks on the executor
      // TODO: Figure out what that sentence actually means
      shared_state.waker = Some(cx.waker().clone());
      Poll::Pending
    }
  }
}
}

And finally the API for constructing a TimerFuture and starting the thread:


#![allow(unused)]
fn main() {
// Now we actually implement the timer thread
impl TimerFuture {
  pub fn new(duration: Duration) -> Self {
    let shared_state = Arc::new(Mutex::new(SharedState {
      completed: false,
      waker: None,
    }));

    // Spawn it
    let thread_shared_state = shared_state.clone();
    thread::spawn(move || {
      thread::sleep(duration);
      let mut shared_state = thread_shared_state.lock().unwrap();
      // Signal that the timer has finished and wake up the last task
      // on which the future was polled, if there is one.
      // Remember, the `shared_state.waker` is being set inside the `poll`
      // function of `TimerFuture`.
      shared_state.completed = true;
      if let Some(waker) = shared_state.waker.take() {
        waker.wake()
      }
    });

    TimerFuture { shared_state }
  }
}
}

Figure out what "However, the TimerFuture can move between tasks on the executor, which could cause a stale waker pointing to the wrong task" actually means.

Applied: Build an Executor

Concept: A future executor takes a set of top-level Futures and runs them to completion by calling their poll functions whenever the future is able to make progress.

Term: A task is just a future that can reschedule itself, usually paired with a sender so that it can requeue itself in the executor.

The process looks a bit like this:

  • Tasks that need to be run are sent to the executor over a channel
  • The executor polls each future once to get things started
  • When a task calls wake(), it schedules itself to be polled again by putting itself back onto the channel
  • The executor picks the woken-up task off the channel and calls poll again

In this process, the executor itself only needs the receiving end of the task channel. The user of the executor will get a sending end so that new futures can be spawned.

Let's create an executor for our timer. We'll need to use the ArcWake trait, which provides an easy way to construct a Waker. These are the imports we'll need, in addition to those we used with the timer future implementation section:


#![allow(unused)]
fn main() {
use {
  futures::{
    future::{BoxFuture},
    task::{waker_ref, ArcWake},
  },
  std::sync::mpsc::{sync_channel, Receiver, SyncSender},
};
}

The executor will work by sending tasks to run over a channel. It'll pull events off of the channel and run them.


#![allow(unused)]
fn main() {
/// Task executor that receives tasks from a channel and runs them
struct Executor {
  ready_queue: Receiver<Arc<Task>>,
}

/// This spawns new futures onto the task channel
#[derive(Clone)]
struct Spawner {
  task_sender: SyncSender<Arc<Task>>,
}

/// A Task is a future that can reschedule itself to be polled by an Executor
struct Task {
  // Contains an in-progress future that needs to be pushed to completion.
  // The `Mutex` is here to prove to Rust that this is thread-safe.
  future: Mutex<Option<BoxFuture<'static, ()>>>,

  // Handle to place the task itself back onto the task queue.
  task_sender: SyncSender<Arc<Task>>,
}

fn new_executor_and_spawner() -> (Executor, Spawner) {
  let (task_sender, ready_queue) = sync_channel(10_000);
  (Executor { ready_queue }, Spawner { task_sender })
}
}

Let's also add a method to Spawner that makes it easy to spawn new futures.


#![allow(unused)]
fn main() {
impl Spawner {
  fn spawn(&self, future: impl Future<Output = ()> + 'static + Send) {
    let task = Arc::new(Task {
      future: Mutex::new(Some(Box::pin(future))),
      task_sender: self.task_sender.clone(),
    });

    self
      .task_sender
      .send(task)
      .expect("Too many tasks are queued!");
  }
}
}

Now we need to implement a Waker (using ArcWake) for our Task, which will be responsible for scheduling a task to be polled again after wake is called.

Remember: Wakers have to specify which task has become ready.


#![allow(unused)]
fn main() {
impl ArcWake for Task {
  fn wake_by_ref(arc_self: &Arc<Self>) {
    // Implement `wake` by sending this task back onto the task channel
    // so that it'll be polled again by the executor.
    let cloned = arc_self.clone();
    arc_self
      .task_sender
      .send(cloned)
      .expect("Too many tasks are queued!");
  }
}
}

So now, when we create a waker from a Task, calling wake on it will send a clone of the Arc onto the task channel.

Last step is to tell our Executor how to pick up the task and poll it.


#![allow(unused)]
fn main() {
impl Executor {
  fn run(&self) {
    while let Ok(task) = self.ready_queue.recv() {
      // Take the future, and if it has not completed yet (is still Some),
      // poll it in an attempt to complete it.
      let mut future_slot = task.future.lock().unwrap();
      if let Some(mut future) = future_slot.take() {
        // Create a `WakerRef` from the task itself
        let waker = waker_ref(&task);
        let context = &mut Context::from_waker(&*waker);
        if let Poll::Pending = future.as_mut().poll(context) {
          // This future isn't done yet, so put it back in its task to be run again later
          *future_slot = Some(future);
        }
      }
    }
  }
}
}

FINALLY, we can run it:

fn main() {
  let (executor, spawner) = new_executor_and_spawner();
  // Spawn a task to print before and after waiting on a timer
  spawner.spawn(async {
    println!("Wait for itttt....");
    TimerFuture::new(Duration::new(2, 0)).await;
    println!("NOW!");
  });

  // Drop the spawner so that our executor knows it is finished
  // and won't receive anymore tasks to run.
  drop(spawner);

  // Run the executor until the task queue is empty
  executor.run();
}

async/.await

There are only two ways to use async.

  1. async functions
  2. async blocks

Both produce a value that implements the Future trait. The following functions are equivalent:


#![allow(unused)]
fn main() {
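use std::future::Future;

async fn get5() -> u8 {
  5
}

// Equivalent to get5, written as a regular function returning an async block
fn get5_desugared() -> impl Future<Output = u8> {
  async { 5 }
}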
}

async Lifetimes

Unlike regular functions, async functions that take references or other non-'static arguments return a Future which is bounded by the lifetime of those arguments. In other words, the future returned from such an async fn must be .awaited while its non-'static arguments are still valid.

async move

async move works just like move blocks used with closures.

.awaiting on a Multithreaded Executor

Futures can move freely between threads, so any value used in async stuff must be of a type that is also able to travel between threads (i.e. the type must implement Send).

Pinning

Why Pinning

Pin works in tandem with its BFF, Unpin.

Concept: Pinning makes it possible to guarantee that an object which implements !Unpin won't ever be moved.

See the Pinning block of the hidden code page for a deeper explanation.

Streams

The Stream Trait

The Stream trait is basically the love-child of Future and Iterator:


#![allow(unused)]
fn main() {
trait Stream {
  // Yielded type
  type Item;

  // Attempt to resolve the next item in the stream.
  fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>)
    -> Poll<Option<Self::Item>>;
}
}

Iteration and Concurrency

Book Notes

These pages correspond to the chapters of notes I've taken while reading this book. So far it's been an excellent resource and I'd recommend it to anyone learning Rust.

Output

Basic Types

Integer Types

u8 is used to represent single-byte values.

Characters are distinct from the numeric types (unlike C++); a char is neither a u8, nor an i8.

Values used as array access indices must be usize. The same applies to values that represent the size of arrays or vectors.

Integer literals can take a suffix indicating their type. The suffix can optionally be separated by an underscore, e.g.:

  • 42u8 is a u8 value
  • 1729isize and 1729_isize are both isize

Compiler behavior: When inferring a numeric type, the compiler will tend to favor i32.

The following prefixes can be used with numeric literals to specify their radix:

  • 0x hexadecimal
  • 0o octal
  • 0b binary

Long numeric literals may be segmented by underscores for readability, eg: 4_295_923_000_010 or 0xffff_0f0f.

Rust provides byte literals, which are character-like literals for u8 values: b'X' represents the ASCII code for the character X, but as a u8 value.

You can convert from one integer type to another using the as (type-cast) operator: 65535_u16 as i32
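The literal forms above can be checked with a few tiny functions. The function names are made up for illustration; the values are the ones used in the notes.

```rust
// Radix prefixes: hexadecimal, octal, and binary spellings of plain integers.
fn radix_literals() -> [u32; 3] {
    [0xff, 0o17, 0b1010]
}

// Type suffixes, with and without the optional underscore.
fn suffixed_literals() -> (u8, isize, isize) {
    (42u8, 1729isize, 1729_isize)
}

// Underscores inside a literal are ignored by the compiler.
fn segmented_literal() -> u32 {
    0xffff_0f0f
}

// A byte literal is a u8 holding the character's ASCII code.
fn byte_literal() -> u8 {
    b'X'
}

// `as` converts between integer types; widening u16 to i32 keeps the value.
fn widen() -> i32 {
    65535_u16 as i32
}
```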

Floating-Point Types

The fraction part of a floating-point type may consist of a lone decimal point: 5. is a valid float constant.

Compiler behavior: Given a floating-point number, the compiler will infer a type of f64.

The bool Type

bool values can be converted to integer types using the as operator:


#![allow(unused)]
fn main() {
assert_eq!(false as i32, 0);
assert_eq!(true as i32, 1);
}

But, the inverse is not true. The as operator can't convert numeric types to bool. You have to be more explicit by using a comparison: x != 0

Rust uses an entire byte for a bool value in memory, so you can create a pointer to it.

Characters

Rust's character type char represents a single Unicode character, as a 32-bit value.

A char represents a single character in isolation, whereas strings and streams of text use UTF-8 encoded bytes. This means the String type represents a sequence of UTF-8 bytes, not chars.

A char literal is just a single Unicode character wrapped in single quotes, e.g. '©'.

The as operator can convert a char to an integer type (i32, u16, etc.), but in the other direction it only works for u8. For wider integers, use std::char::from_u32, which returns an Option<char> because not every u32 is a valid Unicode code point.
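A short sketch of the conversions in both directions. The wrapper function names are hypothetical; std::char::from_u32 is the standard-library function for the checked integer-to-char direction.

```rust
// char -> integer always works with `as`.
fn char_to_int(c: char) -> u32 {
    c as u32
}

// u8 -> char is the only integer-to-char cast that `as` allows.
fn byte_to_char(b: u8) -> char {
    b as char
}

// Wider integers go through a checked conversion, which yields None for
// values that are not valid Unicode scalar values.
fn int_to_char(n: u32) -> Option<char> {
    std::char::from_u32(n)
}
```

For instance, 0xA9 is the code point of the copyright sign, while 0xD800 is a surrogate and therefore not a valid char.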

Tuples

Tuple elements cannot be accessed using dynamic indices. That is to say, given tuple t, I can't use variable i to access the ith element.

Term: The type definition () is called the unit type.

Rust uses the unit type where there's no meaningful value to carry, but the context still demands us to define a type. e.g. a function that returns no value has a return type of ().

Shorthand: A function declaration whose return type is omitted is shorthand for returning the unit type. e.g. fn my_fn(); is shorthand for fn my_fn() -> ();.

Trailing commas are acceptable in tuples. They're acceptable pretty much anywhere in Rust.

Pointer Types

Pointers in Rust are much more performant and memory-efficient than they are in GCed languages.

References

& is the immutable reference operator. It creates the reference.

&mut is the mutable reference operator.

* is the dereference operator. It accesses the value being referred to.

The type &T is pronounced "ref T", meaning "reference to a value of type T".

The expression &x creates a reference to value x. In words, we'd say that it "borrows a reference to x".

The expression *x (given that x is of type &T) refers to the value that x is a reference to.

References are immutable by default. For a reference to be mutable, it must have type &mut T.

References in Rust can never be null, so there is no equivalent of a null pointer exception.

Boxes

Boxes are pointers whose referent is allocated on the heap.

When a Box is created, enough memory is allocated on the heap to contain its value:


#![allow(unused)]
fn main() {
let v = vec![1, 2, 3, 4];
let b = Box::new(v); // allocated space on the heap to hold v
}

When a Box reference goes out of scope, both itself and the value it refers to in the heap are freed.

Raw Pointers

Raw pointers (*const T and *mut T) can only be dereferenced in unsafe code.


Arrays, Vectors, and Slices

Rust has 3 types for representing a sequence of values.

Name   | Type   | Description                       | Size    | Memory
Array  | [T; N] | Array of N values, each of type T | Fixed   | Stack
Vector | Vec<T> | Vector of Ts                      | Dynamic | Heap
Slice  | &[T]   | Shared slice of Ts                | Fixed   | Stack (as pointer to heap value)

Given any of the above types as value v, the expression v.len() gives the number of elements in v, and v[i] refers to the i'th element of v. i must be of type usize; no other integer types will work as an index.

Arrays

An array's length is built into its type and is fixed at compile time.

Implicit behavior: When working with an array value and accessing its methods, Rust implicitly converts a reference to an array to a slice. So if you need to know the methods for an array, go look at the methods for slices.
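A quick sketch of that implicit conversion: sort and contains are both defined on slices, yet they can be called directly on an array. The function names here are made up for illustration.

```rust
// `sort` is a slice method; calling it on an array works because Rust
// borrows the array as a mutable slice for the call.
fn sorted_copy(mut a: [i32; 4]) -> [i32; 4] {
    a.sort();
    a
}

// `contains` is also a slice method, reached through a shared borrow.
fn has_three(a: &[i32; 4]) -> bool {
    a.contains(&3)
}
```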

Vectors

A vector is allocated on the heap.

There are 5 main ways to create a vector:

  1. Use the vec! macro (simplest)
  2. Build a vector by repeating a given value a certain number of times using a syntax that imitates array literals:

#![allow(unused)]
fn main() {
let rows = 100;
let cols = 100;
let pixel_buffer = vec![0u8; rows * cols]; // one byte per pixel
println!("Buffer is {} bytes long.", pixel_buffer.len())
}
  3. Use Vec::new to create a new, empty vector, and push elements onto it:

#![allow(unused)]
fn main() {
let mut v = Vec::new();
v.push("hello");
v.push("vector");
println!("{:?}", v);
println!("capacity: {}", v.capacity());
}
  4. Iterators produce vectors when executed (using their .collect() method):

#![allow(unused)]
fn main() {
let v: Vec<i32> = (1..4).collect();
assert_eq!(v, [1, 2, 3]);
}
  5. If you know the size of the vector in advance, you can use Vec::with_capacity to create the vector, instead of new:

#![allow(unused)]
fn main() {
let mut v = Vec::with_capacity(2); // room for two elements up front
v.push("hello");
v.push("vector");
println!("{:?}", v);
}

Using Vec::with_capacity instead of Vec::new is more performant because it can prevent costly heap reallocations when a vector grows beyond its current capacity.

A vector's capacity() method returns the number of elements the vector could hold without reallocation.


#![allow(unused)]
fn main() {
// Track the length and capacity of a vector as values are added to it
let mut v: Vec<i32> = Vec::with_capacity(2);
println!("length/capacity: {}/{}", v.len(), v.capacity());
v.push(1);
v.push(2);
println!("length/capacity: {}/{}", v.len(), v.capacity());
v.push(3);
println!("length/capacity: {}/{}", v.len(), v.capacity());
}

As with arrays, slice methods can be used on vectors.

In stack memory, a Vec<T> consists of three values:

Stack cell                       | Stack cell                 | Stack cell
Pointer to heap-allocated buffer | The capacity of the buffer | The current occupied size of the buffer
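The three-word layout can be observed with std::mem::size_of. This is only a sketch; vec_header_words is a made-up helper, and the element type has no effect on the size of the header because the elements live in the heap buffer.

```rust
use std::mem::size_of;

// Size of a Vec<T> value itself (pointer, capacity, length), measured in
// machine words. The heap buffer it points to is not included.
fn vec_header_words<T>() -> usize {
    size_of::<Vec<T>>() / size_of::<usize>()
}
```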

Inserting or removing elements anywhere but the end of a vector is expensive.

Slices

A slice, written [T] (without specifying the length), is a region of an array or vector.

Since a slice can be any length, slices can't be stored directly in variables or passed as function arguments; they are always passed by reference.

A reference to a slice is a fat pointer.

Term: A fat pointer is a two-word value on the stack comprised of

  1. A pointer to the slice's first element
  2. The number of elements in the slice

Whereas an ordinary reference is a non-owning pointer to a single value, a reference to a slice is a non-owning pointer to several values.

A slice is (maybe?) a pseudo-generic for any sequential data type.

You can get a reference to a slice of an array, vector, or another slice by indexing it with a range:


#![allow(unused)]
fn main() {
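let v: Vec<f64> = vec![1., 2., 3.];
let a: [f64; 3] = [4., 5., 6.];
println!("{:?}", &v[0..2]); // slice of the vector's first two elements
println!("{:?}", &a[1..]);  // slice of the array, skipping the first element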
}

The term slice is often used for reference types like &[T] or &str, but that's just shorthand. Those types are called references to slices.

String Types

String Literals

String literals are enclosed in double quotes.

Term: Rust offers raw strings, written with an r prefix (and optional # marks), in which backslashes are not treated as escapes and whitespace is included verbatim. They're similar to template strings in JavaScript.


#![allow(unused)]
fn main() {
let paragraph = r#"
I'm just a regular paragraph
with the appropriate spacing.
"#;
println!("{}", paragraph);
}

Byte Strings

A string literal with the b prefix is a byte string. A byte string is a slice of u8 values (rather than Unicode text).

Strings in Memory

Rust strings are stored in memory using UTF-8 (not as arrays of chars).

A String is stored on the heap as a resizable buffer of UTF-8 text. You can think of a String as a Vec<u8> that is guaranteed to hold well-formed UTF-8.

Pronunciation: A &str is called a "stir" or "string slice".

A &str is a reference to a sequence of UTF-8 text owned by someone else.

A &str is a slice, so it is therefore a fat pointer. You can think of a &str as being nothing more than a &[u8] that is guaranteed to hold well-formed UTF-8.

A string literal is a &str that refers to preallocated text stored in read-only memory.

Any string type's length (returned by .len()) is measured in bytes, not characters.
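A short sketch of the difference between bytes and characters. The helper names are hypothetical, and the sample text uses the escape \u{e9} (an accented e) so that the two-byte character is explicit.

```rust
// Length in UTF-8 bytes, which is what `.len()` reports.
fn byte_len(s: &str) -> usize {
    s.len()
}

// Number of Unicode characters, counted by walking the string.
fn char_count(s: &str) -> usize {
    s.chars().count()
}
```

For the text "caf\u{e9}", the first three characters take one byte each and the last takes two, so the byte length is 5 while the character count is 4.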

It is impossible to modify a &str:


#![allow(unused)]
fn main() {
let mut s = "hello";
s[0] = 'c'; // &strs cannot be mutably indexed
}

String

Ways to create a String:

  • Given a &str, the .to_string() method will copy it into a String.
  • The format!() macro works just like println!(), except that it returns a new String instead of writing text to stdout, and it doesn't automatically add a newline at the end.
  • Arrays, slices, and vectors of strings have two methods that form a new String from many strings:
    1. .concat()
    2. .join(sep)

#![allow(unused)]
fn main() {
let elves = vec!["snap", "crackle", "pop"];
println!("{:?}", elves.concat());
println!("{:?}", elves.join(", "));
}

A &str can refer to either a string literal or a String, so it's the most appropriate type for function arguments when the caller should be allowed to pass either kind of string.
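As an illustration, a function that takes &str can be called with a literal or with a borrowed String. shout and demo_shout are made-up names.

```rust
// Accepts any string data by reference, regardless of where it is stored.
fn shout(s: &str) -> String {
    s.to_uppercase() + "!"
}

fn demo_shout() -> (String, String) {
    let owned = String::from("hello");
    // A literal is already a &str; a &String converts to &str automatically.
    (shout("hi"), shout(&owned))
}
```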

Unlike other languages, Rust strings are strictly Unicode only. This means that they're not always the appropriate choice for string-like data. Here are some situations where they're not the correct choice:

When you have         | Use
Unicode text          | String or &str
Filename              | std::path::PathBuf and &Path
Binary data           | Vec<u8> and &[u8]
Environment variables | OsString and &OsStr
Strings from a FFI    | std::ffi::CString and &CStr

Ownership

In Rust, every value has a single owner that determines its lifetime. When the owner is freed--aka dropped--the owned value is dropped too.

A variable owns its value. When control leaves the block in which the variable is declared, the variable is dropped, and its value along with it.

Owners and their owned values form trees. Every value in a Rust program is a member of some tree, rooted in some variable.

In its purest form, Rust's ownership model is too rigid to be usable. But, the language provides several mechanisms to make it work:

  • Values can be moved from one owner to another
  • The std library provides reference-counted pointer types--Rc and Arc--which allows a value to have multiple owners, with some restrictions
  • References can be borrowed from values; references are non-owning pointers with limited lifetimes

Moves

For values of most types, operations like assignment to variables, passing to functions, or returning from functions don't copy values: they move them.

In Rust, assignments of most types move the value from the source to the destination, leaving the source uninitialized.


#![allow(unused)]
fn main() {
let name_1 = vec!["alex", "eden"];
let name_2 = name_1;        // moves name_1's heap memory to name_2
println!("{:?}", name_1);   // fails - name_1 has become uninitialized
}

For the above to work, we have to explicitly ask for copies of the values using .clone(), which is built into most types.


#![allow(unused)]
fn main() {
let name_1 = vec!["alex", "eden"];
let name_2 = name_1.clone();
println!("{:?}", name_1);
}

More Operations that Move

If you move a value into a variable (via assignment) that was already initialized, Rust drops the variable's prior value.

Passing arguments to functions moves ownership to the function's parameters; returning a value from a function moves ownership to the caller.

Building a tuple moves ownership of the values into the tuple structure itself. The same applies to other complex types.

Keep in mind that transfer of ownership does not imply a change in the owned heap storage. Moves apply to the value proper, which for types like vectors and strings is the three-word header stored on the stack that represents the variable.

Moves and Control Flow

As a general principle, if it's possible for a variable to have had its value moved away, and it hasn't definitely been given a new value since, it's considered uninitialized.

Moves and Indexed Content

Copy Types: The Exception to Moves

In general, most types are moved. The exception are types that implement the Copy trait. In these cases, the value is copied, rather than moved. This applies to all types of moves, including passing Copy types to functions and constructors.

The standard Copy types include all the machine integers and floating-point numeric types, the char and bool types, and a few others. A tuple or fixed-size array of Copy types is itself a Copy type.

As a rule of thumb, any type that needs to do something special when a value is dropped cannot be Copy. Vectors, files, mutexes, etc. cannot be Copy types.

By default, struct and enum types are not Copy, but they can be, if their fields are themselves Copy.

To make a type Copy, add the attribute #[derive(Copy, Clone)] above its definition.

In Rust, every move is a byte-for-byte, shallow copy that leaves the source uninitialized. Copies are the same, except that the source remains initialized.
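A minimal sketch of a user-defined Copy type. Point and copy_demo are hypothetical names; the extra Debug and PartialEq derives are only there so the values can be compared.

```rust
// All fields are Copy, so the struct is allowed to derive Copy as well.
#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn copy_demo() -> (Point, Point) {
    let p = Point { x: 1, y: 2 };
    let q = p; // copied, not moved: `p` stays initialized
    (p, q)     // so both can still be used here
}
```

Without the Copy derive, the assignment to q would move p, and using p afterwards would be a compile error.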

Rc and Arc: Shared Ownership

Arc stands for atomic reference count. Rc stands for reference count.

The difference between Arc and Rc is that an Arc is safe to share between threads directly, whereas a Rc uses faster non-thread-safe code to update its reference count.

If you don't need to share pointers between threads, use Rc, rather than suffer the performance penalty of using Arc.

For any type T, an Rc<T> value is a pointer to a heap-allocated T that has had a reference count affixed to it. Cloning an Rc<T> value does not copy the T; rather, it creates another pointer to it and increments the reference count.

A value owned by an Rc pointer is immutable.
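The counting behavior can be observed with Rc::strong_count and Rc::ptr_eq. This is a small sketch; rc_counts is a made-up function name.

```rust
use std::rc::Rc;

// Returns the reference count before and after cloning, plus whether both
// handles point at the same heap allocation.
fn rc_counts() -> (usize, usize, bool) {
    let a = Rc::new(String::from("shared"));
    let before = Rc::strong_count(&a);

    let b = Rc::clone(&a); // no new String is allocated here
    let after = Rc::strong_count(&a);

    (before, after, Rc::ptr_eq(&a, &b))
}
```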

The main risk with using Rc pointers to manage memory is that if there are ever two Rc values that point to each other, each will keep the other's reference count above 0, and neither will ever be freed. This is called a reference cycle.

Example of a reference cycle

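Reference cycles can only be built with interior mutability, since a value owned by an Rc is otherwise immutable. The usual way to avoid them is to make some of the links weak pointers (std::rc::Weak), which don't keep their referent alive.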
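One standard-library tool for keeping a back-pointer without forming a cycle is std::rc::Weak. The sketch below is an illustration under assumed names (Parent, Child, weak_demo), not something taken from the book: the parent owns its children strongly, while each child only holds a weak link back.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Parent {
    // RefCell provides the interior mutability needed to add children
    // after the Rc has been created.
    children: RefCell<Vec<Rc<Child>>>,
}

struct Child {
    // A weak link does not keep the parent alive.
    parent: Weak<Parent>,
}

// Returns the parent's strong count, its weak count, and whether the
// child's back-pointer is dead once the parent handle is dropped.
fn weak_demo() -> (usize, usize, bool) {
    let parent = Rc::new(Parent { children: RefCell::new(Vec::new()) });
    let child = Rc::new(Child { parent: Rc::downgrade(&parent) });
    parent.children.borrow_mut().push(Rc::clone(&child));

    let strong = Rc::strong_count(&parent);
    let weak = Rc::weak_count(&parent);

    drop(parent); // last strong handle: the Parent is freed here
    let parent_gone = child.parent.upgrade().is_none();

    (strong, weak, parent_gone)
}
```

Had the child held an Rc<Parent> instead, the two values would keep each other alive after drop(parent).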

References

Pointers can be categorized into two types:

  1. Owning
  2. Nonowning

With owning pointers (Box<T>s, Vecs, Strings, etc), when the owner is dropped, the referent goes with it.

Nonowning pointers on the other hand have no effect on their referents' lifetimes.

Terminology: Nonowning pointer types are called references.

References must never outlive their referents. Rust refers to creating a reference to some value as borrowing the value: what gets borrowed, must eventually be returned to the owner.

References let you access values without affecting their ownership.

There are two kinds of references:

  1. Shared &T
  2. Mutable &mut T

A shared reference lets you read but not modify its referent. There is no limit to the number of shared references that can refer to the same value. Shared references are Copy.

A mutable reference lets you read and modify its referent. But, if a value is the referent of a mutable reference, you may not have any other references of any sort to the value active at the same time. Mutable references are not Copy.

The distinction between the two can be thought of as a multiple readers v. single writer rule.

When one or more shared references to a value exist, not even its owner can modify it. The value is locked.

When a mutable reference to a value exists, only the reference itself may access it; not even its owner.

Concept: When a value is passed to a function in a way that moves ownership of the value to the function, we say that it's passed by value. When a function is passed a reference to a value, we say that it's passed by reference.
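A short sketch contrasting the two calling styles. The function names are made up for illustration.

```rust
// Pass by reference: the caller keeps ownership of the vector.
fn total_by_ref(v: &[i32]) -> i32 {
    v.iter().sum()
}

// Pass by value: ownership moves into the function, which drops the
// vector when it returns.
fn total_by_value(v: Vec<i32>) -> i32 {
    v.into_iter().sum()
}

fn passing_demo() -> (i32, i32) {
    let v = vec![1, 2, 3];
    let a = total_by_ref(&v);  // `v` is only borrowed here
    let b = total_by_value(v); // `v` is moved here
    // `v` can no longer be used past this point.
    (a, b)
}
```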

References as Values

Rust References vs. C++ References

Implicit behavior: Since references are so widely used in Rust, the . operator implicitly dereferences its left operand, if needed.

Shorthand: Provided the above implicit behavior, for a reference named some_ref of type &T, where T has a field named x, the following two statements are equivalent:

  • some_ref.x
  • (*some_ref).x

Implicit behavior: The . operator will also implicitly borrow a reference to its left operand, if needed for a method call.

Shorthand: Provided the above implicit behavior, given a mutable value named v of type Vec<u64>, the following two calls to Vec's sort method are equivalent:

  • v.sort()
  • (&mut v).sort()

Assigning References

Assigning to a Rust reference makes it point at a new value:


#![allow(unused)]
fn main() {
let x = 10;
let y = 20;
let mut r = &x;
println!("r equals {}", *r);
r = &y; // assign to r
println!("r equals {}", *r);
}

References to References

Rust allows references to references, and the . operator follows as many references as it needs to find the target value:


#![allow(unused)]
fn main() {
struct Point { x: usize, y: usize }
let point = Point { x: 1000, y: 750 };
let r: &Point = &point;
let rr: &&Point = &r;
let rrr: &&&Point = &rr;
println!("x equals {}", rrr.x);
}

Comparing References

Much like the . operator, Rust's comparison operators will also "see through" any number of references as are necessary, as long as both operands have the same type.

If you actually want to know whether two references point to the same address in memory, use std::ptr::eq, which compares the references as addresses.


#![allow(unused)]
fn main() {
let x = 10;
let y = 10;
let rx = &x;
let ry = &y;
let rrx = &rx;
let rry = &ry;
println!("rrx and rry are equal? {}", rrx == rry);
println!("addresses are equal? {}", std::ptr::eq(rrx, rry));
}

References are Never Null

In Rust, if you need a value that is either a reference to something or not, use the type Option<&T>.

At the machine level, Rust represents None as a null pointer, and Some(r), where r is a &T value, as the nonzero address.
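A sketch of both points: Option<&T> as the "maybe a reference" type, and the fact that it costs no extra space compared to a plain reference. The function names are hypothetical.

```rust
use std::mem::size_of;

// Because None can be stored as the null address, wrapping a reference in
// Option does not make it any larger.
fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&i32>>() == size_of::<&i32>()
}

// Returns a reference to the first even number, or None if there is none.
fn first_even(v: &[i32]) -> Option<&i32> {
    v.iter().find(|n| **n % 2 == 0)
}
```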

Borrowing References to Arbitrary Expressions

References to Slices and Trait Objects

Term: A fat pointer is a two-word (2 * usize) value on the stack that carries the address of its referent, along with some further information necessary to put the value to use.

There are two kinds of fat pointers:

  1. Slice references
  2. Trait objects

A reference to a slice is a fat pointer:

  • 1st word: The starting address of the slice
  • 2nd word: The slice's length

Term: A trait object is a fat pointer referencing a value that implements a certain trait. A trait object carries:

  • 1st word: A value's address
  • 2nd word: A pointer to the trait's implementation appropriate to the pointed-to value for invoking the trait's methods
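The two-word layout can be observed with std::mem::size_of. This is only a sketch (pointer_sizes is a made-up helper), and it reports what current compilers do on common targets rather than proving anything about the layout:

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Sizes of a plain reference, a slice reference, and a trait object reference.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&i32>(),       // thin pointer: one word
        size_of::<&[i32]>(),     // address + length
        size_of::<&dyn Debug>(), // address + pointer to the trait implementation
    )
}

fn main() {
    let (thin, slice, trait_object) = pointer_sizes();
    println!("thin: {}, slice: {}, trait object: {}", thin, slice, trait_object);
}
```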

Reference Safety

The following sections pertain to Rust's reference rules and how it foils any attempt to break them.

Borrowing a Local Variable

Rust tries to assign each reference type in your program a lifetime that meets the constraints imposed by how it's used.

Term: A lifetime is some stretch of a program for which a reference could be safe to use; eg: a lexical block, a statement, an expression, the scope of some variable, etc.

Lifetimes are figments of Rust's imagination; they only exist as part of the compilation process and have no runtime representation.

Receiving References as Parameters

Term: A static is Rust's equivalent of a global variable (global as in lifetime, not visibility). It's a value that's created when the program starts and lasts until the program terminates.

Some rules for statics (there are more):

  • Every static must be initialized at the time of declaration
  • Mutable statics are not thread-safe and may only be accessed within an unsafe {} block
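A minimal sketch of the two rules above. The names GREETING, CALL_COUNT, and bump_call_count are invented for illustration, and the unsafe block is only sound here on the assumption that the program is single-threaded:

```rust
// An immutable static: initialized at declaration, lives for the whole program.
static GREETING: &str = "hello";

// A mutable static: every access must happen inside an `unsafe` block.
static mut CALL_COUNT: u32 = 0;

fn bump_call_count() -> u32 {
    // SAFETY (assumption): this sketch only ever runs on a single thread.
    unsafe {
        CALL_COUNT += 1;
        CALL_COUNT
    }
}

fn main() {
    let first = bump_call_count();
    let second = bump_call_count();
    println!("{}: {} then {}", GREETING, first, second);
}
```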

Syntax: The following code is a general syntax for specifying a function parameter's lifetime:

fn f<'a>(p: &'a i32) { ... }

Here, we'd say that the lifetime 'a is a lifetime parameter of f. We can read <'a> as "for any lifetime 'a", so in the above signature, we're defining f as a function that takes a reference to an i32 with any given lifetime 'a.

Passing References as Arguments

You only need to worry about lifetime parameters when defining functions and types; when using them, Rust infers the lifetimes for you.

Returning References

Implicit behavior: When a function takes a single reference as an argument and returns a single reference, Rust assumes that the two must have the same lifetime. This means that the following two signatures are equivalent:

fn smallest<'a>(v: &'a [i32]) -> &'a i32 { ... }

fn smallest(v: &[i32]) -> &i32 { ... }
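One possible body for smallest, written with the elided form of the signature. This is a sketch that assumes the slice is non-empty (indexing v[0] panics otherwise):

```rust
// The returned reference borrows from `v`, so it cannot outlive the slice.
fn smallest(v: &[i32]) -> &i32 {
    let mut s = &v[0]; // assumption: `v` has at least one element
    for r in &v[1..] {
        if *r < *s {
            s = r;
        }
    }
    s
}

fn main() {
    let parabola = [9, 4, 1, 0, 1, 4, 9];
    println!("smallest: {}", smallest(&parabola));
}
```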

Structs Containing References

Whenever a reference type appears inside another type's definition, you must write out its lifetime.

Given the above statement, we know that the following will fail to compile:


#![allow(unused)]
fn main() {
struct S {
    r: &i32
}

let x = 10;
let s = S { r: &x };
println!("{}", s.r);
}

The fix here is to provide the lifetime parameter of r in the definition of S:


#![allow(unused)]
fn main() {
struct S<'a> {
    r: &'a i32
}

let x = 10;
let s = S { r: &x };
println!("{}", s.r);
}

A type's lifetime parameters always reveal whether it contains references with interesting (that is, non-'static) lifetimes, and what those lifetimes can be.

Distinct Lifetime Parameters

When defining types or functions that hold or receive multiple references, a distinct lifetime parameter should be defined for each.

// Types
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32
}

// Functions
fn f<'a, 'b>(
    x: &'a i32,
    y: &'b i32,
) -> &'a i32 {
    x
}

Omitting Lifetime Parameters

Shorthand: If your function doesn't return any references (or other types that require lifetime parameters), then you never need to write out lifetimes for the parameters.


#![allow(unused)]
fn main() {
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32
}

fn sum_r_xy(r: &i32, s: S) -> i32 { r + s.x + s.y }
}

The above is shorthand for:

fn sum_r_xy<'a, 'b, 'c>(r: &'a i32, s: S<'b, 'c>) -> i32 { ... }

Shorthand: If there's only a single lifetime that appears among your function's parameters, then Rust assumes any lifetimes in the return must be the one defined.


#![allow(unused)]
fn main() {
fn first_third(point: &[i32; 3]) -> (&i32, &i32) {
    (&point[0], &point[2])
}
}

The above is shorthand for:

fn first_third<'a>(point: &'a [i32; 3]) -> (&'a i32, &'a i32) { ... }

Shorthand: If your function is a method on some type and takes its self parameter by reference, Rust assumes that self's lifetime is the one to give any references in the return value.


#![allow(unused)]
fn main() {
struct StringTable {
    elements: Vec<String>,
}

impl StringTable {
    fn find_by_prefix(&self, prefix: &str) -> Option<&String> {
        for i in 0 .. self.elements.len() {
            if self.elements[i].starts_with(prefix) {
                return Some(&self.elements[i]);
            }
        }
        None
    }
}
}

The above method's signature is shorthand for:

fn find_by_prefix<'a, 'b>(&'a self, prefix: &'b str) -> Option<&'a String>

Sharing vs. Mutation

Shared access

A value borrowed by shared references is read-only.

Across the lifetime of a shared reference, neither its referent, nor anything reachable from that referent, can be changed by anything.

Mutable access

A value borrowed by a mutable reference is reachable exclusively via that reference.

Across the lifetime of a mutable reference, there is no other usable path to its referent, or to any value reachable from there.

The only references whose lifetimes may overlap with a mutable reference are those you borrow from the mutable reference itself.

Expressions

Blocks and Semicolons

When you see expected type '()', look for a missing semicolon first.

Empty statements are allowed in blocks. They consist of a stray semicolon all by itself.

Declarations

Syntax: The simplest kind of declaration is a let declaration, which declares local variables:

let name: type = expr;

The type and initializer are optional. The semicolon is required.

Term: An item declaration is a declaration that could appear globally in a program or module, such as a fn, struct, or use.

When a fn is declared inside a block, its scope is the entire block (no TDZ); that is, it can be used throughout the enclosing block. But a nested fn cannot access local variables or arguments that happen to be in scope. (The alternative to nested functions is closures, which do have access to the enclosing scope.)

if and match

Expressions used as conditions in if expressions must be of type bool.

An if expression with no else block behaves exactly as though it had an empty else block.

Syntax: The general form of a match expression is:

match value {
    pattern => expr,
    ...
}

if let

Syntax: The last if form is the if let expression:

if let pattern = expr {
    block1
} else {
    block2
}

It's never strictly necessary to use if let, because match can do everything if let can do.

Shorthand: An if let expression is shorthand for a match with just one pattern:

match expr {
    pattern => { block1 }
    _ =>  { block2 }
}
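To make the equivalence concrete, here is a sketch with two hypothetical functions that should behave identically, one written with if let and one with the corresponding match:

```rust
// `if let` form: one pattern plus an else block.
fn describe_if_let(input: Option<u32>) -> String {
    if let Some(n) = input {
        format!("got {}", n)
    } else {
        String::from("nothing")
    }
}

// Equivalent `match` form: the same pattern plus a catch-all arm.
fn describe_match(input: Option<u32>) -> String {
    match input {
        Some(n) => format!("got {}", n),
        _ => String::from("nothing"),
    }
}

fn main() {
    println!("{}", describe_if_let(Some(3)));
    println!("{}", describe_match(None));
}
```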

Loops

Loops are expressions in Rust, but while and for loops don't produce useful values.

The value of a while or for loop is always (). A loop expression can produce a value if you give one to break.

Operator: The .. operator produces a range of type std::ops::Range. A range is a simple struct with two fields: start and end.

Ranges can be used with for loops because Range is an iterable type.

Term: An iterable type is a type that implements the std::iter::IntoIterator trait.
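A short sketch of a range used both as a plain struct and as the subject of a for loop (sum_range is a made-up helper):

```rust
use std::ops::Range;

// Adds up every value the range yields; the end bound is exclusive.
fn sum_range(r: Range<u32>) -> u32 {
    let mut total = 0;
    for i in r {
        total += i;
    }
    total
}

fn main() {
    let r = 1..5;
    // `start` and `end` are ordinary public fields.
    println!("start = {}, end = {}", r.start, r.end);
    println!("sum = {}", sum_range(r));
}
```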

Iterating over a mut reference provides a mut reference to each element:


#![allow(unused)]
fn main() {
let mut strings: Vec<String> = vec![
    "what's".to_string(),
    "my".to_string(),
    "line?".to_string(),
];

for rs in &mut strings {
    rs.push('\n');
}

println!("{}", strings.join(""));
}

A loop can be labeled with a lifetime. In the example below, 'search: is a label for the outer for loop. Thus break 'search exits that loop, not the inner loop (labels can also be used with continue):


#![allow(unused)]
fn main() {
'search:
for room in apartment {
    for spot in room.hiding_spots() {
        if spot.contains(keys) {
            println!("Your keys are {} in the {}.", spot, room);
            break 'search;
        }
    }
}
}

return Expressions

Shorthand: A return without a value is shorthand for return ().

Why Rust Has Loop

Expressions that don't finish normally are assigned the special type !, and they're exempt from the rules about types having to match.

Term: A function that never returns--that is, returns !--is called a divergent function.

An example of a divergent function is std::process::exit, which has the following type signature:

fn exit(code: i32) -> !;
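You can write divergent functions of your own. In this sketch, fail and port_or_die are hypothetical names; the point is that the Err arm type-checks even though fail never produces a u16, because its type is !:

```rust
// A divergent function: it always panics, so it never returns to its caller.
fn fail(msg: &str) -> ! {
    panic!("fatal: {}", msg)
}

fn port_or_die(input: &str) -> u16 {
    match input.parse::<u16>() {
        Ok(port) => port,
        // This arm has type `!`, which is exempt from matching `u16`.
        Err(_) => fail("not a valid port"),
    }
}

fn main() {
    println!("port: {}", port_or_die("8080"));
}
```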

Function and Method Calls

The difference between static and nonstatic methods is the same as in OO languages: nonstatic methods are called on values (like my_vec.len()), and static methods are called on types themselves (like Vec::new()).

It's considered good style to omit types whenever they can be inferred.

Fields and Elements

Fields of a struct are accessed using the familiar . operator. Tuples are the same, except that their fields have numbers rather than names.

Square brackets access the elements of an array, a slice, or a vector.

Reference Operators

Operator: The unary * operator is used to access the referent of a reference.

The * operator is only necessary when we want to read or write the entire value that the reference points to.

Arithmetic, Bitwise, Comparison, and Logical Operators

Warning: Dividing an integer by zero triggers a panic, even in release builds.

Integers have a method a.checked_div(b) that returns Option<I> and never panics. If ever there's the slightest possibility that an integer will be divided by zero, use checked_div.
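A minimal sketch of checked_div in use; average is a made-up helper:

```rust
// Returns None instead of panicking when `count` is zero.
fn average(total: i32, count: i32) -> Option<i32> {
    total.checked_div(count)
}

fn main() {
    for count in [2, 0] {
        match average(10, count) {
            Some(avg) => println!("average: {}", avg),
            None => println!("cannot divide by {}", count),
        }
    }
}
```

checked_div also returns None when the division itself would overflow, as in i32::MIN.checked_div(-1).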

There is no unary + operator.

Assignment

Syntax: Rust does not support chained assignment. a = b = c will not work.

Syntax: Rust does not have increment and decrement operators: ++ and --.

Type Casts

Casting an integer to another integer type is always well-defined. Converting to a narrow type results in truncation.
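A sketch of what truncation means in practice (low_byte is a made-up helper): casting to a narrower type keeps only the low-order bits of the value.

```rust
// Keeps only the lowest 8 bits of `n`.
fn low_byte(n: i32) -> u8 {
    n as u8
}

fn main() {
    println!("{}", low_byte(7));   // fits in a u8, so the value is unchanged
    println!("{}", low_byte(300)); // 300 is 0x12C; only 0x2C survives
    println!("{}", low_byte(-1));  // all bits set; the low byte is 0xFF
}
```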

Casting a large floating-point value to an integer type that is too small to represent it was undefined behavior in older versions of Rust. Since Rust 1.45, such casts saturate: the result is the closest value the integer type can represent, and NaN becomes 0.

Implicit behavior: Values of type &String auto-convert to type &str without a cast.

Implicit behavior: Values of type &Vec<i32> auto-convert to &[i32].

Implicit behavior: Values of type &Box<T> auto-convert to &T.

Term: The above implicit behaviors are called deref coercions, because they apply to types that implement the Deref built-in trait. The purpose of Deref coercion is to make smart pointer types, like Box, behave as much like the underlying value as possible.
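The three coercions above can be seen in a sketch like this one, where shout, total, and double are hypothetical functions that accept the borrowed target types, and each call site passes a reference to the owning type:

```rust
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn double(n: &i32) -> i32 {
    *n * 2
}

fn main() {
    let name = String::from("rust");
    let nums = vec![1, 2, 3];
    let boxed = Box::new(21);

    println!("{}", shout(&name));   // &String coerces to &str
    println!("{}", total(&nums));   // &Vec<i32> coerces to &[i32]
    println!("{}", double(&boxed)); // &Box<i32> coerces to &i32
}
```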

Error Handling

There are two types of error-handling in Rust:

  1. panic
  2. Results

Ordinary errors are handled using Results.

Panic is the bad kind; it's for errors that should never happen.

Panic

Term: A program panics when it encounters something so messed up that there must be a bug in the program itself.

These are some things that can cause a panic:

  • Out-of-bounds array access
  • Integer division by zero
  • Calling .unwrap() on an Option that happens to be None
  • Assertion failure

Behavior: When a program panics, you can choose one of two ways that it'll be handled:

  1. Unwind the stack (this is the default)
  2. Abort the process

Unwinding

Process of a panic-triggered unwinding

  1. An error message is printed to the terminal.
  2. The stack is unwound.
  3. Any temporary values, local variables, or arguments that the current function was using are dropped, in the reverse of the order they were created. This propagates upwards through the unwound stack.
  4. The thread exits. If the panicking thread was the main thread, then the whole process exits (with a nonzero exit code).

A panic is not a crash, nor undefined behavior. A panic's behavior is well-defined and safe.

Behavior: Panics occur per thread. One thread can be panicking while other threads are going on about their normal business.

It's possible to catch stack unwinding using the standard library function std::panic::catch_unwind(), which allows the thread to survive and continue running.
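A minimal sketch of catching an unwind, using a made-up helper risky_divide. It assumes the default unwinding behavior; the panic message is still printed to stderr before the unwind is caught:

```rust
use std::panic;

// Runs the division inside catch_unwind; a panic becomes None.
fn risky_divide(a: i32, b: i32) -> Option<i32> {
    // catch_unwind returns Err(..) if the closure panicked.
    panic::catch_unwind(|| a / b).ok()
}

fn main() {
    println!("{:?}", risky_divide(10, 2));
    println!("{:?}", risky_divide(1, 0));
    println!("still running after the panic");
}
```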

Aborting

Behavior: Stack unwinding is the default panic behavior, but there are two circumstances in which Rust does not try to unwind the stack:

  1. If a .drop() method triggers a second panic while Rust is still trying to clean up after the first. The process will be aborted.
  2. If you compile with the -C panic=abort flag, the first panic in the program immediately aborts the process. (This can be used to reduce compiled code size.)

Result

Rust doesn't have exceptions.

Catching Errors

The most thorough way of dealing with errors via Result is using a match expression:


#![allow(unused)]
fn main() {
match get_weather(hometown) {
    Ok(report) => {
        display_weather(hometown, &report);
    }
    Err(err) => {
        println!("error querying the weather: {}", err);
        schedule_weather_retry();
    }
}
}

matches can be a bit verbose. But, Result comes with a ton of useful methods for more concise handling.

Add notes about the `Result` methods starting on page 148.

Result Type Aliases

Sometimes you'll see Rust documentation that seems to omit the error type of a Result. In such cases, a Result type alias (a type alias is a shorthand for type names) is being used:

fn remove_file(path: &Path) -> Result<()>
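Defining such an alias yourself might look like the sketch below. The config module, ConfigError, and parse_port are invented for illustration; the alias fixes the error type so that signatures only need to name the success type:

```rust
mod config {
    // Hypothetical error type for this module.
    #[derive(Debug)]
    pub struct ConfigError(pub String);

    // Module-local alias: `Result<T>` here means `Result<T, ConfigError>`.
    pub type Result<T> = std::result::Result<T, ConfigError>;

    pub fn parse_port(text: &str) -> Result<u16> {
        text.trim()
            .parse::<u16>()
            .map_err(|e| ConfigError(format!("bad port {:?}: {}", text, e)))
    }
}

fn main() {
    match config::parse_port("8080") {
        Ok(port) => println!("port {}", port),
        Err(err) => println!("error: {:?}", err),
    }
}
```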

Printing Errors

All error types implement a common trait: std::error::Error.

Warning: Printing an error value does not also print out its cause. If you want to print all available information for an error, use the print_error function defined below.


#![allow(unused)]
fn main() {
use std::error::Error;
use std::io::{Write, stderr};
/// Dump an error message to `stderr`
/// If another error occurs in the process, ignore it
fn print_error(mut err: &dyn Error) {
    let _ = writeln!(stderr(), "error: {}", err);
    while let Some(cause) = err.source() {
        let _ = writeln!(stderr(), "caused by: {}", cause);
        err = cause;
    }
}
}

Crate: The standard library's error types do not include a stack trace, but the error-chain crate makes it easy to define your own custom error type that supports grabbing a stack trace when it's created. It uses the backtrace crate to capture the stack.

Propagating Errors

Operator: You can add a ? to any expression that produces a Result. The behavior of ? depends on the state (Ok or Err) of the Result:

  • If Ok, it unwraps the Result to get the success value inside
  • If Err, it immediately returns from the enclosing function, passing the error result up the call chain (see the rule below)

Rule: The ? can only be used in functions whose return value is of type Result.
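A small sketch of ? in a function that returns a Result (add_strs is a made-up name). Each ? either yields the parsed number or returns the parse error to the caller:

```rust
use std::num::ParseIntError;

// Parses both inputs and adds them; the first parse failure is returned as-is.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", add_strs("2", " 3 "));
    println!("{:?}", add_strs("2", "three"));
}
```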

Working with Multiple Error Types

Some functions have the potential to return errors of many different types (depending on the operation that triggered the error).

There are several approaches to dealing with multiple error types:

  1. Conversion: Define a custom error type (say, CustomError) and implement conversions from io::Error to the custom error type.
  2. Box 'em up: The simpler approach is to use pointers. All error types can be converted to the type Box<dyn std::error::Error>, which represents "any error", so we can define a set of generic type aliases that cover all possible errors. This is the most idiomatic approach.

For generalizing all errors and results, define these type aliases:


#![allow(unused)]
fn main() {
type GenError = Box<dyn std::error::Error>;
type GenResult<T> = Result<T, GenError>;
}

Tip: To convert any error to the GenError type, call GenError::from().

The downside of the GenError approach is that the return type no longer communicates precisely what kinds of errors the caller can expect.

Tip: If you want to handle one particular kind of error, but let all others propagate out, use the generic method error.downcast_ref::<ErrorType>(). This is called error downcasting.
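Putting the pieces together, here is a sketch that uses the aliases, GenError::from, and downcast_ref. The functions parse_non_negative and is_parse_error are hypothetical, and the aliases are written with the dyn keyword that current Rust expects:

```rust
use std::error::Error;
use std::num::ParseIntError;

type GenError = Box<dyn Error>;
type GenResult<T> = Result<T, GenError>;

// Two different failures share one return type.
fn parse_non_negative(text: &str) -> GenResult<i32> {
    let n: i32 = text.parse()?; // ParseIntError is converted to GenError by `?`
    if n < 0 {
        return Err(GenError::from("negative numbers are not allowed"));
    }
    Ok(n)
}

// Downcasting: check whether the boxed error is really a ParseIntError.
fn is_parse_error(err: &GenError) -> bool {
    err.downcast_ref::<ParseIntError>().is_some()
}

fn main() {
    for input in ["7", "abc", "-1"] {
        match parse_non_negative(input) {
            Ok(n) => println!("{} is fine", n),
            Err(e) => println!("{:?} failed ({}), parse error: {}", input, e, is_parse_error(&e)),
        }
    }
}
```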

Dealing with Errors That "Can't Happen"

Operator: Instead of the ? operator, which requires implementing error-handling, we can use the .unwrap() method of a Result to get the Ok value.

Warning: The difference between ? and .unwrap() is that if .unwrap() is used on a Result that's in its Err state, the process will panic. In other words, only use .unwrap() when you're damn sure the Result is Ok.

Ignoring Errors

Idiom: If we really don't care about the contents of a Result, we can use the following idiomatic statement to silence warnings about unused results:

let _ = writeln!(stderr(), "error: {}", err);

Handling Errors in main()

If you propagate an error long enough, eventually it'll reach the root main() function, at which point, it can no longer be ignored.

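Info: When main is declared without a return type, the ? operator cannot be used in it, because its return type is not a Result. Instead, use .expect(). (Since Rust 1.26, main may also be declared to return a Result, in which case ? works there too.)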

Behavior: Panicking in the main thread prints an error message, then exits with a nonzero exit code.

Crates and Modules

Crates

The easiest way to see what crates are and how they work is to use cargo build with the --verbose flag to build an existing project that has some dependencies.

When compiling libraries, Cargo uses the --crate-type lib option. This tells rustc not to look for a main() function but instead to produce a .rlib file containing compiled code in a form that later rustc commands can use as input.

When compiling a program, Cargo uses --crate-type bin, and the result is a binary executable for the target platform.

With each rustc command, Cargo passes --extern options giving the filename of each library the crate will use. This ties directly into the extern crate some_crate; statements in source code.

Command: The command cargo build --release will produce an optimized release build.

Qualities of a release build:

  • Run faster
  • Compile slower
  • Don't check for integer overflow
  • Skip debug_assert!() assertions
  • Less reliable and verbose stack traces

Build Profiles

The following CLI commands are used to select a rustc profile:

Command                  Cargo.toml section used
cargo build              [profile.dev]
cargo build --release    [profile.release]
cargo test               [profile.test]

Behavior: If no profile is specified, [profile.dev] is selected by default.

Tip: To get the best data from a profiler, you need both optimizations and debug symbols to be enabled. To do so, add this to your Cargo.toml:

[profile.release]
debug = true # enable debug symbols in release builds

Modules

Concept: Modules are Rust's namespaces. Whereas crates are about code sharing between projects, modules are about code organization within a project.

Term: A module is a collection of items.

Behavior: Any module item not marked pub is private.

Behavior: Modules can be nested. It's common to see a module that's a collection of submodules:


#![allow(unused)]
fn main() {
mod life {
    pub mod animalia {
        pub mod mammalia {}
    }
    pub mod plantae {}
    pub mod fungi {}
    pub mod protista {}
    pub mod archaea {}
    pub mod bacteria {}
}
}

It's generally advised not to keep all source code in a single massive file of nested modules. For obvious reasons.

Modules in Separate Files

Behavior: Declaring a module without a body, like mod life;, tells the compiler that the life module lives in a separate file called life.rs.

When you build a Rust crate, you're recompiling all of its modules, regardless of where those modules live.

Behavior: A module can have its own directory. When Rust sees mod life;, it checks for both life.rs and life/mod.rs.

Concept: A mod.rs file is exactly like a barrel (index.js) in JS/TS.

Paths and Imports

Operator: The :: operator is used to access the items of a module. e.g. life::animalia::mammalia::....

Paths to items can be either relative or absolute. An absolute path is prefixed by :: and can be used to access "global" items, e.g. ::std::mem::swap is an absolute path.

Concept: Accessing an absolute path is a lot like accessing the global object in JS, ie window, in a browser.

Operator: The use declaration creates aliases to modules and items throughout the enclosing block or module, e.g. use std::mem; creates a local alias to ::std::mem's items.

It's generally considered best style to import types, traits, and modules, then use relative paths to access the items within them.

Several items from the same module can be imported at once, as can all items:


#![allow(unused)]
fn main() {
use std::collections::{HashMap, HashSet}; // import just two items
use std::io::prelude::*;                  // import all items
}

Operator: Modules do not automatically inherit items from their parent modules. The super keyword can be used as an alias for the parent module, and self is an alias for the current module.

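Submodules can access private items in their parent modules, for example by importing them by name. In older versions of Rust, use super::*; imported only the pub items; in current versions, a glob import brings in every item the submodule is allowed to see.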
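A sketch of super and self in a nested module. The shapes module and its functions are hypothetical; note that unit is private to shapes, yet the submodule can import it by name:

```rust
mod shapes {
    // Private to `shapes`, but visible to its submodules.
    fn unit() -> f64 {
        1.0
    }

    pub mod square {
        use super::unit; // import a private item from the parent by name

        pub fn area(side: f64) -> f64 {
            side * side * unit()
        }

        pub fn double_area(side: f64) -> f64 {
            2.0 * self::area(side) // `self` names the current module
        }
    }
}

fn main() {
    println!("{}", shapes::square::area(3.0));
    println!("{}", shapes::square::double_area(3.0));
}
```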

Modules aren't the same thing as files, but there are some analogies between module paths and file paths:

Module path     File path    Description
self            "."          Accesses the current module
super           ".."         Accesses the parent module
extern crate                 Similar to mounting a filesystem

The Standard Prelude

Implicit Behavior: The standard library std is automatically linked with every project, as are some items from the standard prelude like Vec and Result. It's as though the following imports are invisibly added to all files:


#![allow(unused)]
fn main() {
extern crate std;
use std::prelude::v1::*;
}

Convention: Naming a module prelude tells users that it's meant to be imported using *.

Items: The Building Blocks of Rust

Items make up the composition of modules. The list of items is really a list of Rust's features as a language:

Items           Keywords
Functions       fn
Types           struct, enum, trait
Type aliases    type
Methods         impl
Constants       const, static
Modules         mod
Imports         use, extern crate
FFI blocks      extern

Item: Types

User-defined types are introduced using struct, enum, and trait keywords.

A struct's fields, even private fields, are accessible throughout the module where the struct is declared. Outside of the module, only pub fields are visible.

Item: Methods

An impl block can't be marked pub; rather its methods can be marked pub individually.

Private methods, like private struct fields, are visible throughout the module where they're declared.

Item: Constants

The const keyword introduces a constant. const syntax is just like let, except that the type must be specified, and it may or may not be marked pub.

Convention: UPPER_CASE_NAMES are conventional for naming constants.

Concept: A const is a bit like the #define preprocessor directive in C++, and as such it should be used for specifying magic numbers and strings.

Item: Imports

Even though use and extern crate declarations are just aliases, they can also be marked pub. In fact, the standard prelude is written as a big series of pub imports.

Item: FFI blocks

extern blocks declare a collection of functions written in some other language so that they can be called from Rust.

Turning a Program into a Library

These are roughly the steps to convert a program into a library:

  1. Change the name of src/main.rs to src/lib.rs
  2. Add the pub keyword to public features of the library.
  3. Move the main function to a temporary file somewhere.

Term: The code in src/lib.rs forms the root module of the library. Other crates that use your library can only access the public items of this root module.

The src/bin Directory

Cargo has built-in support for small programs that live in the same codebase as a library.

The main() function that we stowed away in the above steps for converting our code to a library should be moved to a file named src/bin/my_program.rs. The file then needs to import the library as it would any other crate:


#![allow(unused)]
fn main() {
extern crate my_library;
use my_library::{feature_1, feature_2};
}

That's it!

Attributes

Click here.

Tests and Documentation

Tests are just ordinary functions marked with the #[test] attribute.

To test error cases, add the #[should_panic] attribute to your test. This tells the compiler that we expect this test to panic:


#![allow(unused)]
fn main() {
#[test]
#[should_panic(expected="divide by zero")]
fn test_divide_by_zero_error() {
    1 / 0;
}
}

Convention: When your tests get substantial enough to require support code, the convention is to put them in a tests module and declare the whole module to be testing-only using the #[cfg(test)] attribute.

Integration Tests

Term: Integration tests are .rs files that live in a tests directory alongside your project's src directory. When you run cargo test, Cargo compiles each integration test as a separate, standalone crate, linked with your library and the Rust test harness.

Since integration tests use your program as if it were a separate crate, you must add extern crate my_library; to them.

Documentation

Come back to this

Doc-Tests

Come back to this

Specifying Dependencies

Generally in a Cargo.toml, you're used to seeing items in the [dependencies] section that are specified by version number and look like this:

[dependencies]
num = "0.1.42"

The above convention is fine, but it only allows use of crates published on crates.io.

Remote Git Dependencies

To use a dependency by referencing a git repo, specify it like this:

my_crate = { git = "https://github.com/Me/my_crate.git", rev = "093f84c" }

Local Dependencies

To use a dependency by referencing a local crate, specify it like this:

my_crate = { path = "../path/to/my_crate" }

Versions

Come back to this.

Cargo.lock

Cargo upgrades dependencies to newer versions only when you tell it to, using cargo update, in which case it only upgrades to the latest dependency versions that are compatible with what's specified in Cargo.toml.

Publishing Crates to crates.io

The command cargo package creates a file containing all your library's source files, including Cargo.toml, which is what will be uploaded to crates.io.

Before publishing, you have to log in locally using cargo login <API key>. Go here to get an API key.

Workspaces

Given a root directory that contains a collection of crates, you can save compilation time and disk space by creating a workspace.

All that's needed is a Cargo.toml file in the root directory:

[workspace]
members = ["my_first_crate", "my_second_crate"]

With that, you're free to delete any Cargo.lock files and target directories that exist in the subdirectories. The Cargo.lock file and all compiled resources will be grouped at a single location in the root directory.

With workspaces, cargo build --all in any crate will build all crates in the root directory. The same goes for cargo test and cargo doc.

Structs

Rust has three kinds of structures:

  1. Named-field
  2. Tuple-like
  3. Unit-like

Term: The values contained within a struct, regardless of struct type, are called components.

A named-field struct gives a name to each component. A tuple-like struct identifies them by the order in which they appear. Unit-like structs have no components at all.

Structs are private by default, visible only in the module where they're declared. The same goes for their fields.

Named-Field Structs

Convention: All types, structs included, should have names in PascalCase.

Term: A struct expression is an expression that constructs a struct type:


#![allow(unused)]
fn main() {
// struct expression (similar to a constructor)
let r = std::ops::Range {
    start: 0,
    end: 9,
};
println!("Range length: {}", r.len());
}

Operator: If in a struct expression, the named fields are followed by ..EXPR, then any fields not mentioned take their values from EXPR, which must be another value of the same struct type.


#![allow(unused)]
fn main() {
let range_1 = 0..100;
println!("First range length: {}", range_1.len());
let range_2 = std::ops::Range {
    start: 50,
    ..range_1
};
println!("Second range length: {}", range_2.len());
}

Tuple-Like Structs

Term: The values held by a tuple-like struct are called elements.

Implicit Behavior: When you define a tuple-like struct, you implicitly create a function that constructs it:

struct Bounds(usize, usize);

This implicitly creates the following function:

fn Bounds(el0: usize, el1: usize) -> Bounds { ... }

Tuple-like structs are good for newtypes.

Term: Structs with a single component that you define to get stricter type checking are called newtypes.
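A sketch of newtypes in action, with made-up types UserId and OrderId. Both wrap a u32, but the compiler treats them as distinct types, and the implicitly created constructor function can be passed to map:

```rust
#[derive(Debug, PartialEq)]
struct UserId(u32);

#[derive(Debug, PartialEq)]
struct OrderId(u32);

// Only accepts a UserId; passing an OrderId would be a compile-time error.
fn user_label(id: &UserId) -> String {
    format!("user-{}", id.0)
}

// `UserId` is used here as a function from u32 to UserId.
fn make_user_ids(raw: Vec<u32>) -> Vec<UserId> {
    raw.into_iter().map(UserId).collect()
}

fn main() {
    let ids = make_user_ids(vec![1, 2, 3]);
    let order = OrderId(1);
    println!("{:?} {:?}", ids, order);
    println!("{}", user_label(&ids[0]));
}
```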

Unit-Like Structs

A unit-like struct occupies no memory, much like the unit type (). Unit-like structs are generally helpful when defining traits.

Struct Layout

In memory, both named-field and tuple-like structs are the same thing.

Defining Methods with impl

Concept: Rather than appearing inside the struct definition, as in C++ or Java, Rust methods appear in a separate impl block.

An impl block is a collection of fn definitions, each of which becomes a method on the struct type named at the top of the block.


#![allow(unused)]
fn main() {
#[derive(Debug)]
struct Point { x: i64, y: i64 }

impl Point {
    // Static method that creates a new point from x and y coordinates
    fn of(x: i64, y: i64) -> Self {
        Point { x, y }
    }
    // Method that swaps the x and y values
    fn inverse(self: &Self) -> Self {
        Point { x: self.y, y: self.x }
    }
    // Method that adds another point to this one in place
    fn add(self: &mut Self, add: &Self) {
        self.x = self.x + add.x;
        self.y = self.y + add.y;
    }
}

let mut p1 = Point::of(1, 2);
println!("{:?}", p1);
let p2 = p1.inverse();
println!("{:?}", p2);
p1.add(&p2);
println!("{:?}", p1);
}

Term: Methods defined in impls are called associated functions, since they're associated with a specific type. The opposite of an associated function (one not associated with any type) is called a free function.

Shorthand: Inside an impl block, Rust automatically defines Self as a type alias for the type the impl block is associated with.

A Rust method must explicitly use self to refer to the value it was called on.

Implicit Behavior: When you call a method, you don't need to borrow a mutable reference yourself; the ordinary method call syntax takes care of that implicitly. For example, in the above code, we call p1.add(&p2). This is the same as if we had called (&mut p1).add(&p2).

Term: Methods in impl blocks that don't take self as an argument become functions associated with the struct type itself, rather than a specific value of the type. These methods are called static methods. In the above code, Point::of is a static method of struct type Point.

Convention: It's conventional in Rust for static constructor functions to be named new.

Although you can have many separate impl blocks for a single type, they must all be in the same crate that defines that type.

Generic Structs

Term: In generic struct definitions, the type names used in <angle brackets> are called type parameters.

Pronunciation: You can read the line impl<T> Queue<T> as something like "for any type T, here are some methods available on Queue".

Operator: For static method calls whose generic type parameter cannot be inferred, you can use the turbofish ::<> operator to specify the type:


#![allow(unused)]
fn main() {
let mut q = Queue::<char>::new();
}
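The notes above refer to a Queue type without showing it. The following is a simplified, hypothetical stand-in (a first-in, first-out queue backed by a single Vec), meant only to illustrate a type parameter, an impl<T> block, and the turbofish:

```rust
// Hypothetical generic queue; `T` is the type parameter.
pub struct Queue<T> {
    items: Vec<T>,
}

// "For any type T, here are some methods available on Queue<T>."
impl<T> Queue<T> {
    pub fn new() -> Self {
        Queue { items: Vec::new() }
    }

    pub fn push(&mut self, item: T) {
        self.items.push(item);
    }

    // Removes and returns the oldest item, if any.
    pub fn pop(&mut self) -> Option<T> {
        if self.items.is_empty() {
            None
        } else {
            Some(self.items.remove(0))
        }
    }

    pub fn is_empty(&self) -> bool {
        self.items.is_empty()
    }
}

fn main() {
    // Nothing else pins down T here, so the turbofish supplies it.
    let mut q = Queue::<char>::new();
    q.push('a');
    q.push('b');
    println!("{:?} {:?} {}", q.pop(), q.pop(), q.is_empty());
}
```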

Structs with Lifetime Parameters

Just as structs can have generic type parameters, they can have lifetime parameters as well.

Pronunciation: You can read the line struct Extrema<'elt> as something like, "given any specific lifetime 'elt, you can make an Extrema<'elt> that holds references with that lifetime".

Rust always infers lifetime parameters for calls.

Shorthand: Because it's so common for the return type to use the same lifetime as an argument, Rust lets us omit the lifetimes when there's one obvious candidate.
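The Extrema type mentioned above is not defined in these notes, so the following is a reconstruction under assumptions: a struct holding references to the greatest and least elements of a slice, and a find_extrema function that assumes the slice is non-empty. The only lifetime candidate for the return value is the slice's, so it can be left implicit with '_:

```rust
struct Extrema<'elt> {
    greatest: &'elt i32,
    least: &'elt i32,
}

// The full form would be: fn find_extrema<'s>(slice: &'s [i32]) -> Extrema<'s>
fn find_extrema(slice: &[i32]) -> Extrema<'_> {
    let mut greatest = &slice[0]; // assumption: `slice` is not empty
    let mut least = &slice[0];
    for value in &slice[1..] {
        if *value < *least {
            least = value;
        }
        if *value > *greatest {
            greatest = value;
        }
    }
    Extrema { greatest, least }
}

fn main() {
    let a = [0, -3, 0, 15, 48];
    let e = find_extrema(&a);
    println!("least = {}, greatest = {}", e.least, e.greatest);
}
```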

Interior Mutability

Interior mutability is the principle of making a bit of data mutable inside an otherwise immutable value.

The two most straightforward mechanisms for implementing interior mutability are Cell<T> and RefCell<T>.

A Cell<T> is a struct that contains a single private value of type T. The only special thing about a Cell is that you can get and set the field even if you don't have mut access to the Cell itself.

Warning: Cells, and any types that contain them, are not thread-safe.
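A minimal sketch of a Cell inside an otherwise immutable value. Robot and its methods are made-up names; the thing to notice is that add_hardware_error takes &self, not &mut self, and still updates the counter:

```rust
use std::cell::Cell;

struct Robot {
    hardware_error_count: Cell<u32>,
}

impl Robot {
    fn new() -> Self {
        Robot { hardware_error_count: Cell::new(0) }
    }

    // Takes `&self`, yet still changes the counter through the Cell.
    fn add_hardware_error(&self) {
        let n = self.hardware_error_count.get();
        self.hardware_error_count.set(n + 1);
    }

    fn has_hardware_errors(&self) -> bool {
        self.hardware_error_count.get() > 0
    }
}

fn main() {
    let robot = Robot::new(); // note: not declared `mut`
    robot.add_hardware_error();
    println!("errors seen: {}", robot.has_hardware_errors());
}
```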

Come back to this. Probably won't need it for a while.

Enums and Patterns

Enums

Term: The values that comprise enums are called variants or constructors.

As with structs, the compiler will implement features like the == operator for you, but you have to ask. Also as with structs, enums can have methods within impl blocks.


#![allow(unused)]
fn main() {
#[derive(Copy, Clone, Debug, PartialEq)]
enum TimeUnit {
    Seconds, Minutes, Hours, Days, Months, Years,
}
impl TimeUnit {
    // Return the plural noun for this time unit
    fn plural(self) -> &'static str {
        match self {
            TimeUnit::Seconds => "seconds",
            TimeUnit::Minutes => "minutes",
            TimeUnit::Hours => "hours",
            TimeUnit::Days => "days",
            TimeUnit::Months => "months",
            TimeUnit::Years => "years",
        }
    }

    // Return the singular noun for this time unit
    fn singular(self) -> &'static str {
        // trim_end_matches replaces the deprecated trim_right_matches
        self.plural().trim_end_matches('s')
    }
}
}

Enums with Data

Term: Enum constructors that take arguments that resemble tuples are called tuple variants. The constructors that take struct arguments are called struct variants. Constructors that take no arguments are called unit-like variants.


#![allow(unused)]
fn main() {
// Enum with tuple variants
enum RoughTime {
    InThePast(TimeUnit, u32),
    JustNow,
    InTheFuture(TimeUnit, u32),
}
}

#![allow(unused)]
fn main() {
// Enum with struct variants
enum Shape {
    Sphere { center: Point3d, radius: f32 },
    Cuboid { corner1: Point3d, corner2: Point3d }
}
}

A single enum can have variants of all three kinds:


#![allow(unused)]
fn main() {
// Enum with unit-like, tuple, and struct variants
enum RelationshipStatus {
    Single,
    InARelationShip,
    ItsComplicated(Option<String>),
    ItsExtremelyComplicated {
        car: DifferentialEquation,
        cdr: EarlyModernistPoem,
    }
}
}

All constructors and fields of a public enum are automatically public.

Enums in Memory

In memory, enums with data are stored as a small integer tag, plus enough memory to hold all of the fields of the largest variant. The tag tells Rust which constructor created the value, and therefore which fields it has.
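The layout can be probed with std::mem::size_of. This is only a sketch: the exact size of a default-layout enum is not specified, so the code checks a lower bound instead of a specific number. The RoughTime here is a simplified stand-in that uses u8 in place of TimeUnit.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum RoughTime {
    InThePast(u8, u32),
    JustNow,
    InTheFuture(u8, u32),
}

// Returns (size of the enum, combined size of the largest variant's fields).
fn sizes() -> (usize, usize) {
    (size_of::<RoughTime>(), size_of::<u8>() + size_of::<u32>())
}

fn main() {
    let (whole, fields) = sizes();
    // The enum needs room for the largest variant's fields, plus tag and padding.
    println!("RoughTime: {} bytes, largest variant's fields: {} bytes", whole, fields);
    assert!(whole >= fields);

    // The tag is not always a separate integer: Option<Box<T>> uses the
    // never-null pointer itself, so it is no bigger than Box<T>.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());
}
```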

Generic Enums

Enums can be generic, and generic data structures can be built with a few lines of code:


#![allow(unused)]
fn main() {
// An ordered collection of T's
#[derive(Debug)]
enum BinaryTree<T> {
    Empty,
    NonEmpty(Box<TreeNode<T>>),
}

// A node within the binary tree
#[derive(Debug)]
struct TreeNode<T> {
    element: T,
    left: BinaryTree<T>,
    right: BinaryTree<T>,
}

let tree = BinaryTree::NonEmpty(Box::new(TreeNode {
    element: "I'm a single-node tree",
    left: BinaryTree::Empty,
    right: BinaryTree::Empty,
}));

println!("{:?}", tree);
}

Patterns

match performs pattern matching. Think of it this way:

  • Expressions produce values
  • Patterns consume values

When a pattern contains identifiers, those become local variables in the code following the pattern.

Literals, Variables, and Wildcards in Patterns

Term: If you need a catch-all pattern, but don't care about the matched value, you can use a single underscore _ as a pattern, called the wildcard pattern.

Rust requires that every possible value is handled in a match block. So even if you're certain that the remaining cases can't occur, you must at least add a fallback arm that panics.
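A minimal sketch of such a fallback arm; the low_bit helper is hypothetical.

```rust
// Hypothetical helper: interprets the lowest bit of a byte as a bool.
fn low_bit(byte: u8) -> bool {
    match byte & 1 {
        0 => false,
        1 => true,
        // The compiler can't see that `byte & 1` is only ever 0 or 1,
        // so a catch-all arm is still required; this one panics if reached.
        _ => unreachable!("byte & 1 is always 0 or 1"),
    }
}

fn main() {
    assert!(low_bit(3));
    assert!(!low_bit(4));
}
```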

Warning: Existing variables can't be used in patterns. This is because identifiers in patterns may only introduce new variables.


#![allow(unused)]
fn main() {
// Bug: Some(current_hex) introduces a NEW variable that shadows the parameter,
// so that arm matches every Some value and the last arm is never reached
fn check_move(current_hex: Hex, click: Point) -> game::Result<Hex> {
    match point_to_hex(click) {
        None => Err("That's not a game space."),
        Some(current_hex) => Err("You're already there! Click somewhere else."),
        Some(other_hex) => Ok(other_hex),
    }
}
}
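One way to get the intended behavior is to bind a fresh variable and compare it against the existing one in a match guard (an if after the pattern). The sketch below is self-contained, so Hex, point_to_hex, and the plain Result type are stand-ins for the game's real definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Hex(i32, i32);

// Stand-in for real hit-testing logic (an assumption for this sketch).
fn point_to_hex(click: (i32, i32)) -> Option<Hex> {
    if click.0 < 0 || click.1 < 0 {
        None
    } else {
        Some(Hex(click.0 / 10, click.1 / 10))
    }
}

fn check_move(current_hex: Hex, click: (i32, i32)) -> Result<Hex, &'static str> {
    match point_to_hex(click) {
        None => Err("That's not a game space."),
        // `hex` is a new variable; the guard compares it with the existing one.
        Some(hex) if hex == current_hex => Err("You're already there! Click somewhere else."),
        Some(hex) => Ok(hex),
    }
}

fn main() {
    assert_eq!(check_move(Hex(0, 0), (15, 12)), Ok(Hex(1, 1)));
    assert!(check_move(Hex(1, 1), (15, 12)).is_err());
}
```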

Tuple and Struct Patterns

Tuple patterns match tuples.


#![allow(unused)]
fn main() {
// Describe the location of a point on a Cartesian plane
fn describe_point(x: i32, y: i32) -> &'static str {
    use std::cmp::Ordering::*;
    match (x.cmp(&0), y.cmp(&0)) {
        (Equal, Equal) => "at the origin",
        (_, Equal) => "on the x axis",
        (Equal, _) => "on the y axis",
        (Greater, Greater) => "in the first quadrant",
        (Less, Greater) => "in the second quadrant",
        _ => "somewhere else",
    }
}
}

Struct patterns match structs.


#![allow(unused)]
fn main() {
match balloon.location {
    Point { x: 0, y: height } => println!("straight up {} meters", height),
    Point { x, y } => println!("at ({}m, {}m)", x, y),
}
}

Reference Patterns

For very large struct types, it'd be too cumbersome to write out every single field in the pattern. Fortunately, you can use .. to skip the fields you don't care about:


#![allow(unused)]
fn main() {
match account {
    Account { name, language, .. } => {
        ui.greet(&name, &language);
        ui.show_settings(&account); // ERROR! use of moved value 'account'
    }
}
}

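Keyword: The above code fails because the pattern moves the name and language fields out of account into local variables, leaving account partially moved, so it can't be borrowed as a whole afterward. What we need is a pattern that borrows matched values instead of moving them. For that, we have ref (or ref mut when a mutable borrow is needed):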


#![allow(unused)]
fn main() {
match account {
    Account { ref name, ref language, .. } => {
        ui.greet(name, language); // name and language are already references
        ui.show_settings(&account); // OK!
    }
}
}

Concept: The opposite of a ref pattern is a pattern that starts with &. If a pattern starts with &, that means that it matches a reference:


#![allow(unused)]
fn main() {
match sphere.center() {
    &Point3d { x, y, z } => { ... }
}
}

You should remember that patterns and expressions are natural opposites:

  • The expression (x, y) makes two values into a new tuple
  • The pattern (x, y) matches a tuple and breaks out the two values

The same principle applies to references:

  • In an expression, & creates a reference
  • In a pattern, & matches a reference
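Both kinds of pattern can be tried out in a small self-contained sketch. The Account struct and the two helper functions are made up for this example.

```rust
#[derive(Debug)]
struct Account {
    name: String,
    language: String,
    id: u32,
}

// `ref` borrows the fields, so `account` is still whole afterwards.
fn greet_and_return(account: Account) -> (String, Account) {
    let greeting = match account {
        Account { ref name, ref language, .. } => format!("Hello, {} [{}]", name, language),
    };
    // `account` was only borrowed above, so it can still be moved here.
    (greeting, account)
}

// An `&` pattern matches a reference and binds what it points to.
fn sum_coords(p: &(i32, i32)) -> i32 {
    match p {
        &(x, y) => x + y, // x and y are plain i32 values (copied out)
    }
}

fn main() {
    let account = Account { name: "Ada".to_string(), language: "en".to_string(), id: 7 };
    let (greeting, account) = greet_and_return(account);
    println!("{} / {:?}", greeting, account);
    assert_eq!(account.id, 7);
    assert_eq!(sum_coords(&(3, 4)), 7);
}
```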

Matching Multiple Possibilities

Operator: The vertical bar | can be used to combine several patterns in a single match arm:


#![allow(unused)]
fn main() {
let at_end = match chars.peek() {
    Some(&'\r') | Some(&'\n') | None => true,
    _ => false,
};
}

You can also use ..= to match a whole inclusive range of values (older code spells this ..., which is deprecated and rejected in the 2021 edition):


#![allow(unused)]
fn main() {
match next_char {
    '0'..='9' => self.read_number(),
    'a'..='z' | 'A'..='Z' => self.read_word(),
    ' ' | '\t' | '\n' | '\r' => self.skip_whitespace(),
    _ => self.handle_punctuation(),
}
}

Pattern Guards

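Use the if keyword to add a guard to a match arm. Older versions of Rust rejected guards on patterns that move values, which is why the first arm below binds with ref; current Rust accepts guards on moving patterns too.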


#![allow(unused)]
fn main() {
match robot.last_known_location() {
    Some(ref point) if self.distance_to(point) < 10 => short_distance_strategy(point),
    Some(point) => long_distance_strategy(point),
    None => searching_strategy(),
}
}

@ Patterns

The x @ pattern matches exactly like the given pattern, but on success, instead of creating variables for parts of the matched value, it creates a single variable x and moves or copies the whole value into it.


#![allow(unused)]
fn main() {
match self.get_selection() {
    rect @ Shape::Rect(..) => optimized_paint(&rect),
    other_shape => paint_outline(other_shape.get_outline()),
}
}

The @ pattern is also useful for ranges:


#![allow(unused)]
fn main() {
match chars.next() {
    Some(digit @ '0'..='9') => read_number(digit, chars),
    ...
}
}
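A runnable sketch of an @ pattern combined with a range, using the current ..= range syntax. The classify function is hypothetical.

```rust
// Hypothetical classifier: names the matched value while testing it against a range.
fn classify(n: u32) -> String {
    match n {
        0 => "zero".to_string(),
        // `small` is bound to the whole matched value, whichever it was in 1..=9.
        small @ 1..=9 => format!("single digit {}", small),
        big => format!("large {}", big),
    }
}

fn main() {
    assert_eq!(classify(0), "zero");
    assert_eq!(classify(7), "single digit 7");
    assert_eq!(classify(42), "large 42");
}
```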

Where Patterns are Allowed

Come back to this.

Populating a Binary Tree

Finally, we'll go back to the BinaryTree enum written earlier and write an add method for it that allows us to easily build a binary tree.


#![allow(unused)]
fn main() {
// An ordered collection of T's
#[derive(Debug)]
enum BinaryTree<T> {
    Empty,
    NonEmpty(Box<TreeNode<T>>),
}

// A node within the binary tree
#[derive(Debug)]
struct TreeNode<T> {
    element: T,
    left: BinaryTree<T>,
    right: BinaryTree<T>,
}
impl<T: Ord> BinaryTree<T> {
    fn add(&mut self, value: T) {
        // *self inside a match represents the existing tree
        match *self {
            BinaryTree::Empty =>
                *self = BinaryTree::NonEmpty(Box::new(TreeNode {
                    element: value,
                    left: BinaryTree::Empty,
                    right: BinaryTree::Empty,
                })),
            BinaryTree::NonEmpty(ref mut node) =>
                if value <= node.element {
                    node.left.add(value);
                } else {
                    node.right.add(value);
                }

        }
    }
}

let mut tree = BinaryTree::Empty;
for num in 0 .. 10 {
    tree.add(num);
}

println!("{:?}", tree);
}

Traits and Generics

Intro to Traits

Rust's implementation of polymorphism comes from two mechanisms:

  1. Traits
  2. Generics

Traits are Rust's take on the interfaces or abstract base classes found in OOP-world.

Here's a condensed version of the std::io::Write trait:


#![allow(unused)]
fn main() {
// std::io::Write
trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;
    fn write_all(&mut self, buf: &[u8]) -> Result<()>;
    // There's lots more
}
}

Suppose we want to write a function whose parameter is a value of any type that can write to a stream. It'd look something like this:


#![allow(unused)]
fn main() {
use std::io::Write;

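fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello!\n")?;
    out.flush() // no semicolon: the Result from flush is the return value
}
}

Pronunciation: The out parameter of the above function is of type &mut dyn Write, meaning "a mutable reference to any value that implements the Write trait". (Older code writes this type as &mut Write; current Rust requires the dyn keyword.)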

Intro to Generics

A generic function or type can be used with values of many different types.


#![allow(unused)]
fn main() {
// Given two values, pick whichever one is less
fn min<T: Ord>(value1: T, value2: T) -> T {
    if value1 <= value2 {
        value1
    } else {
        value2
    }
}
println!("Minimum of two integers: {}", min(1, 2));
println!("Minimum of two strings: {}", min("a", "b"));
}

Pronunciation: The type parameter of the above min function is written <T: Ord>, meaning "this function can be used with arguments of any type T that implements the Ord trait". Or, more simply, "any ordered type".

Term: The T: Ord requirement of the above min function is called a bound.

Using Traits

A trait is a feature that any given type may or may not support. Think of a trait as a type capability.

Rule: For trait methods to be accessible, the trait itself must be in scope! Otherwise, all of its methods are hidden.


#![allow(unused)]
fn main() {
let mut buf: Vec<u8> = vec![];
buf.write_all(b"hello!")?; // ERR: no method named write_all
}

Adding use std::io::Write; to the top of the above file will bring the Write trait into scope and fix the issue.

Trait Objects

There are two ways to use traits:

  1. Trait objects
  2. Generics

Rust doesn't allow variables of type dyn Write (the trait itself, as opposed to a reference to it) because a variable's size must be known at compile-time, and types that implement Write can be of any size.


#![allow(unused)]
fn main() {
use std::io::Write;

let mut buf: Vec<u8> = vec![];
let writer: dyn Write = buf; // ERR: `dyn Write` does not have a constant size
}

However, what we can do is create a value that's a reference to a trait.


#![allow(unused)]
fn main() {
use std::io::Write;

let mut buf: Vec<u8> = vec![];
let writer: &mut dyn Write = &mut buf; // OK!
}

Term: A reference to a trait type, like writer in the above code, is called a trait object.

Trait Object Layout

In memory, a trait object is a fat pointer (two words on the stack) consisting of a pointer to the value, plus a pointer to a table representing that value's type. That table, as is the case with C++, is called a virtual table (vtable).

Implicit Behavior: Rust automatically converts ordinary references into trait objects when needed. This was the case with the writer variable in the above code.
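A small sketch of that conversion at a call site, along with a check that a trait object reference is two words wide. It uses the current dyn spelling of the trait object type.

```rust
use std::io::Write;
use std::mem::size_of;

fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello!\n")?;
    out.flush()
}

fn main() -> std::io::Result<()> {
    let mut buf: Vec<u8> = vec![];
    // `&mut buf` is a `&mut Vec<u8>`; Rust converts it to `&mut dyn Write` here.
    say_hello(&mut buf)?;
    assert_eq!(buf, b"hello!\n".to_vec());

    // The same function accepts a completely different writer type.
    say_hello(&mut std::io::stdout())?;

    // A trait object reference is a fat pointer: data pointer + vtable pointer.
    assert_eq!(size_of::<&mut dyn Write>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&mut Vec<u8>>(), size_of::<usize>());
    Ok(())
}
```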

Generic Functions

Earlier we created a function that accepted any parameter that implemented the Write trait (aka, a trait object):


#![allow(unused)]
fn main() {
use std::io::Write;
fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
    out.write_all(b"hello!\n")?;
    out.flush() // no semicolon: the Result from flush is the return value
}
}

We can make that function generic by tweaking the type signature:


#![allow(unused)]
fn main() {
use std::io::Write;
fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> {
    out.write_all(b"hello!\n")?;
    out.flush() // no semicolon: the Result from flush is the return value
}
}

Term: In the above say_hello function, the phrase <W: Write> is what makes the function generic. W is called a type parameter. And : Write, as mentioned earlier, is the bound.

Convention: Type parameters are usually single uppercase letters.

If the generic function you're calling doesn't have any arguments that provide useful clues about the type parameter's type, you might have to spell it out using the turbofish ::<>.

Operator: If your type parameter needs to support several traits, you can chain the needed traits together using the + operator.


#![allow(unused)]
fn main() {
fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>) { ... }
}

Generic functions can have multiple type parameters:


#![allow(unused)]
fn main() {
fn run_query<M: Mapper + Serialize, R: Reducer + Serialize>(
    data: &DataSet, map: M, reduce: R,
) -> Results {
    ...
}
}

Keyword: The type parameter bounds in the above run_query function are long enough to hurt readability. The where keyword allows us to move the bounds outside of the <>:


#![allow(unused)]
fn main() {
fn run_query<M, R>(data: &DataSet, map: M, reduce: R) -> Results
    where M: Mapper + Serialize,
          R: Reducer + Serialize {
              ...
}
}

Shorthand: The where clause can be used anywhere bounds are permitted: generic structs, enums, type aliases, methods, etc.

A generic function can have both lifetime parameters and type parameters. Lifetime parameters come first:


#![allow(unused)]
fn main() {
// Return a ref to the point in `candidates` that's closest to `target`
fn nearest<'t, 'c, P>(target: &'t P, candidates: &'c [P]) -> &'c P
    where P: MeasureDistance {
    ...
}
}

Which to Use

Tip: Trait objects are the right choice whenever you need a collection of values of mixed types, all together. (think salad)

Generics have two major advantages over trait objects:

  1. Speed. When the compiler generates machine code for a generic function, it knows which types it's working with, so it knows at that time which write method to call. No need for dynamic dispatch. Whereas with trait objects, Rust never knows what type of value a trait object points to until runtime.
  2. Not every trait can support trait objects.
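The salad comparison can be sketched as follows. All the names here (Vegetable, Lettuce, Tomato, Salad, SingleVeggieSalad) are illustrative assumptions.

```rust
trait Vegetable {
    fn name(&self) -> String;
}

struct Lettuce;
struct Tomato;

impl Vegetable for Lettuce {
    fn name(&self) -> String { "lettuce".to_string() }
}
impl Vegetable for Tomato {
    fn name(&self) -> String { "tomato".to_string() }
}

// Trait objects: one salad can mix different vegetable types.
struct Salad {
    veggies: Vec<Box<dyn Vegetable>>,
}

// Generics: every element must be the same concrete type V,
// but calls to `name` are resolved at compile time.
struct SingleVeggieSalad<V: Vegetable> {
    veggies: Vec<V>,
}

fn ingredients(salad: &Salad) -> Vec<String> {
    // Each call to `name` goes through the vtable (dynamic dispatch).
    salad.veggies.iter().map(|v| v.name()).collect()
}

fn main() {
    let mixed = Salad { veggies: vec![Box::new(Lettuce), Box::new(Tomato)] };
    assert_eq!(ingredients(&mixed), vec!["lettuce", "tomato"]);

    let plain = SingleVeggieSalad { veggies: vec![Lettuce, Lettuce] };
    assert_eq!(plain.veggies[0].name(), "lettuce");
}
```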

Defining and Implementing Traits

Defining a trait is just a matter of giving it a name and a list of type signatures of the trait's methods.


#![allow(unused)]
fn main() {
/// A trait for entities in a videogame's world that are displayed on a screen
trait Visible {
    /// Render the object on the given canvas
    fn draw(&self, canvas: &mut Canvas);

    /// Return true if clicking at (x, y) should select this object
    fn hit_test(&self, x: i32, y: i32) -> bool;
}
}

Syntax: The syntax for implementing a trait is the following: impl TraitName for Type

Implementing the Visible trait for the Broom type might look like this:


#![allow(unused)]
fn main() {
impl Visible for Broom {
    fn draw(&self, canvas: &mut Canvas) {
        for y in self.y - self.height - 1 .. self.y {
            canvas.write_at(self.x, y, '|');
        }
        canvas.write_at(self.x, self.y, 'M');
    }

    fn hit_test(&self, x: i32, y: i32) -> bool {
        self.x == x
        && self.y - self.height - 1 <= y
        && y <= self.y
    }
}
}

Default Methods

Term: Methods listed within traits can have default implementations. In such cases, it's not required that a type implementing the trait explicitly define the method.
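A minimal sketch of a default method, with one implementor that relies on the default and one that overrides it. The Greeter trait and both types are hypothetical.

```rust
trait Greeter {
    // Required: every implementor must define this.
    fn name(&self) -> String;

    // Default: implementors get this for free, but may override it.
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct Alice;
struct Pirate;

impl Greeter for Alice {
    fn name(&self) -> String { "Alice".to_string() }
    // `greet` is not defined here, so the default is used.
}

impl Greeter for Pirate {
    fn name(&self) -> String { "Blackbeard".to_string() }
    fn greet(&self) -> String {
        format!("Arr, {} here!", self.name())
    }
}

fn main() {
    assert_eq!(Alice.greet(), "Hello, Alice!");
    assert_eq!(Pirate.greet(), "Arr, Blackbeard here!");
}
```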

Traits and Other People's Types

Rule: Rust lets you implement any trait on any type, as long as either the trait or the type is introduced in the current crate. This is called the coherence rule. It helps Rust ensure that trait implementations are unique.

Term: A trait whose sole purpose is to add a method to an existing type is called an extension trait.

Generic impl blocks can be used to add an extension trait to a whole family of types at once.


#![allow(unused)]
fn main() {
// Add the `write_html` method to all types that implement `Write`
use std::io::{self, Write};

// Trait for values to which you can send HTML
trait WriteHtml {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()>;
}

// Add the HTML write capability to any std::io writer
impl<W: Write> WriteHtml for W {
    fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> {
        ...
    }
}
}
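The same blanket-impl technique in a form that runs as written. WriteShout and write_shout are made-up names standing in for WriteHtml, since HtmlDocument isn't defined in these notes.

```rust
use std::io::{self, Write};

// Hypothetical extension trait: adds one method to every writer.
trait WriteShout {
    fn write_shout(&mut self, text: &str) -> io::Result<()>;
}

// Blanket impl: any type W that implements Write also gets write_shout.
impl<W: Write> WriteShout for W {
    fn write_shout(&mut self, text: &str) -> io::Result<()> {
        self.write_all(text.to_uppercase().as_bytes())?;
        self.write_all(b"!\n")
    }
}

fn main() -> io::Result<()> {
    // Vec<u8> implements Write, so it picks up the new method.
    let mut buf: Vec<u8> = Vec::new();
    buf.write_shout("hello")?;
    assert_eq!(buf, b"HELLO!\n".to_vec());

    // So does Stdout.
    io::stdout().write_shout("done")?;
    Ok(())
}
```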

Self in Traits

Traits can use the keyword Self as a type.


#![allow(unused)]
fn main() {
pub trait Clone {
    fn clone(&self) -> Self;
}
}

A trait that uses the Self type is incompatible with trait objects.


#![allow(unused)]
fn main() {
// ERR: the trait `Spliceable` cannot be made into an object
fn splice_anything(left: &dyn Spliceable, right: &dyn Spliceable) {
    let combo = left.splice(right);
    ...
}
}

Subtraits

We can declare that a trait is an extension of another trait.


#![allow(unused)]
fn main() {
// A living item in our videogame world
trait Creature: Visible {
    fn position(&self) -> (i32, i32);
    fn facing(&self) -> Direction;
    ...
}
}

Static Methods

Traits can include static methods and constructors.


#![allow(unused)]
fn main() {
trait StringSet {
    // constructor
    fn new() -> Self;

    // static method
    fn from_slice(strings: &[&str]) -> Self;
}
}

Trait objects don't support static methods.

Fully Qualified Method Calls

Term: A qualified method call is one that specifies the type or trait that a method is associated with. A fully qualified method call is one that specifies both type and trait.

Method Call | Qualification
"hello".to_string() | unqualified
str::to_string("hello") | qualified (by type)
ToString::to_string("hello") | qualified (by trait)
<str as ToString>::to_string("hello") | fully qualified

When You Need Them

Generally, you'll use value.method() to call a method, but occasionally you'll need a qualified method call:

  1. When two methods have the same name:

#![allow(unused)]
fn main() {
// Outlaw is a type that implements Visible and HasPistol, both of which have a `draw` method
let mut outlaw = Outlaw::new();

outlaw.draw(); // ERR: draw on the screen or draw pistol?

Visible::draw(&outlaw); // OK!
HasPistol::draw(&outlaw); // OK!
}
  2. When the type of the self argument can't be inferred:

#![allow(unused)]
fn main() {
let zero = 0; // all we know so far is this could be i8, u8, i32, etc

zero.abs(); // ERR: which `.abs()` should be called?
i64::abs(zero); // OK!
}
  3. When using the function itself as the function value:

#![allow(unused)]
fn main() {
let words: Vec<String> =
    line.split_whitespace()
        .map(<str as ToString>::to_string) // OK!
        .collect();
}
  4. When calling trait methods in macros.
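The first three cases in the list above can be exercised in one self-contained sketch. The two traits and the Outlaw type are minimal stand-ins whose draw methods return strings so the results can be checked.

```rust
trait Visible {
    fn draw(&self) -> String;
}
trait HasPistol {
    fn draw(&self) -> String;
}

struct Outlaw;

impl Visible for Outlaw {
    fn draw(&self) -> String { "drawn on screen".to_string() }
}
impl HasPistol for Outlaw {
    fn draw(&self) -> String { "pistol drawn".to_string() }
}

// Using a method as a function value requires naming it in full.
fn words(line: &str) -> Vec<String> {
    line.split_whitespace()
        .map(<str as ToString>::to_string)
        .collect()
}

fn main() {
    let outlaw = Outlaw;
    // `outlaw.draw()` would be ambiguous here; name the trait instead.
    assert_eq!(Visible::draw(&outlaw), "drawn on screen");
    assert_eq!(<Outlaw as HasPistol>::draw(&outlaw), "pistol drawn");

    // Naming the type tells Rust which integer type `zero` is.
    let zero = 0;
    assert_eq!(i64::abs(zero), 0);

    assert_eq!(words("a bc"), vec!["a", "bc"]);
}
```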

Traits That Define Relationships Between Types

Traits can be used in situations where there are multiple types that have to work together. They can describe relationships between types.

Associated Types (or How Iterators Work)

Rust's standard iterator trait looks a little like this:


#![allow(unused)]
fn main() {
trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}
}

Term: In the Iterator trait, Item is called an associated type. Each type that implements Iterator must specify what type of item it produces.

The implementation of Iterator for std::env::Args looks a bit like this:


#![allow(unused)]
fn main() {
impl Iterator for Args {
    // the associated `Item` type for `Args` is a `String`
    type Item = String;
    fn next(&mut self) -> Option<String> {
        ...
    }
}
}

Bounds can be placed on a trait's associated type.


#![allow(unused)]
fn main() {
fn dump<I>(iter: I) where I: Iterator, I::Item: Debug {
    ...
}
}

Or, we can place bounds on an associated type as if it were a generic type parameter of the trait:


#![allow(unused)]
fn main() {
fn dump<I>(iter: I) where I: Iterator<Item=String> {
    ...
}
}
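A runnable sketch that implements Iterator for a small custom type and then uses a bound on the associated type. Countdown is a hypothetical type, and this dump returns strings instead of printing so the result can be checked.

```rust
use std::fmt::Debug;

// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // Each implementation picks exactly one Item type.
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(self.remaining + 1)
        }
    }
}

// Works for any iterator whose items can be formatted with {:?}.
fn dump<I>(iter: I) -> Vec<String>
    where I: Iterator, I::Item: Debug
{
    iter.map(|item| format!("{:?}", item)).collect()
}

fn main() {
    assert_eq!(dump(Countdown { remaining: 3 }), vec!["3", "2", "1"]);
    assert_eq!(dump(vec!["a", "b"].into_iter()), vec!["\"a\"", "\"b\""]);
}
```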

Use Case: Associated types are perfect for cases where each implementation has one specific related type.

Generic Traits (or How Operator Overloading Works)

The trait signature for Rust's multiplication method looks a bit like this:


#![allow(unused)]
fn main() {
pub trait Mul<RHS=Self> {
    ...
}
}

The syntax RHS=Self means that the type parameter RHS defaults to Self.

Buddy Traits (or How rand::random() Works)

Term: Traits that are designed to work together are called buddy traits.

A good example of buddy trait use is in the rand crate, particularly the random() function, which returns a random value:


#![allow(unused)]
fn main() {
let x = rand::random();
}

Rust wouldn't be able to infer the type of x so we'd need to specify it with turbofish:


#![allow(unused)]
fn main() {
let x = rand::random::<f64>(); // float between 0.0 and 1.0
let b = rand::random::<bool>(); // true or false
}

But rand has many different kinds of random number generators (RNGs). They all implement the same trait, Rng:


#![allow(unused)]
fn main() {
// An Rng is just a value that can spit out integers on demand,
pub trait Rng {
    fn next_u32(&mut self) -> u32;
    ...
}
}

There are lots of implementations of Rng: XorShiftRng, OsRng, etc.

The Rng trait has a buddy trait called Rand:


#![allow(unused)]
fn main() {
// A type that can be randomly generated using an `Rng`
pub trait Rand: Sized {
    fn rand<R: Rng>(rng: &mut R) -> Self;
}
}

Rand is implemented by the types that are produced by Rng: u64, bool, etc.

Ultimately, rand::random() is just a thin wrapper that passes a globally allocated Rng to Rand::rand():


#![allow(unused)]
fn main() {
pub fn random<T: Rand>() -> T {
    T::rand(&mut global_rng())
}
}

Concept: When you see traits that use other traits as bounds, the way Rand::rand() uses Rng, you know those two traits are mix-and-match (buddy traits). Any Rng can generate values of every Rand type.

Operator Overloading

Go here for references.

Arithmetic and Bitwise Operators

Here's the definition of std::ops::Add:


#![allow(unused)]
fn main() {
trait Add<RHS=Self> {
    type Output;
    fn add(self, rhs: RHS) -> Self::Output;
}
}

In other words, the trait Add<T> is the ability to add a T value to yourself.

We could implement Add generically for the Complex number type like this:


#![allow(unused)]
fn main() {
use std::ops::Add;

impl<T> Add for Complex<T> where T: Add<Output=T> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Complex { re: self.re + rhs.re, im: self.im + rhs.im }
    }
}
}
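For a version that compiles on its own, the sketch below defines a Complex<T> struct (its field layout is assumed from the impl above) and adds a second impl whose right-hand side is not Self, to show what the RHS type parameter is for.

```rust
use std::ops::Add;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Complex<T> {
    re: T,
    im: T,
}

// Complex<T> + Complex<T>: RHS is left at its default, Self.
impl<T> Add for Complex<T> where T: Add<Output = T> {
    type Output = Self;
    fn add(self, rhs: Self) -> Self {
        Complex { re: self.re + rhs.re, im: self.im + rhs.im }
    }
}

// Complex<f64> + f64: an explicit RHS type (an addition made for this sketch).
impl Add<f64> for Complex<f64> {
    type Output = Complex<f64>;
    fn add(self, rhs: f64) -> Complex<f64> {
        Complex { re: self.re + rhs, im: self.im }
    }
}

fn main() {
    let a = Complex { re: 1, im: 2 };
    let b = Complex { re: 3, im: 4 };
    assert_eq!(a + b, Complex { re: 4, im: 6 });

    let c = Complex { re: 1.0, im: 2.0 };
    assert_eq!(c + 0.5, Complex { re: 1.5, im: 2.0 });
}
```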

Unary Operators

The two overloadable unary operators (! and -) are defined like this:


#![allow(unused)]
fn main() {
trait Not {
    type Output;
    fn not(self) -> Self::Output;
}

trait Neg {
    type Output;
    fn neg(self) -> Self::Output;
}
}

An implementation of Neg for Complex values might look like this:


#![allow(unused)]
fn main() {
impl<T, O> Neg for Complex<T> where T: Neg<Output=O> {
    type Output = Complex<O>;
    fn neg(self) -> Complex<O> {
        Complex { re: -self.re, im: -self.im }
    }
}
}

Binary Operators

The definition of std::ops::BitXor looks like this:


#![allow(unused)]
fn main() {
trait BitXor<RHS=Self> {
    type Output;
    fn bitxor(self, rhs: RHS) -> Self::Output;
}
}

Compound Assignment Operators

Warning: Unlike in many other languages, the value of a compound assignment expression is always (). e.g. x += y returns ().

The definition of std::ops::AddAssign looks like this:


#![allow(unused)]
fn main() {
trait AddAssign<RHS=Self> {
    fn add_assign(&mut self, rhs: RHS);
}
}

An implementation of AddAssign for Complex values might look like this:


#![allow(unused)]
fn main() {
impl<T> AddAssign for Complex<T> where T: AddAssign<T> {
    fn add_assign(&mut self, rhs: Complex<T>) {
        self.re += rhs.re;
        self.im += rhs.im;
    }
}
}

Warning: Implementing an arithmetic trait like Add does not automatically implement the corresponding compound assignment trait (AddAssign); you have to implement each one yourself.

Equality Tests

Since the ne method of the PartialEq trait already has a default implementation, you'll only ever need to implement the eq method.

PartialEq takes its values by reference.

Here's the definition of std::cmp::PartialEq:


#![allow(unused)]
fn main() {
trait PartialEq<Rhs: ?Sized = Self> {
    fn eq(&self, other: &Rhs) -> bool;
    // `ne` has a default implementation
    fn ne(&self, other: &Rhs) -> bool { !self.eq(other) }
}
}

Syntax: The Rhs: ?Sized bound relaxes Rust's usual requirement that type parameters must be sized types, which lets us write traits like PartialEq<str> or PartialEq<[T]>.

Tip: In most cases, Rust can automatically implement PartialEq for your type if you add #[derive(PartialEq)].

Ordered Comparisons

Ordered comparison operators all stem from the std::cmp::PartialOrd trait, which is defined as:


#![allow(unused)]
fn main() {
trait PartialOrd<Rhs = Self>: PartialEq<Rhs> where Rhs: ?Sized {
    fn partial_cmp(&self, other: &Rhs) -> Option<Ordering>;

    fn lt(&self, other: &Rhs) -> bool { ... }
    fn le(&self, other: &Rhs) -> bool { ... }
    fn gt(&self, other: &Rhs) -> bool { ... }
    fn ge(&self, other: &Rhs) -> bool { ... }
}
}

Note that PartialOrd is a subtrait of PartialEq, meaning you can perform ordered comparisons only on types that can also be checked for equality.

Also note that partial_cmp is the only method of the PartialOrd trait that doesn't have a default implementation. This means when you want to implement PartialOrd, you only need to define partial_cmp.
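As a sketch of a genuinely partial ordering, the hypothetical Interval type below defines only partial_cmp and gets <, <=, >, and >= from the default methods. Overlapping intervals are treated as unordered.

```rust
use std::cmp::Ordering;

// Hypothetical half-open interval: lower <= x < upper.
#[derive(Debug, PartialEq)]
struct Interval<T> {
    lower: T,
    upper: T,
}

impl<T: PartialOrd> PartialOrd for Interval<T> {
    fn partial_cmp(&self, other: &Interval<T>) -> Option<Ordering> {
        if self == other {
            Some(Ordering::Equal)
        } else if self.lower >= other.upper {
            Some(Ordering::Greater)
        } else if self.upper <= other.lower {
            Some(Ordering::Less)
        } else {
            None // overlapping intervals are not ordered
        }
    }
}

fn main() {
    assert!(Interval { lower: 10, upper: 20 } < Interval { lower: 20, upper: 40 });
    assert!(Interval { lower: 7, upper: 8 } >= Interval { lower: 0, upper: 1 });

    // Overlapping: neither less than nor greater-or-equal.
    let left = Interval { lower: 10, upper: 30 };
    let right = Interval { lower: 20, upper: 40 };
    assert!(!(left < right));
    assert!(!(left >= right));
}
```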

Index and IndexMut

Here are the definitions of the traits associated with the index operator:


#![allow(unused)]
fn main() {
trait Index<Idx> {
    type Output: ?Sized;
    fn index(&self, index: Idx) -> &Self::Output;
}

trait IndexMut<Idx>: Index<Idx> {
    fn index_mut(&mut self, index: Idx) -> &mut Self::Output;
}
}

The associated type Output specifies what an index expression returns.

Use Case: The most common reason to overload the index operators is to support indexing into collection types.

In the Mandelbrot program, we accessed pixels with lines like this:


#![allow(unused)]
fn main() {
// current implementation treats pixels as a single row
pixels[row * bounds.0 + column] = ...; // UGLY
// what we want is to be able to access pixels as if it were a 2D array
image[row][column] = ...; // BETTER!
}

To achieve improved indexing in the above code, we could write something like this:


#![allow(unused)]
fn main() {
// declare a struct that holds the pixels and the image dimensions
#[derive(Debug)]
struct Image<P> {
    width: usize,
    pixels: Vec<P>,
}

// add a static constructor to the Image type
// the type parameter P is the pixel type
impl<P> Image<P> where P: Default + Copy {
    fn new(width: usize, height: usize) -> Image<P> {
        Image {
            width,
            pixels: vec![P::default(); width * height],
        }
    }

    fn height(&self) -> usize {
        self.pixels.len() / self.width
    }
}

// now we implement Index and IndexMut
// when we index into an Image<P>, we expect to get back a slice of P
// indexing the slice will give an individual pixel
impl<P> std::ops::Index<usize> for Image<P> {
    type Output = [P];
    fn index(&self, row: usize) -> &Self::Output {
        let start = row * self.width;
        &self.pixels[start .. start + self.width]
    }
}

impl<P> std::ops::IndexMut<usize> for Image<P> {
    fn index_mut(&mut self, row: usize) -> &mut [P] {
        let start = row * self.width;
        &mut self.pixels[start .. start + self.width]
    }
}

// Create an image 3 pixels wide and 3 pixels tall
let mut image = Image::<u32>::new(3, 3);
println!("image height {}", image.height());

// Draw a diagonal line through the image
for i in 0 .. image.width {
    image[i][i] = 255;
}
println!("{:?}", image);
}

Other Operators

The dereferencing operator (*val) and the dot operator for accessing fields and calling methods (val.field and val.method()), can be overloaded using the Deref and DerefMut traits.

Utility Traits

Drop

You can customize how Rust drops values of your type by implementing the std::ops::Drop trait:


#![allow(unused)]
fn main() {
trait Drop {
    fn drop(&mut self);
}
}

Implicit Behavior: The drop method of the Drop trait is called implicitly by Rust; if you try to call it yourself, it'll be flagged as an error.

You'll never need to implement Drop unless you're defining a type that owns resources Rust doesn't already know about.

Warning: If a type implements Drop, it cannot implement Copy.
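A small sketch that records the order in which values are dropped. The Noisy type and the shared log are assumptions for illustration; the log uses Rc<RefCell<...>> only so the result can be inspected afterwards.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its own destruction in a shared log.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Noisy { name: "a", log: Rc::clone(&log) };
        let b = Noisy { name: "b", log: Rc::clone(&log) };
        // `b.drop()` would not compile. To drop a value early, move it
        // into the free function std::mem::drop (in the prelude as `drop`).
        drop(b);
    } // `_a` goes out of scope here and Rust calls its drop method.
    let entries = log.borrow().clone();
    entries
}

fn main() {
    assert_eq!(run(), vec!["dropped b", "dropped a"]);
}
```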

Sized

Term: A type whose values all have the same size in memory is called a sized type. In Rust, almost all types are sized types.

All sized types implement the std::marker::Sized trait, which has no methods or associated types. Rust implements it automatically for all types to which it applies; you can't implement it yourself.

Use Case: The only use for the Sized trait is as a bound for type variables: a bound like T: Sized requires T to be a type whose size is known at compile time.

Term: A trait like Sized, which has no methods and exists only to mark types as having some property of interest to the language, is called a marker trait.

Implicit Behavior: Since unsized types are so limited, Rust implicitly assumes that generic type parameters have a Sized bound. This means that when you write struct S<T>, Rust assumes you mean struct S<T: Sized>.

Syntax: Since Rust assumes all type parameters have a Sized bound, you have to explicitly opt-out of it using the ?Sized syntax: struct S<T: ?Sized>.
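The same opt-out works on function type parameters. In this sketch (show is a hypothetical helper), the ?Sized bound is what allows T to be the unsized type str.

```rust
use std::fmt::Display;

// Without `?Sized`, T would have to be a sized type, and `show("hi")`
// (where T = str) would be rejected.
fn show<T: Display + ?Sized>(value: &T) -> String {
    format!("<{}>", value)
}

fn main() {
    assert_eq!(show("hi"), "<hi>"); // T = str, an unsized type
    assert_eq!(show(&42), "<42>");  // T = i32, a sized type
}
```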

Clone

The std::clone::Clone trait is for types that can make copies of themselves. It's a subtrait of Sized and is defined like this:


#![allow(unused)]
fn main() {
trait Clone: Sized {
    fn clone(&self) -> Self;
    fn clone_from(&mut self, source: &Self) {
        *self = source.clone()
    }
}
}

Warning: Cloning values can be computationally expensive!

The clone_from method modifies self into a copy of source.

Convention: In generic code, you should use clone_from whenever possible.

Tip: If your Clone implementation simply applies clone to each field of your type, then Rust can implement it for you by adding #[derive(Clone)] above your type definition.

Warning: The clone method of types that implement Clone must be infallible!

Copy

A type is Copy if it implements the std::marker::Copy marker trait, a subtrait of Clone and defined as:


#![allow(unused)]
fn main() {
trait Copy: Clone {}
}

Tip: Like Clone, Copy can be automatically implemented using #[derive(Copy)].

Deref and DerefMut

You can specify how dereferencing operators like * and . behave on your types by implementing the std::ops::Deref and std::ops::DerefMut traits:


#![allow(unused)]
fn main() {
trait Deref {
    type Target: ?Sized;
    fn deref(&self) -> &Self::Target;
}
// DerefMut is a subtrait of Deref
trait DerefMut: Deref {
    fn deref_mut(&mut self) -> &mut Self::Target;
}
}

Term: If inserting a deref call prevents a type mismatch, Rust inserts one for you. These are called deref coercions: one type is "coerced" into behaving as another.

Add notes about the implications of deref coercions.

Rust will apply several deref coercions in succession if necessary.
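A short sketch of chained deref coercions using standard library smart pointers. The shout and total helpers are hypothetical.

```rust
use std::rc::Rc;

fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn total(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    // &Box<String> -> &String -> &str: two coercions in a row.
    let boxed: Box<String> = Box::new(String::from("hi"));
    assert_eq!(shout(&boxed), "HI");

    // &Rc<Vec<i32>> -> &Vec<i32> -> &[i32]
    let shared: Rc<Vec<i32>> = Rc::new(vec![1, 2, 3]);
    assert_eq!(total(&shared), 6);

    // Method calls go through Deref as well: `len` is found on the Vec.
    assert_eq!(shared.len(), 3);
}
```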

Use Case: The Deref and DerefMut traits are designed for implementing smart pointer types like Box, Rc, and Arc, and types that serve as owning versions of something you would frequently use by reference, the way Vec<T> and String serve as owning versions of [T] and str.

Anti-Pattern: Do not implement Deref and DerefMut for a type just to make the Target type's methods appear on it automatically, the way a C++ base class's methods are visible on a subclass.

Warning: Rust applies deref coercions to resolve type conflicts, but it does not apply them to satisfy bounds on type variables.

Example

Say we have a struct called Selector<T> that has a field elements: Vec<T> and a field named current: usize that behaves like a pointer to the current element.

Given a value s of type Selector<T>, we want to be able to do these things:

  1. Use the expression *s to get the value of the current element
  2. Apply methods implemented by the type of the currently pointed to element
  3. Change the value of the currently pointed to element with *s = '?'

#![allow(unused)]
fn main() {
use std::ops::{Deref, DerefMut};

struct Selector<T> {
    elements: Vec<T>,
    current: usize,
}

// Implementing Deref allows us to use *s to get the current element
impl<T> Deref for Selector<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.elements[self.current]
    }
}

// Implementing DerefMut allows us to set the value of the current element
impl<T> DerefMut for Selector<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.elements[self.current]
    }
}

let mut s = Selector { elements: vec!['x', 'y', 'z'], current: 2 };
println!("current element {}", *s);
println!("is alphabetic? {}", s.is_alphabetic());
*s = 'w';
println!("current element {}", *s);
}

Default

Types with a reasonably obvious default value can implement the std::default::Default trait:


#![allow(unused)]
fn main() {
trait Default {
    fn default() -> Self;
}
}

All of Rust's collection types (like Vec, HashMap, BinaryHeap, etc) implement Default, with default methods that return an empty collection.

Use Case: Default is commonly used to produce default values for structs that represent a large collection of parameters, most of which you won't usually want to change. (think options objects in JS)

A perfect example of making good use of Default is when using the OpenGL crate called glium. Drawing with OpenGL requires a ton of parameters, most of which you don't care about. So, we can use Default to provide those parameters for us:


#![allow(unused)]
fn main() {
let params = glium::DrawParameters {
    line_width: Some(0.02),
    point_size: Some(0.02),
    ..Default::default() // no trailing comma is allowed after the base expression
};

target.draw(..., &params).unwrap();
}

Tip: Rust does not implicitly implement Default for struct types, but if all of a struct's fields implement Default, you can implement Default for the struct automatically using #[derive(Default)].
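
Example: A self-contained sketch of the same pattern without glium. DrawOptions and its fields are hypothetical stand-ins for a large parameter struct.

```rust
// Hypothetical options struct; every field type implements Default,
// so the derive works.
#[derive(Debug, Default, PartialEq)]
struct DrawOptions {
    line_width: Option<f32>,
    point_size: Option<f32>,
    wireframe: bool,
}

// Set only the field we care about and take the rest from Default.
fn thick_lines() -> DrawOptions {
    DrawOptions {
        line_width: Some(2.0),
        ..Default::default()
    }
}
```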

AsRef and AsMut

When a type implements AsRef<T>, that means you can borrow a &T from it efficiently; AsMut is the analogue for mutable references:


#![allow(unused)]
fn main() {
trait AsRef<T: ?Sized> {
    fn as_ref(&self) -> &T;
}
trait AsMut<T: ?Sized> {
    fn as_mut(&mut self) -> &mut T;
}
}

Use Case: AsRef is typically used to make functions more flexible in the argument types they accept. For instance, std::fs::File::open is declared like this:


#![allow(unused)]
fn main() {
fn open<P: AsRef<Path>>(path: P) -> Result<File>;
}

Concept: You can think of using AsRef as a type bound as a bit like function overloading.

In the above use case, what open really wants is a &Path, the type representing a filesystem path. But with the above declaration, open accepts anything it can borrow a &Path from--that is, anything that implements AsRef<Path>.
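
Example: A sketch of the same technique in user code. The function extension_of is hypothetical; it accepts a &str, a String, a PathBuf, or anything else that implements AsRef<Path>.

```rust
use std::path::Path;

// Returns the file extension, if there is one and it is valid UTF-8.
fn extension_of<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref()
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_string())
}
```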

Use Case: It only makes sense for a type to implement AsMut<T> if modifying the given T cannot violate the type's invariants.

Anti-Pattern: You should avoid defining your own AsFoo traits when you could just implement AsRef<Foo> instead.

Borrow and BorrowMut

The std::borrow::Borrow trait is similar to AsRef: if a type implements Borrow<T>, then its borrow method efficiently borrows a &T from it. But Borrow imposes more restrictions.

Use Case: A type should implement Borrow<T> only when a &T hashes and compares the same way as the value it's borrowed from. Borrow is valuable in dealing with keys in hash tables and trees, or when dealing with values that will be hashed or compared for some other reason.

Come back to the example implementation related to   HashMap.
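
Example: A minimal sketch of Borrow at work in HashMap lookups. The helper names are hypothetical. Because String implements Borrow<str>, a map keyed by String can be queried with a plain &str, with no temporary String allocated for the lookup.

```rust
use std::collections::HashMap;

fn build_ports() -> HashMap<String, u16> {
    let mut ports = HashMap::new();
    ports.insert("http".to_string(), 80);
    ports.insert("https".to_string(), 443);
    ports
}

fn lookup(ports: &HashMap<String, u16>, name: &str) -> Option<u16> {
    // HashMap::get takes &Q where the key type implements Borrow<Q>;
    // here the key is String and Q is str.
    ports.get(name).copied()
}
```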

From and Into

The std::convert::From and std::convert::Into traits represent conversions that consume a value of one type, and return a value of another.

From and Into do not borrow; they take ownership of their argument, transform it, and then return ownership of the result back to the caller:


#![allow(unused)]
fn main() {
trait Into<T>: Sized {
    fn into(self) -> T;
}

trait From<T>: Sized {
    fn from(value: T) -> Self;
}
}

Use Case: Into is generally used to make functions more flexible in the arguments they accept. For example, in this code, the ping function can accept any type A that can be converted into an Ipv4Addr:


#![allow(unused)]
fn main() {
fn ping<A>(address: A) -> std::io::Result<bool>
    where A: Into<Ipv4Addr>
{
    let ipv4_address = address.into();
    ...
}
}

Use Case: The from method of From serves as a generic constructor for producing an instance of a type from some other single value.

Tip: Given an appropriate From implementation, you get the corresponding Into implementation for free!

Warning: from and into operations must be infallible!
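
Example: A sketch showing that one From implementation serves both directions. The Celsius and Fahrenheit types are hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct Celsius(f64);

#[derive(Debug, PartialEq)]
struct Fahrenheit(f64);

// Only From is written by hand.
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

// The Into<Fahrenheit> bound is satisfied by the standard library's
// blanket implementation, which is built on From.
fn to_fahrenheit<T: Into<Fahrenheit>>(temp: T) -> Fahrenheit {
    temp.into()
}
```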

ToOwned

The ToOwned trait is an alternative to Clone. Types that implement Clone must be sized. The std::borrow::ToOwned trait provides a slightly looser way to convert a reference to an owned value:


#![allow(unused)]
fn main() {
trait ToOwned {
    type Owned: Borrow<Self>;
    fn to_owned(&self) -> Self::Owned;
}
}
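
Example: A short sketch with the two unsized types most often converted this way. The helper names are hypothetical. Neither str nor [i32] can implement Clone, but both implement ToOwned.

```rust
// str is unsized; its Owned type is String.
fn owned_text(text: &str) -> String {
    text.to_owned()
}

// [i32] is unsized; its Owned type is Vec<i32>.
fn owned_items(items: &[i32]) -> Vec<i32> {
    items.to_owned()
}
```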

Borrow and ToOwned at Work: The Humble Cow

In some cases, you cannot decide whether to borrow or own a value until the program is running. The std::borrow::Cow type ("clone on write") provides one way to do this:


#![allow(unused)]
fn main() {
enum Cow<'a, B: ?Sized + 'a>
    where B: ToOwned
{
    Borrowed(&'a B),
    Owned(<B as ToOwned>::Owned),
}
}

Concept: A Cow<B> either borrows a shared reference to B, or owns a value from which we could borrow such a reference.

Use Case: One common use for Cow is to return either a statically allocated string constant or a computed string.

Example

Suppose you need to convert an error enum to a message via a function called describe.

Most variants can be handled with fixed strings, but others have additional data that should be included in the message. For such a case, you can return Cow<'static, str>.

Using Cow helps describe and its callers put off allocation until the moment it becomes necessary.


#![allow(unused)]
fn main() {
use std::borrow::Cow;

fn describe(error: &Error) -> Cow<'static, str> {
    match *error {
        Error::OutOfMemory => "out of memory".into(),
        Error::StackOverflow => "stack overflow".into(),
        Error::MachineOnFire => "machine on fire".into(),
        Error::Unfathomable => "machine bewildered".into(),
        Error::FileNotFound(ref path) => {
            format!("file not found: {}", path.display()).into()
        }
    }
}
}
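
Example: A self-contained sketch of the same idea, using a hypothetical expand_tabs function. It borrows the input when nothing needs to change and allocates only when it must.

```rust
use std::borrow::Cow;

// Replaces each tab with four spaces, allocating only if a tab exists.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        Cow::Owned(input.replace('\t', "    "))
    } else {
        Cow::Borrowed(input)
    }
}
```

Callers can treat the result as a &str in either case, or call into_owned() when they need a String.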

Closures

Capturing Variables

Closures that Borrow

Simple example:


#![allow(unused)]
fn main() {
fn sort_by_statistic(cities: &mut Vec<City>, stat: Statistic) {
    cities.sort_by_key(|city| -city.get_statistic(stat));
}
}

Closures that Steal

Say we wanted to create a function that sorts a list of cities in a separate thread. It might look something like this:


#![allow(unused)]
fn main() {
fn start_sorting_thread(mut cities: Vec<City>, stat: Statistic)
    -> thread::JoinHandle<Vec<City>>
{
    // take ownership of stat
    let key_fn = move |city: &City| -> i64 { -city.get_statistic(stat) };

    // take ownership of cities and key_fn
    thread::spawn(move || {
        cities.sort_by_key(key_fn);
        cities
    })
}
}

In the above example, we had to add the move keyword before each closure.

Keyword: The move keyword tells Rust that a closure doesn't borrow the variables it uses: it steals them.

Rust therefore offers two ways for closures to get data from enclosing scopes:

  1. Moves
  2. Borrowing
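
Example: A self-contained sketch of both capture modes, using plain integers instead of the City type. The function names are hypothetical.

```rust
use std::thread;

// Borrowing: the closure only reads `threshold`, so it captures it
// by reference for the duration of the call.
fn count_above(values: &[i32], threshold: i32) -> usize {
    values.iter().filter(|v| **v > threshold).count()
}

// Moving: the spawned thread may outlive this function's stack frame,
// so the closure must take ownership of `values`.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}
```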

Function and Closure Types

Structs may have function-typed fields.

In memory, function values are just the memory address of the function's machine code.

A function can take another function as an argument:


#![allow(unused)]
fn main() {
// Given a list of cities and a test function,
// return the number of cities that returned true from the test
fn count_selected_cities(
    cities: &Vec<City>,
    test_fn: fn(&City) -> bool,
) -> usize {
    let mut count = 0;
    for city in cities {
        if test_fn(city) {
            count += 1;
        }
    }
    count
}
}

Concept: Closures do not have the same type as functions!

Term: A value of type fn(&City) -> bool is called a function pointer.

The type signature of count_selected_cities must be changed if test_fn should be a closure instead of a function value:


#![allow(unused)]
fn main() {
fn count_selected_cities<F>(cities: &Vec<City>, test_fn: F) -> usize
    where F: Fn(&City) -> bool
{
    let mut count = 0;
    for city in cities {
        if test_fn(city) {
            count += 1;
        }
    }
    count
}
}

We've now genericized count_selected_cities. It'll accept test_fn of any type F, as long as F implements the special trait Fn(&City) -> bool.

Concept: Every closure has its own type, because a closure may contain data: values either borrowed or stolen from the enclosing scope. So, every closure has an ad hoc type created by the compiler. But every closure implements at least one of the Fn, FnMut, and FnOnce traits.

Closure Performance

Closures aren't allocated on the heap unless you put them in a Box, Vec, or other container.

Closures and Safety

Closures that Kill

Basically, double free errors are impossible in Rust.

FnOnce

Concept: Closures that drop values cannot implement Fn. Instead, they implement FnOnce, the trait of closures that can only be called once. The first time you call a FnOnce closure, the closure itself is used up.

FnMut

Concept: Closures that require mut access to a value, but don't drop any values, are FnMut closures.

Summary

These are the three categories of closures, in order of most broad to least:

Trait | Description
FnOnce | Can only be called once, if the caller owns the closure.
FnMut | Can be called multiple times if the closure itself is declared mut.
Fn | Can be called multiple times without restriction; also encompasses all fn functions.
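
Example: A sketch of three hypothetical helpers, one per trait bound, to show what each bound lets the callee do with the closure.

```rust
// FnOnce: the callee may call the closure at most once.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

// FnMut: the callee needs a mutable binding to call it repeatedly.
fn call_twice_mut<F: FnMut() -> u32>(mut f: F) -> u32 {
    f();
    f()
}

// Fn: the callee can call it any number of times through a shared reference.
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}
```

A closure that gives away a captured String fits only call_once, a closure that increments a captured counter fits call_twice_mut (and call_once), and a closure that only reads its captures fits all three.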

Callbacks

Here's an example program that implements a basic router:


#![allow(unused)]

fn main() {
use std::collections::HashMap;

struct Request {
    method: String,
    url: String,
    headers: HashMap<String, String>,
    body: Vec<u8>,
}

struct Response {
    code: u32,
    headers: HashMap<String, String>,
    body: Vec<u8>,
}

type RouteCallback = Box<dyn Fn(&Request) -> Response>; // trait objects require `dyn`

struct BasicRouter {
    routes: HashMap<String, RouteCallback>,
}

impl BasicRouter {
    fn new() -> BasicRouter {
        BasicRouter { routes: HashMap::new() }
    }

    fn add_route<C>(&mut self, url: &str, callback: C)
        where C: Fn(&Request) -> Response + 'static
    {
        // insert returns the previous value, if any; discard it with `;`
        self.routes.insert(url.to_string(), Box::new(callback));
    }

    fn handle_request(&self, request: &Request) -> Response {
        match self.routes.get(&request.url) {
            None => not_found_response(), // helper assumed to be defined elsewhere
            Some(callback) => callback(request),
        }
    }
}

}

Iterators

The Iterator and IntoIterator Traits

Term: An iterator is any value that implements the std::iter::Iterator trait. Put simply, an iterator is a value that produces a sequence of values.

Term: The values an iterator produces are called items.

Term: The code that receives an iterator's items is called a consumer.

The heart of the Iterator trait is defined as:


#![allow(unused)]
fn main() {
trait Iterator {
    type Item; // the type of value the iterator produces
    fn next(&mut self) -> Option<Self::Item>;

    // ... a whole bunch of default methods
}
}

If there's a natural way to iterate over some type, it can implement std::iter::IntoIterator, whose into_iter method takes a value and returns an iterator over it:


#![allow(unused)]
fn main() {
trait IntoIterator {
    type Item; // the type of value the iterator produces
    type IntoIter: Iterator<Item = Self::Item>; // the type of the iterator value itself
    fn into_iter(self) -> Self::IntoIter;
}
}

Term: Any type that implements std::iter::IntoIterator is called an iterable.

Under the hood, every for loop is just shorthand for calls to IntoIterator and Iterator methods:


#![allow(unused)]
fn main() {
let elems = vec!["antimony", "arsenic", "aluminum", "selenium"];

// Iteration using a for loop..
println!("There's:");
for element in &elems {
    println!("- {}", element);
}

// Is actually just...
println!("Again! There's:");
let mut iterator = (&elems).into_iter();
while let Some(element) = iterator.next() {
    println!("- {}", element);
}
}

Implicit Behavior: All iterators automatically implement IntoIterator, with an into_iter method that simply returns the iterator.
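
Example: A minimal sketch of a hand-written iterator. The Countdown type is hypothetical. Implementing next is all that is needed; the blanket IntoIterator implementation then makes the type usable in a for loop.

```rust
// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            return None; // the sequence is finished
        }
        let current = self.remaining;
        self.remaining -= 1;
        Some(current)
    }
}
```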

Creating Iterators

iter and iter_mut Methods

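Most collection types provide iter and iter_mut methods that return the natural iterators over the type, producing a shared or mutable reference to each item. Array slices like &[T] and &mut [T] have them too. String slices are an exception: &str has no iter method, because there is more than one natural way to traverse a string (for example, by bytes or by chars).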

IntoIterator Implementations

There are three main implementations of IntoIterator.

1. Shared Reference

Idiom: Given a shared reference to a collection, into_iter returns an iterator that produces shared references to its items.


#![allow(unused)]
fn main() {
for element in &collection { ... }
}

2. Mutable Reference

Idiom: Given a mutable reference to a collection, into_iter returns an iterator that produces mutable references to the items.


#![allow(unused)]
fn main() {
for element in &mut collection { ... }
}

3. By Value

Idiom: When passed a collection by value, into_iter returns an iterator that takes ownership of the collection and returns items by value; the items' ownership moves from the collection to the consumer, and the original collection is consumed in the process.


#![allow(unused)]
fn main() {
for element in collection { ... }
}

Not every type provides all three iterator implementations.

Slices implement two of the three IntoIterator variants; since they don't own their elements, there is no "by value" case.

Use Case: IntoIterator can be useful in generic code: you can use a bound like T: IntoIterator to restrict the type variable T to types that can be iterated over. Or, you can write T: IntoIterator<Item=U> to further require the iteration to produce a particular type U. For instance, we can create a dump function that receives an iterable whose items implement the Debug trait:


#![allow(unused)]
fn main() {
use std::fmt::Debug;

fn dump<T, U>(t: T) where T: IntoIterator<Item=U>, U: Debug {
    for u in t {
        println!("{:?}", u);
    }
}
dump(vec!["garbage", "rubbish", "waste"]);
}

drain Methods

A lot of collection types provide a drain method that takes a mutable reference to the collection and returns an iterator that passes ownership of each element to the consumer.


#![allow(unused)]
fn main() {
use std::iter::FromIterator;

let mut outer = "Earth".to_string();
let inner = String::from_iter(outer.drain(1..4));

println!("outer: {}", outer);
println!("inner: {}", inner);
}

If you need to drain an entire sequence, use the full range, .., as the argument.

Other Iterator Sources

Go here.

Iterator Adapters

Term: Given an iterator, the Iterator trait provides a huge selection of methods called adapters that consume one iterator and build a new one.

map and filter

A map iterator passes each item to its closure by value, and in turn, passes along ownership of the closure's result to its consumer.

A filter iterator passes each item to its closure by shared reference, retaining ownership in case the item is selected to be passed on to its consumer.

Concept: Calling an adapter on an iterator doesn't consume any items; it just returns a new iterator. The only way to actually get values is to call next (or some other indirect method, like collect, in which case no work takes place until collect starts calling next) on the iterator.
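
Example: A sketch of a small map and filter pipeline. The function name is hypothetical. Nothing runs until collect starts pulling items through the chain.

```rust
// Trims each line, then keeps the ones that are neither empty nor comments.
fn trimmed_non_comments(text: &str) -> Vec<&str> {
    text.lines()
        .map(str::trim) // receives each line by value
        .filter(|line| !line.is_empty() && !line.starts_with('#')) // receives a reference
        .collect() // drives the whole pipeline
}
```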

filter_map and flat_map

The filter_map adapter is similar to map, except that it lets its closure either transform the item into a new item or drop the item from the iteration. Thus, it's a bit like a combination of filter and map.

Use Case: The filter_map adapter fits situations where the best way to decide whether to include an item in the iteration is to actually try to process it.

The flat_map iterator produces the concatenation of the sequences the closure returns.
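
Example: A sketch of both adapters with hypothetical helper functions. parse_all keeps only the words that parse as integers; all_words concatenates the words of every line into one sequence.

```rust
// filter_map: try to parse each word, dropping the failures.
fn parse_all(text: &str) -> Vec<i32> {
    text.split_whitespace()
        .filter_map(|word| word.parse::<i32>().ok())
        .collect()
}

// flat_map: each line yields a sequence of words; the results are
// concatenated into a single sequence.
fn all_words<'a>(lines: &[&'a str]) -> Vec<&'a str> {
    lines
        .iter()
        .flat_map(|line| line.split_whitespace())
        .collect()
}
```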

scan

The scan adapter resembles map, except that the closure is given a mutable value it can consult, and has the option of terminating the iteration early. The closure must return an Option, which the scan iterator takes as its next item.
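
Example: A sketch of scan computing running totals and stopping early. The function name and the limit parameter are hypothetical.

```rust
// Produces running totals, ending the iteration as soon as a total
// would exceed `limit`.
fn running_totals(values: &[u32], limit: u32) -> Vec<u32> {
    values
        .iter()
        .scan(0u32, |total, &v| {
            *total += v; // the mutable state carried between items
            if *total > limit {
                None // terminates the iteration
            } else {
                Some(*total)
            }
        })
        .collect()
}
```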

take and take_while

The Iterator trait's take and take_while adapters let you end an iteration after a certain number of items, or when a closure decides to cut things off.

Both take and take_while take ownership of an iterator and return a new iterator that passes along items from the first one, possibly ending the sequence earlier.

skip and skip_while

The Iterator trait's skip and skip_while methods are the complement of take and take_while: they drop a certain number of items from the beginning of an iteration, or drop items until a closure finds one acceptable, and then pass the remaining items through unchanged.
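
Example: A sketch of the four adapters on simple data. All function names are hypothetical.

```rust
// skip + take: select one fixed-size page of items.
fn page(items: &[u32], page: usize, size: usize) -> Vec<u32> {
    items.iter().copied().skip(page * size).take(size).collect()
}

// take_while: keep items until the closure first returns false.
fn leading_digits(text: &str) -> String {
    text.chars().take_while(|c| c.is_ascii_digit()).collect()
}

// skip_while: drop items until the closure first returns false,
// then pass everything else through unchanged.
fn after_digits(text: &str) -> String {
    text.chars().skip_while(|c| c.is_ascii_digit()).collect()
}
```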

Use Case: One common use for skip is to skip the command name when iterating over a program's command-line arguments:


#![allow(unused)]
fn main() {
for arg in std::env::args().skip(1) {
    println!("arg: {}", arg);
}
}

peekable

A peekable iterator lets you peek at the next item that will be produced without actually consuming it. Almost any iterator can be turned into a peekable iterator by calling the Iterator trait's peekable method.

Calling peek tries to draw the next item from the underlying iterator, and if there is one, caches it until the next call to next.

Use Case: Peekable iterators are essential when you can't decide how many items to consume from an iterator until you've gone too far. For example, if you're parsing numbers from a stream of characters, you can't decide where the number ends until you've seen the first non-number character:


#![allow(unused)]
fn main() {
use std::iter::Peekable;

fn parse_number<I>(tokens: &mut Peekable<I>) -> u32
    where I: Iterator<Item=char>
{
    let mut n = 0;
    loop {
        match tokens.peek() {
            Some(r) if r.is_digit(10) => {
                n = n * 10 + r.to_digit(10).unwrap();
            }
            _ => return n
        }
        tokens.next();
    }
}

let mut chars = "10212980".chars().peekable();
println!("{}", parse_number(&mut chars));
}

fuse

The fuse adapter takes any iterator and turns it into one that will definitely continue to return None once it has done so the first time.

Use Case: The fuse adapter is most useful in generic code that needs to work with iterators of an uncertain origin.

Reversible Iterators and rev

Some iterators are able to draw items from both ends of the sequence. You can reverse these iterators by using the rev adapter.

Most iterator adapters, if applied to a reversible iterator, return another reversible iterator.

inspect

The inspect adapter is handy for debugging pipelines of iterator adapters, but is rarely used in production code. It applies a closure to a shared reference to each item, and then passes the item through. The closure can't affect the items, but it can do things like print them or make assertions about them.

chain

The chain adapter appends one iterator to another (think of the concat operator in RxJS). A chain iterator keeps track of whether each of the two underlying iterators has returned None, and directs next and next_back calls to one or the other as appropriate.


#![allow(unused)]
fn main() {
let v: Vec<_> = (1..4).chain(4..6).collect();
println!("{:?}", v);
}

enumerate

The Iterator trait's enumerate adapter attaches a running index to the sequence, taking an iterator that produces items A, B, C, ... and returning an iterator that produces pairs (0, A), (1, B), (2, C), ....

zip

The zip adapter combines two iterators into a single iterator that produces pairs holding one value from each iterator. The zipped iterator ends when either of the two underlying iterators ends.
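
Example: A sketch combining zip and enumerate. The function name and the output format are hypothetical. Note that the result stops at the shorter of the two inputs.

```rust
// Pairs each name with a score, then numbers the pairs starting at 1.
fn label(names: &[&str], scores: &[u32]) -> Vec<String> {
    names
        .iter()
        .zip(scores)
        .enumerate()
        .map(|(i, (name, score))| format!("{}. {}: {}", i + 1, name, score))
        .collect()
}
```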

by_ref

An iterator's by_ref method borrows a mutable reference to the iterator, so that you can apply adapters to the reference. When you're done consuming items from these adapters, you drop them, the borrow ends, and you regain access to the original iterator.

by_ref essentially provides a mechanism for starting and stopping iterators as needed.

Concept: When you call an adapter on a mutable reference to an iterator, the adapter takes ownership of the reference, not the iterator itself.
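
Example: A sketch of by_ref splitting one pass over a message into two phases. The function name is hypothetical. take_while consumes the blank separator line, so the second collect sees only the body.

```rust
// Splits a message into header lines and body lines at the first blank line.
fn headers_then_body(message: &str) -> (Vec<&str>, Vec<&str>) {
    let mut lines = message.lines();

    // The adapter owns only a mutable reference, so `lines` survives it.
    let headers: Vec<&str> = lines
        .by_ref()
        .take_while(|line| !line.is_empty())
        .collect();

    // Resume the original iterator where the adapter stopped.
    let body: Vec<&str> = lines.collect();
    (headers, body)
}
```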

cloned

The cloned adapter takes an iterator that produces references, and returns an iterator that produces values cloned from those references.

cycle

The cycle adapter returns an iterator that endlessly repeats the sequence produced by the underlying iterator. The underlying iterator must implement std::clone::Clone, so that cycle can save its initial state and reuse it each time the cycle starts again.


#![allow(unused)]
fn main() {
use std::iter::{once, repeat};

let fizzes = repeat("").take(2).chain(once("fizz")).cycle();
let buzzes = repeat("").take(4).chain(once("buzz")).cycle();
let fizzes_buzzes = fizzes.zip(buzzes);

let fizz_buzz = (1..100).zip(fizzes_buzzes)
    .map(|tuple|
        match tuple {
            (i, ("", "")) => i.to_string(),
            (_, (fizz, buzz)) => format!("{}{}", fizz, buzz)
        }
    );

for line in fizz_buzz {
    println!("{}", line);
}
}

Consuming Iterators

Simple Accumulation: count, sum, product

The count method draws items from an iterator until it returns None, and tells you how many it got.

The sum and product methods compute the sum or product of the iterator's items, which must be integers or floating-point numbers.

max, min

The max and min methods on Iterator return the greatest or least item the iterator produces. The iterator's item type must implement std::cmp::Ord, so that items can be compared with each other.

An implication of the Ord bound is that these methods can't be used with floating-point values.

max_by, min_by

The max_by and min_by methods return the maximum or minimum item an iterator produces, as determined by a comparator function you provide.

max_by_key, min_by_key

The max_by_key and min_by_key methods on Iterator let you select the maximum or minimum item as determined by a closure applied to each item.

Comparing Item Sequences

any and all

The any and all methods apply a closure to each item the iterator produces, and return true if the closure returns true for any or all items, respectively.

These methods consume only as many items as they need to determine the answer.

position, rposition, and ExactSizeIterator

The position method applies a closure to each item from the iterator and returns the index of the first item for which the closure returns true, as an Option<usize>. It returns None if no item matches.

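The rposition method does the same thing, but searches from the right. It requires a reversible iterator that is also exact-size, so that it can assign indices the same way position would.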

Term: An exact-size iterator is one that implements the std::iter::ExactSizeIterator trait:


#![allow(unused)]
fn main() {
trait ExactSizeIterator: Iterator {
    fn len(&self) -> usize { ... }
    fn is_empty(&self) -> bool { ... }
}
}

fold

The fold method is a very general tool for accumulating some sort of result over the entire sequence of items an iterator produces.
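
Example: A sketch of fold with two different accumulator types. The function names are hypothetical. fold takes an initial value and a closure that combines the accumulator with each item.

```rust
// Numeric accumulator: start at 0 and add each square.
fn sum_of_squares(values: &[i64]) -> i64 {
    values.iter().fold(0, |acc, v| acc + v * v)
}

// String accumulator: ownership of `acc` is passed through each step.
fn join_with_commas(words: &[&str]) -> String {
    words.iter().fold(String::new(), |mut acc, word| {
        if !acc.is_empty() {
            acc.push_str(", ");
        }
        acc.push_str(word);
        acc
    })
}
```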

nth

The nth method takes an index n, skips that many items from the iterator, and returns the next item, or None if the sequence ends before that point.

Calling .nth(0) is equivalent to calling .next().

It doesn't take ownership of the iterator the way an adapter does, so you can call it many times.

last

The last method consumes items until the iterator returns None, and then returns the last item. If the iterator produces no items, then last returns None.

Tip: If you have a reversible iterator and just want the last item, use iter.rev().next() instead.

find

The find method draws items from an iterator, returning the first item for which the given closure returns true, or None if the sequence ends before a suitable item is found.

Building Collections: collect and FromIterator

An iterator's collect method can build any kind of collection from Rust's standard library, as long as the iterator produces a suitable item type. The return type of collect is its type parameter.

When some collection type like Vec or HashMap knows how to construct itself from an iterator, it implements the std::iter::FromIterator trait, for which collect is just a convenient wrapper:


#![allow(unused)]
fn main() {
trait FromIterator<A>: Sized {
    fn from_iter<T: IntoIterator<Item=A>>(iter: T) -> Self;
}
}

Concept: If a collection type implements FromIterator<A>, then its static method from_iter builds a value of that type from an iterable producing items of type A.

The size_hint method of Iterator returns a lower bound and optional upper bound on the number of items the iterator will produce.
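
Example: A sketch of collect building different collections depending on the requested type. The function names are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

// An iterator of (key, value) pairs can be collected into a map.
fn word_lengths(text: &str) -> HashMap<&str, usize> {
    text.split_whitespace().map(|w| (w, w.len())).collect()
}

// Collect into an ordered set to deduplicate, then into a Vec.
fn unique_sorted(values: &[i32]) -> Vec<i32> {
    let set: BTreeSet<i32> = values.iter().copied().collect();
    set.into_iter().collect()
}
```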

The Extend Trait

If a type implements the std::iter::Extend trait, then its extend method adds an iterable's items to the collection.

All of the standard collections implement Extend. Arrays and slices do not have this method because they are of fixed length.

partition

The partition method divides an iterator's items among two collections, using a closure to decide where each item belongs.

Whereas collect requires its result type to implement FromIterator, partition instead requires std::default::Default and std::iter::Extend.
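
Example: A sketch of extend and partition on vectors. The function names are hypothetical.

```rust
// partition: items for which the closure returns true go to the first
// collection, the rest to the second.
fn split_even_odd(values: &[u32]) -> (Vec<u32>, Vec<u32>) {
    values.iter().copied().partition(|v| v % 2 == 0)
}

// extend: append every item of any iterable to an existing collection.
fn append_all(target: &mut Vec<u32>, extra: impl IntoIterator<Item = u32>) {
    target.extend(extra);
}
```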

Collections

Strings and Text

Input and Output

Concept: I/O in Rust is organized around three traits, defined in std::io:

  1. Read: Defines methods for byte-oriented input. Implementers are called readers.
  2. BufRead: Includes Read methods, plus methods for reading lines of text and so forth. Implementers are called buffered readers.
  3. Write: Defines methods for both byte-oriented and UTF-8 text output. Implementers are called writers.

Shortcut: These three traits, together with Seek, are so commonly used that there's a prelude module containing only those four. Just add:


#![allow(unused)]
fn main() {
use std::io::prelude::*;
}

Readers and Writers

One of the simplest, lowest-level uses of Read and Write together is a function that copies data from any reader to any writer:


#![allow(unused)]
fn main() {
use std::io::{self, Read, Write, ErrorKind};

const DEFAULT_BUF_SIZE: usize = 8 * 1024;

fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W)
    -> io::Result<u64>
    where R: Read, W: Write
{
    let mut buf = [0; DEFAULT_BUF_SIZE];
    let mut written = 0;

    loop {
        let len = match reader.read(&mut buf) {
            Ok(0) => return Ok(written),
            Ok(len) => len,
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..len])?;
        written += len as u64;
    }
}
}

Shortcut: The import statement use std::io::{self}; declares io as an alias to the std::io module, which means we can write things like std::io::Result as just io::Result.

Readers

All main methods defined by Read take the reader itself by mut reference. There are also four adapter methods that take the reader by value and transform it into an iterator or a different reader.

Note that there is no method for closing a reader. Readers and writers implement Drop, so they are closed automatically.


Reader Method reader.read(buffer)


#![allow(unused)]
fn main() {
fn read(&mut self, buf: &mut [u8]) -> Result<usize>
}

Reads some bytes from the data source and stores them in the given buffer. The usize success value is the number of bytes read, which may be less than buf.len(), even if there's still more data to read.

If read returns Ok(0), there's no more input to read.

On error, read returns Err(err), where err is an io::Error value. io::Errors are printable for humans. For computers, you should use the .kind() method, which returns an error code of type io::ErrorKind.

io::ErrorKind is an enum with lots of different types of errors. Most variants shouldn't be ignored because they indicate actual issues, but not all. io::ErrorKind::Interrupted corresponds to the EINTR UNIX error code, which means the read was interrupted by a signal; in almost all scenarios the right response is simply to try the read again.


Reader Method reader.read_to_end(&mut byte_vec)


#![allow(unused)]
fn main() {
fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize>
}

Reads all remaining input from the reader into a vector.

There's no limit on the amount of data that read_to_end will return, so it's usually a good idea to impose a limit using .take().


Reader Method reader.read_to_string(&mut string)


#![allow(unused)]
fn main() {
fn read_to_string(&mut self, buf: &mut String) -> Result<usize>
}

Reads all remaining input from the reader into a string. If the source provides data that isn't valid UTF-8, read_to_string will return an ErrorKind::InvalidData error.


Reader Method reader.read_exact(&mut buf)


#![allow(unused)]
fn main() {
fn read_exact(&mut self, buf: &mut [u8]) -> Result<()>
}

Reads exactly enough data to fill the given buffer. If the reader runs out of data before reading buf.len() bytes, read_exact returns an ErrorKind::UnexpectedEof error.


Adapter reader.bytes()


#![allow(unused)]
fn main() {
fn bytes(self) -> Bytes<Self> where Self: Sized
}

Converts a reader into an iterator over the bytes of the input stream. The item type is io::Result<u8>, so an error check is required for every byte. It calls reader.read() one byte at a time, so this method is super inefficient if the reader isn't buffered.


Adapter reader.chars()


#![allow(unused)]
fn main() {
fn chars(self) -> Chars<Self> where Self: Sized
}

Converts a reader into an iterator over the input stream as UTF-8 characters. Note: this adapter was never stabilized and is not available in current stable Rust.


Adapter reader.chain(reader2)


#![allow(unused)]
fn main() {
fn chain<R: Read>(self, next: R) -> Chain<Self, R> where Self: Sized
}

Creates a new reader that produces all of the input from reader, followed by all of the input from reader2.


Adapter reader.take(n)


#![allow(unused)]
fn main() {
fn take(self, limit: u64) -> Take<Self> where Self: Sized
}

Creates a new reader that reads from the same source as reader, but is limited to n bytes of input.
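
Example: A sketch of take, read_to_string, and read_exact working together. The function names are hypothetical. A byte slice (&[u8]) implements Read, which makes these easy to try without touching the filesystem.

```rust
use std::io::{self, Read};

// Reads at most `limit` bytes from the reader and returns them as text.
fn read_prefix<R: Read>(reader: R, limit: u64) -> io::Result<String> {
    let mut text = String::new();
    reader.take(limit).read_to_string(&mut text)?;
    Ok(text)
}

// Reads exactly four bytes, failing with UnexpectedEof on short input.
fn read_magic<R: Read>(reader: &mut R) -> io::Result<[u8; 4]> {
    let mut magic = [0u8; 4];
    reader.read_exact(&mut magic)?;
    Ok(magic)
}
```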

Buffered Readers

Buffered readers implement both Read and BufRead, which provides three main methods.

Come back and add the type signatures of the following methods.


Buffered Reader Method reader.read_line(&mut line)

Reads a line of text and appends it to line, which is of type String.

The method returns an io::Result<usize>, where the usize is the number of bytes read, including the line ending, if any.

If the reader is at the end of the input, line will be unchanged and the method will return Ok(0).


☆ Buffered Reader Method reader.lines()

Returns an iterator over the lines of the input.

The item type is io::Result<String>. Newline characters are not included in the strings.


Buffered Reader Methods reader.read_until(stop_byte, &mut byte_vec) and reader.split(stop_byte)

Byte-oriented versions of .read_line() and .lines(). Produces Vec<u8> instead of Strings.
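
Example: A sketch of a read_line loop. The function name is hypothetical. Because read_line appends to its argument, the buffer is cleared between calls; Ok(0) signals the end of input.

```rust
use std::io::{self, BufRead};

// Counts the lines in any buffered reader, reusing one String buffer.
fn count_lines<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut line = String::new();
    let mut count = 0;
    while reader.read_line(&mut line)? != 0 {
        count += 1;
        line.clear(); // read_line appends, so reset the buffer
    }
    Ok(count)
}
```

A byte slice implements BufRead, so count_lines(&b"a\nb\nc"[..]) works without a file or stdin.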

Reading Lines

We can use .lines() to create a function that implements the Unix grep utility. Our function receives a generic reader (i.e., anything that implements BufRead).


#![allow(unused)]
fn main() {
use std::io;
use std::io::prelude::*;

fn grep<R>(target: &str, reader: R)
    -> io::Result<()>
    where R: BufRead
{
    for line_result in reader.lines() {
        let line = line_result?;
        if line.contains(target) {
            println!("{}", line);
        }
    }
    Ok(())
}
}

To use stdin as the data source, call its .lock() method. Stdin itself doesn't implement BufRead, but the StdinLock guard returned by .lock() does:


#![allow(unused)]
fn main() {
let stdin = io::stdin();
grep(&target, stdin.lock())?; // ok
}

If we wanted to use our function with the contents of a file, we could do so like this:


#![allow(unused)]
fn main() {
use std::fs::File;
use std::io::BufReader;

let f = File::open(file)?;
grep(&target, BufReader::new(f))?; // also ok
}

Collecting Lines

Look at this.
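One common way to collect lines, sketched here under the assumption that stopping at the first I/O error is acceptable, is to collect straight into an io::Result<Vec<String>>. The helper name collect_lines is made up for this example.

```rust
use std::io::{self, BufRead};

/// Reads every line into a Vec, or returns the first error encountered
/// (hypothetical helper for illustration).
fn collect_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    // Result implements FromIterator, so an iterator of io::Result<String>
    // can be collected into a single io::Result<Vec<String>>.
    reader.lines().collect()
}
```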

Writers

To send output to a writer, use the write!() and writeln!() macros.


#![allow(unused)]
fn main() {
writeln!(io::stderr(), "error: world not helloable")?;
writeln!(&mut byte_vec, "The greatest common divisor of {:?} is {}", numbers, d)?;
}

The write macros are the same as the print macros except for two differences:

  1. The write macros take an extra first argument, a writer.
  2. The write macros return a Result, so errors must be handled. When the print macros experience an issue, they simply panic.
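Both differences show up in this minimal sketch, which writes into any writer instead of stdout. The function write_gcd and its parameters are hypothetical names for illustration.

```rust
use std::io::{self, Write};

/// Writes a one-line report to any writer (hypothetical helper).
fn write_gcd<W: Write>(mut out: W, numbers: &[u64], d: u64) -> io::Result<()> {
    // Same format syntax as println!, but the writer comes first
    // and the resulting io::Result is handed back to the caller.
    writeln!(out, "The greatest common divisor of {:?} is {}", numbers, d)
}
```

Passing &mut Vec<u8> as the writer captures the output in memory, which is handy in tests.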

The Write trait has these methods:


Writer Method writer.write(&buf)

Writes some of the bytes in the slice buf to the underlying stream.

Returns an io::Result<usize>.

On success, gives the number of bytes written, which may be less than buf.len(), depending on the stream's mood.

This is the lowest-level method and is usually not used in practice.


Writer Method writer.write_all(&buf)

Writes all the bytes in the slice buf.

Returns Result<(), io::Error>.


Writer Method writer.flush()

Flushes any buffered data to the underlying stream.

Returns Result<(), io::Error>.


Warning: When a BufWriter is dropped, all remaining buffered data is written to the underlying writer. However, if an error occurs during this write, the error is ignored. To make sure errors don't get swallowed, always call .flush() on all buffered writers before dropping them.
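A minimal sketch of the flush-before-drop habit, assuming the caller wants write errors reported rather than swallowed; write_lines is a hypothetical helper.

```rust
use std::io::{self, BufWriter, Write};

/// Writes each line through a BufWriter and flushes explicitly
/// (hypothetical helper for illustration).
fn write_lines<W: Write>(dest: W, lines: &[&str]) -> io::Result<()> {
    let mut writer = BufWriter::new(dest);
    for line in lines {
        writeln!(writer, "{}", line)?;
    }
    // Flush before the BufWriter is dropped so that a failed write
    // surfaces as an Err instead of being silently ignored.
    writer.flush()
}
```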

Files

We've got two main ways to open a file:


File Method File::open(filename)

Opens an existing file for reading. It's an error if the file doesn't exist.

Returns an io::Result<File>.


File Method File::create(filename)

Creates a new file for writing. If a file exists with the given filename, it gets truncated.

Returns an io::Result<File>.


There is an alternative that uses OpenOptions to specify the exact open behavior we want.


#![allow(unused)]
fn main() {
use std::fs::OpenOptions;

// Create a file if none exists, or append to an existing one
let log = OpenOptions::new()
    .create(true)
    .append(true)
    .open("server.log")?;

// Create a file, or fail if one with the specified name already exists
let new_file = OpenOptions::new()
    .write(true)
    .create_new(true)
    .open("new_file.txt")?;
}

Just like with readers and writers, you can add a buffer to a File if needed.
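To make the buffering point concrete, here is a sketch that opens a file with OpenOptions and wraps it in a BufWriter. The helper append_line is hypothetical; .create(true) is included so the file is created when it doesn't exist yet.

```rust
use std::fs::OpenOptions;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Appends one line to a log file, creating the file if needed
/// (hypothetical helper for illustration).
fn append_line<P: AsRef<Path>>(path: P, line: &str) -> io::Result<()> {
    let file = OpenOptions::new()
        .create(true) // create the file when it's missing
        .append(true) // otherwise add to the end
        .open(path)?;
    let mut writer = BufWriter::new(file);
    writeln!(writer, "{}", line)?;
    // Flush explicitly so a write error isn't lost on drop.
    writer.flush()
}
```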

Term: The method-chaining pattern seen with OpenOptions is called a builder in Rust.

Seeking

Files also implement the Seek trait, which means you can hop around within a File rather than reading or writing in a single pass from the beginning to the end.

Seek is defined like this:


#![allow(unused)]
fn main() {
pub trait Seek {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64>;
}

pub enum SeekFrom {
    Start(u64),
    End(i64),
    Current(i64),
}
}

Seeking within a file is slow compared with reading it sequentially, so prefer a single pass where one will do.
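A short sketch of seek in use. It is written against any Read + Seek type, so an in-memory io::Cursor can stand in for a File; read_tail is a hypothetical helper and assumes the input holds at least n bytes.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Reads the last `n` bytes of any seekable reader
/// (hypothetical helper; assumes the input holds at least `n` bytes).
fn read_tail<R: Read + Seek>(mut src: R, n: u64) -> io::Result<Vec<u8>> {
    // SeekFrom::End takes a signed offset; a negative value moves backward.
    src.seek(SeekFrom::End(-(n as i64)))?;
    let mut tail = Vec::new();
    src.read_to_end(&mut tail)?;
    Ok(tail)
}
```

Seeking to a position before the start of the stream is an error, which is why the length assumption matters.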

Other Reader and Writer Types

Add notes about common types of readers and writers.

Handy Readers and Writers

The std::io module offers a few functions that return trivial readers and writers.


io::sink()

No-op writer. All the write methods return Ok and the data is discarded.


io::empty()

No-op reader. Reading always succeeds and returns end-of-input.


io::repeat(byte)

Creates a reader that repeats the given byte endlessly.
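These trivial readers and writers are mostly useful as building blocks. The sketch below uses two hypothetical helpers, filled and discard, to show io::repeat capped with take, and io::sink as a destination for io::copy.

```rust
use std::io::{self, Read};

/// Produces `n` copies of `byte` using io::repeat
/// (hypothetical helper for illustration).
fn filled(byte: u8, n: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // repeat() never ends on its own, so cap it with take().
    io::repeat(byte).take(n).read_to_end(&mut buf)?;
    Ok(buf)
}

/// Copies a reader into io::sink and reports how many bytes were discarded.
fn discard<R: Read>(mut reader: R) -> io::Result<u64> {
    io::copy(&mut reader, &mut io::sink())
}
```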

Binary Data, Compression, and Serialization

Go here for some crate recommendations.

Files and Directories

OsStr and Path

Rust strings are always valid Unicode. Filenames are almost always Unicode in practice, but operating systems don't guarantee it.

To handle the filenames that aren't valid Unicode, Rust provides std::ffi::OsStr and std::ffi::OsString.

std::ffi::OsStr

OsStr is a string type that's a superset of UTF-8. Its sole purpose is to represent all filenames, CLI arguments, and environment variables on all systems, whether or not they're valid Unicode.

std::path::Path

Path is exactly like OsStr, but it provides a bunch of handy filename-related methods.

When to use which?

For absolute and relative paths, use Path. For an individual component of a path, use OsStr.


Owning types

For each string type, there's always a corresponding owning type that owns heap-allocated data.

String type | Owning type | Conversion method
--|--|--
str | String | .to_string()
OsStr | OsString | .to_os_string()
Path | PathBuf | .to_path_buf()

All three of these string types implement a common trait, AsRef<Path>, which makes it easy to declare a generic function that accepts "any filename type" as an argument.


#![allow(unused)]
fn main() {
use std::path::Path;
use std::io;

fn open_file<P>(path_arg: P)
    -> io::Result<()>
    where P: AsRef<Path>
{
    let path = path_arg.as_ref(); // now a plain &Path
    // ... use `path` here ...
    Ok(())
}
}
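As a concrete sketch of the "any filename type" idea, the hypothetical helper below accepts a &str, a String, or a PathBuf through the same AsRef<Path> bound.

```rust
use std::path::Path;

/// Returns the extension of any path-like argument as a String
/// (hypothetical helper for illustration).
fn extension_of<P: AsRef<Path>>(path_arg: P) -> Option<String> {
    let path = path_arg.as_ref();
    path.extension()
        // extension() yields an &OsStr, which may not be valid UTF-8.
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_string())
}
```

Calls such as extension_of("notes.txt"), extension_of(String::from("notes.txt")) and extension_of(PathBuf::from("notes.txt")) all compile against this one signature.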

Path and PathBuf Methods

Path Method Path::new(str)


#![allow(unused)]
fn main() {
fn new<S: AsRef<OsStr> + ?Sized>(s: &S) -> &Path
}

Converts a &str or &OsStr to a &Path. The string doesn't get copied; the new &Path points to the same bytes as the original argument.


Path Method path.parent()


#![allow(unused)]
fn main() {
fn parent(&self) -> Option<&Path>
}

Returns the path's parent directory, if any. The path doesn't get copied; the parent directory of path is always a substring of path.


Path Method path.file_name()


#![allow(unused)]
fn main() {
fn file_name(&self) -> Option<&OsStr>
}

Returns the last component of path, if any.


Path Methods path.is_absolute() and path.is_relative()


#![allow(unused)]
fn main() {
fn is_absolute(&self) -> bool
fn is_relative(&self) -> bool
}

Tells you whether the path is absolute or relative.


Path Method path1.join(path2)


#![allow(unused)]
fn main() {
fn join<P: AsRef<Path>>(&self, path: P) -> PathBuf
}

Joins two paths. If path2 is an absolute path, it just returns a copy of path2.

Use Case: The path join method can be used to turn any path into an absolute path.


#![allow(unused)]
fn main() {
let abs_path = std::env::current_dir()?.join(any_path);
}
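The parent and join methods combine well. A minimal sketch, using a hypothetical helper named sibling, that swaps the last component of a path:

```rust
use std::path::{Path, PathBuf};

/// Builds a sibling path: same parent directory, different file name
/// (hypothetical helper for illustration).
fn sibling(path: &Path, new_name: &str) -> Option<PathBuf> {
    // parent() borrows from `path`; join() allocates a new PathBuf.
    path.parent().map(|dir| dir.join(new_name))
}
```

The result is None when the path has no parent, for example the root directory.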

Path Method path.components()


#![allow(unused)]
fn main() {
fn components(&self) -> Components
}

Creates an iterator over the components of the given path, from left to right. The Item type of the iterator is std::path::Component, which is an enum:


#![allow(unused)]
fn main() {
pub enum Component<'a> {
    Prefix(PrefixComponent<'a>),
    RootDir,
    CurDir,
    ParentDir,
    Normal(&'a OsStr),
}
}
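Matching on Component is the typical way to consume this iterator. The sketch below keeps only the Normal components; normal_parts is a hypothetical name for illustration.

```rust
use std::path::{Component, Path};

/// Collects the Normal components of a path as Strings, skipping
/// the root, `.` and `..` (hypothetical helper for illustration).
fn normal_parts(path: &Path) -> Vec<String> {
    path.components()
        .filter_map(|c| match c {
            Component::Normal(name) => Some(name.to_string_lossy().into_owned()),
            _ => None,
        })
        .collect()
}
```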

Converting Paths to Strings

Path Method path.to_str()


#![allow(unused)]
fn main() {
fn to_str(&self) -> Option<&str>
}

If path isn't valid UTF-8, this method returns None.


Path Method path.to_string_lossy()


#![allow(unused)]
fn main() {
fn to_string_lossy(&self) -> Cow<str>
}

Basically the same as to_str, but it always returns a string regardless of whether or not the path is valid UTF-8. If the path isn't valid, each invalid byte sequence is replaced with the Unicode replacement character, �.

Path Method path.display()


#![allow(unused)]
fn main() {
fn display(&self) -> Display
}

Doesn't return a string. Instead, it returns a value that implements Display, so it can be used with the print! macro and friends.
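One way to choose between the conversions, sketched with a hypothetical helper named describe: try to_str first and fall back to the lossy version only when the path isn't valid UTF-8.

```rust
use std::path::Path;

/// Formats a path for a log message (hypothetical helper).
fn describe(path: &Path) -> String {
    match path.to_str() {
        // Valid UTF-8: borrow the text directly.
        Some(s) => format!("utf8:{}", s),
        // Otherwise fall back to the lossy conversion.
        None => format!("lossy:{}", path.to_string_lossy()),
    }
}
```

For plain printing, println!("{}", path.display()) avoids building an intermediate String.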

Filesystem Access Functions

Go here for some reference.

Reading Directories

To list the contents of a directory, use std::fs::read_dir, or the .read_dir() method of a Path:


#![allow(unused)]
fn main() {
use std::path::Path;

let path = Path::new("."); // example directory; substitute your own
for entry_result in path.read_dir()? {
    let entry = entry_result?;
    println!("{}", entry.file_name().to_string_lossy());
}
}

The std::fs::read_dir function has the following type signature (Result here is io::Result):


#![allow(unused)]
fn main() {
fn read_dir<P: AsRef<Path>>(path: P) -> Result<ReadDir>
}

ReadDir is an iterator over io::Result<DirEntry> values. A DirEntry is a struct with a few methods that have the following signatures:


#![allow(unused)]
fn main() {
struct DirEntry(_);

fn path(&self) -> PathBuf
fn metadata(&self) -> Result<Metadata>
fn file_type(&self) -> Result<FileType>
fn file_name(&self) -> OsString
}
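Putting read_dir and DirEntry together, here is a minimal sketch that lists the regular files in a directory. The helper file_names is hypothetical, and sorting is added only because read_dir makes no promise about ordering.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the sorted names of the regular files directly inside `dir`
/// (hypothetical helper for illustration).
fn file_names(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry_result in fs::read_dir(dir)? {
        let entry = entry_result?;
        // Skip subdirectories and other non-file entries.
        if entry.file_type()?.is_file() {
            names.push(entry.file_name().to_string_lossy().into_owned());
        }
    }
    // Sort for stable output across platforms.
    names.sort();
    Ok(names)
}
```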

Platform-Specific Features

The std::os module contains a bunch of platform-specific features, like symlink.

If you want code to compile on all platforms, with support for symbolic links on Unix, for instance, you must use #[cfg] in the program as well. In such cases, it's easiest to import symlink on Unix, while defining a symlink stub on other systems:


#![allow(unused)]
fn main() {
use std::io;
use std::path::Path;

#[cfg(unix)]
use std::os::unix::fs::symlink;

// Stub implementation of symlink for platforms that don't have it
#[cfg(not(unix))]
fn symlink<P: AsRef<Path>, Q: AsRef<Path>>(src: P, _dst: Q) -> std::io::Result<()> {
    Err(io::Error::new(
        io::ErrorKind::Other,
        format!("can't copy symbolic link {}", src.as_ref().display())
    ))
}
}

There's a prelude module that can be used to enable all Unix extensions at once:


#![allow(unused)]
fn main() {
use std::os::unix::prelude::*;
}

Networking

For low-level networking code, start with the std::net module.

Go here for networking crate recommendations.

Concurrency

Macros

Unsafe Code