Attributes
Any item in a Rust program can be decorated with attributes, which are Rust's catch-all syntax for writing miscellaneous instructions and advice to the compiler.
Tip: To attach an attribute to a whole crate, add it at the top of the main.rs or lib.rs file, before any items, and write #! instead of #:
    // src/lib.rs
    #![allow(non_camel_case_types)]
    pub struct weird_type_name { }
Tip: To include a module only when testing, use #[cfg(test)].
#! can also be used inside functions, structs, etc., but it's typically only used at the beginning of a file to attach an attribute to the whole module or crate.
Some attributes must use #! because they can only be applied to an entire module or crate. For example, #![feature] is used to turn on unstable features of the Rust language and libraries.
Conditional Compilation
Conditional compilation is configured using the #[cfg] attribute:
    #[cfg(target_os = "macos")]
    mod mac_stuff; // only included if the target is macOS
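As a rough sketch of how #[cfg] picks between alternatives, the example below defines the same function twice and lets the compiler keep exactly one. The function name platform_name is hypothetical, chosen only for illustration; cfg! is the expression form of the same check.

```rust
// `platform_name` is a made-up helper for this sketch.
// Exactly one of these two definitions survives compilation.
#[cfg(target_os = "macos")]
fn platform_name() -> &'static str {
    "macOS"
}

#[cfg(not(target_os = "macos"))]
fn platform_name() -> &'static str {
    "not macOS"
}

fn main() {
    println!("Compiled for: {}", platform_name());

    // `cfg!` evaluates the same kind of predicate to a bool at compile time.
    if cfg!(debug_assertions) {
        println!("This is a debug build");
    }
}
```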
Cargo
Behavior: By default, cargo build looks at the files in your src directory and figures out what to build. When it sees src/lib.rs, it knows that it needs to build a library.
Behavior: Cargo will automatically compile files inside src/bin when you run cargo build. The executables created from the files in src/bin can be run using cargo run --bin my_bin.
Useful Commands
| Command | Result |
|---|---|
| rustup update | Updates Rust |
| cargo new --bin <package_name> | Creates a new package |
| cargo package --list | List all files included in a package |
| cargo doc --no-deps --open | Create HTML documentation for your project; the output gets saved to target/doc |
Rust Tools
Automatic code formatting
$ rustup component add rustfmt
Installs both rustfmt, which formats Rust source files, and cargo-fmt, which runs rustfmt over the crates in a Cargo package. (On older toolchains the component was named rustfmt-preview.)
Run it with: $ cargo fmt
Automatic code fixing and version migrations
Run rustfix with: $ cargo fix
Automatic code improvements
$ rustup component add clippy
Installs clippy. (On older toolchains the component was named clippy-preview.)
Run it with: $ cargo clippy
Common Profile Settings
debug
Controls the -g option sent to rustc, which turns debug symbols on and off.
Possible values: true | false
Recommended Crates
Binary Data, Compression, and Serialization
byteorder
Offers traits that add methods to all readers and writers for binary input and output.
flate2
Provides adapter methods for reading and writing gzipped data.
serde
Used for serialization; it converts back and forth between Rust structs and bytes.
Networking
mio
Support for asynchronous input and output to create high-performance servers. It provides a simple event loop and asynchronous methods for reading, writing, connecting, and accepting connections. (basically an asynchronous copy of the whole networking API)
tokio
Wraps the mio event loop in a futures-based API.
reqwest
Provides a beautiful API for HTTP clients.
iron
Higher-level server framework with support for things like middleware traits.
websocket
Implements the WebSocket protocol.
Translations
&T
Immutable reference to a value of type T.
&[T]
Reference to a slice containing data of type T.
impl<T> Queue<T>
For any type T, here are some methods available on Queue.
fn say_hello(out: &mut dyn Write)
This function's parameter is a mutable reference to any value that implements the Write trait. (Older code writes the parameter type as &mut Write, without the dyn keyword.)
fn min<T: Ord>(value1: T, value2: T)
This function can be used with arguments of any type T that implements the Ord trait.
fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>)
This function can be used with an argument that is a vector reference of any type T, as long as T implements the Debug, Hash, and Eq traits.
impl<W: Write> WriteHtml for W
Here's an implementation of the WriteHtml trait for any type W that implements Write.
trait Creature: Visible {
Every type that implements Creature must also implement the Visible trait. Creature is a subtrait of (extends) Visible.
trait Iterator { type Item;
Item is an associated type of the Iterator trait. Any type that implements Iterator must specify the Item type.
impl Iterator for Args { type Item = String;
The implementation of Iterator for Args has an associated Item type of String.
fn dump<I>(iter: I) where I: Iterator<Item=String>
The type parameter I must be an iterator over String values.
trait Mul<RHS=Self> {
The type parameter RHS of this trait defaults to Self.
    pub trait Rng {
        fn next_u32(&mut self) -> u32;
    }
    pub trait Rand: Sized {
        fn rand<R: Rng>(rng: &mut R) -> Self;
    }
The Rand trait uses the Rng trait as a bound. Rand and Rng are buddy traits.
impl<T> Add for Complex<T> where T: Add<Output=T>
Overloads the + operator for values of Complex<T> types, where T must already implement the Add (+ operator) trait.
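To make that signature concrete, here is a minimal sketch of what such an impl might look like. The field names re and im are assumptions made for this example; the notes above only give the impl header.

```rust
use std::ops::Add;

// Field names `re` and `im` are assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Complex<T> {
    re: T,
    im: T,
}

// `+` works on Complex<T> whenever `+` already works on T and yields a T.
impl<T> Add for Complex<T>
where
    T: Add<Output = T>,
{
    type Output = Complex<T>;

    fn add(self, rhs: Self) -> Self::Output {
        Complex {
            re: self.re + rhs.re,
            im: self.im + rhs.im,
        }
    }
}

fn main() {
    let a = Complex { re: 1, im: 2 };
    let b = Complex { re: 3, im: 4 };
    // `a + b` is sugar for `Add::add(a, b)`.
    assert_eq!(a + b, Complex { re: 4, im: 6 });
}
```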
trait PartialEq<Rhs: ?Sized = Self>
This is a trait signature whose Rhs type parameter does not have to be a sized type. That means Rhs can be an unsized type like str or [T]. We'd say that Rhs is questionably sized.
    impl<T, E, C> FromIterator<Result<T, E>> for Result<C, E>
        where C: FromIterator<T>
    { ... }
If you can collect items of type T into a collection of type C (where C implements the FromIterator<T> trait), then you can collect items of type Result<T, E> into a single result of type Result<C, E>.
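A small sketch of that impl in action: collecting an iterator of Result<i32, ParseIntError> into a Result<Vec<i32>, ParseIntError>. The function name parse_all is hypothetical.

```rust
use std::num::ParseIntError;

// `parse_all` is a made-up helper: it parses every input, stopping at the
// first failure and returning that error instead of a vector.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    assert_eq!(parse_all(&["1", "2", "3"]), Ok(vec![1, 2, 3]));
    assert!(parse_all(&["1", "oops", "3"]).is_err());
}
```

Here T is i32, E is ParseIntError, and C is Vec<i32>.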
A lot of Rust code is ostensibly terse and clean-looking (especially relative to C++). This is great, but as a Rust newbie, the implicit assumptions made by the compiler can seem a bit blackbox-y, which can make things difficult to reason about.
Here, I'll try to jot down examples of the aforementioned implicit decisions made by the compiler by comparing idiomatic code with its fully expressed, verbose syntax.
Return Type Omission
A function declaration whose return type is omitted is shorthand for returning the unit type.
| The code | Under the hood |
|---|---|
| fn my_fn() { .. } | fn my_fn() -> () { .. } |
Automatic Dereferencing (The . Operator)
The . operator implicitly dereferences its left operand, if needed. e.g. for a reference variable named some_ref of type &T, where T has a field named x:
| The code | Under the hood |
|---|---|
| some_ref.x | (*some_ref).x |
Automatic Referencing (The . Operator)
The . operator also implicitly borrows a reference to its left operand, if needed for a method call.
    let mut x = vec![1993, 1963, 1991];
    let mut y = vec![1993, 1963, 1991];
    x.sort();
    (&mut y).sort();
    assert_eq!(x, y);
| The code | Under the hood |
|---|---|
| v.sort() | (&mut v).sort() |
Reference Traversal (The . Operator)
. will follow as many references as it takes to reach its target.
    struct Number { value: usize }
    let n = Number { value: 999 };
    let r: &Number = &n;
    let rr: &&Number = &r;
    let rrr: &&&Number = &rr;
    assert_eq!(rrr.value, (*(*(*rrr))).value);
| The code | Under the hood |
|---|---|
| rrr.value | (*(*(*rrr))).value |
Reference Traversal (Comparison Operators)
Rust's comparison operators can also "see through" references, as long as both operands have the same type.
    let x = 10;
    let y = 10;
    let rx = &x;
    let ry = &y;
    let rrx = &rx;
    let rry = &ry;
    assert!(rrx <= rry);
    assert!(*(*rrx) <= *(*rry));
| The code | Under the hood |
|---|---|
| rrx <= rry | *(*rrx) <= *(*rry) |
Single Reference Parameter (Omitting Lifetime Parameters)
When a function takes a single reference as an argument, and returns a single reference, Rust assumes that the two must have the same lifetime.
| The code | Under the hood |
|---|---|
| fn smallest(v: &[i32]) -> &i32 | fn smallest<'a>(v: &'a [i32]) -> &'a i32 |
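As a quick check that the two spellings mean the same thing, the sketch below writes the function both ways. The bodies and the _elided/_explicit suffixes are assumptions added for the example, since the notes only show the signatures.

```rust
// Elided form: the compiler fills in the lifetime.
fn smallest_elided(v: &[i32]) -> &i32 {
    v.iter().min().expect("slice must not be empty")
}

// Fully spelled-out form: the returned reference lives as long as the input.
fn smallest_explicit<'a>(v: &'a [i32]) -> &'a i32 {
    v.iter().min().expect("slice must not be empty")
}

fn main() {
    let data = vec![7, 3, 9];
    assert_eq!(smallest_elided(&data), smallest_explicit(&data));
    assert_eq!(*smallest_elided(&data), 3);
}
```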
No Return Reference (Omitting Lifetime Parameters)
When a function doesn't return any references, Rust doesn't need explicit lifetimes.
| The code | Under the hood |
|---|---|
| fn sum_r_xy(r: &i32, s: S) -> i32 | fn sum_r_xy<'a, 'b, 'c>(r: &'a i32, s: S<'b, 'c>) -> i32 |
Single Lifetime (Omitting Lifetime Parameters)
If there's only a single lifetime that appears among a function's parameters, Rust assumes any lifetimes in the return value must be that one.
| The code | Under the hood |
|---|---|
| fn first_third(point: &[i32; 3]) -> (&i32, &i32) | fn first_third<'a>(point: &'a [i32; 3]) -> (&'a i32, &'a i32) |
Accepting self by Reference (Omitting Lifetime Parameters)
If a function is a method in an impl block and takes its self parameter by reference, Rust assumes that self's lifetime is the one to give the method's return value.
| The code | Under the hood |
|---|---|
| fn find_by_prefix(&self, prefix: &str) -> Option<&String> | fn find_by_prefix<'a, 'b>(&'a self, prefix: &'b str) -> Option<&'a String> |
Tuple struct Constructors
When defining a tuple-like struct, Rust implicitly defines a function that acts as the type's constructor.
| The code | Under the hood |
|---|---|
| struct Bounds(usize, usize) | fn Bounds(x: usize, y: usize) -> Bounds |
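Because the constructor is an ordinary function, it can be stored in a function pointer or passed to iterator adapters. A minimal sketch (the Meters type is made up for the example):

```rust
#[derive(Debug, PartialEq)]
struct Bounds(usize, usize);

// `Meters` is a hypothetical single-field tuple struct for illustration.
#[derive(Debug, PartialEq)]
struct Meters(f64);

fn main() {
    // The name `Bounds` is usable as a plain function value.
    let make: fn(usize, usize) -> Bounds = Bounds;
    assert_eq!(make(1024, 768), Bounds(1024, 768));

    // Passing the constructor straight to `map`.
    let lengths: Vec<Meters> = vec![1.0, 2.5].into_iter().map(Meters).collect();
    assert_eq!(lengths, vec![Meters(1.0), Meters(2.5)]);
}
```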
Self
Inside impl blocks, Rust automatically creates a type alias called Self for the type with which the impl block is associated.
    impl<T> Queue<T> {
        pub fn new() -> Self { .. }
    }
| The code | Under the hood |
|---|---|
| pub fn new() -> Self { .. } | pub fn new() -> Queue<T> { .. } |
Overloaded Operators
Overloaded operators (even the basic ones) are implicitly calling a method specified by their corresponding generic trait.
| The code | Under the hood |
|---|---|
| x * y | Mul::mul(x, y) |
| x += y | x.add_assign(y) |
The ? operator
Under the hood, the ? operator calls From::from on the error value to convert it to the error type named in the function's return type. A common choice for that type is a boxed trait object, Box<dyn error::Error>, which is polymorphic: lots of different kinds of errors can be returned from the same function, because they all implement the error::Error trait.
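The sketch below puts a ? call next to a hand-written approximation of what it expands to. The function names are hypothetical, and the match is only a simplified model of the real desugaring.

```rust
use std::error::Error;

// Uses `?`: a ParseIntError is converted into Box<dyn Error> on the way out.
fn parse_and_double(s: &str) -> Result<i32, Box<dyn Error>> {
    let n: i32 = s.trim().parse()?;
    Ok(n * 2)
}

// Roughly the same thing written by hand (a simplified model of `?`).
fn parse_and_double_by_hand(s: &str) -> Result<i32, Box<dyn Error>> {
    let n = match s.trim().parse::<i32>() {
        Ok(value) => value,
        Err(e) => return Err(From::from(e)),
    };
    Ok(n * 2)
}

fn main() {
    assert_eq!(parse_and_double("21").unwrap(), 42);
    assert_eq!(parse_and_double_by_hand(" 21 ").unwrap(), 42);
    assert!(parse_and_double("twenty-one").is_err());
}
```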
async Lifetimes
async functions have different lifetime rules if one of their arguments is a reference or is otherwise non-'static.
This function:
    async fn foo(x: &u8) -> u8 { *x }
Is this under the hood:
    fn foo<'a>(x: &'a u8) -> impl Future<Output = u8> + 'a {
        async move { *x }
    }
Pinning
A lot happens under the hood when using async. Let's look at this incomplete code, where we run two futures in sequence:
    let fut_one = /* ... */;
    let fut_two = /* ... */;
    async move {
        fut_one.await;
        fut_two.await;
    }
Under the hood, Rust creates an anonymous type representing the async { } block and its combined possible states:
    struct AnonAsyncFuture {
        fut_one: FutOne,
        fut_two: FutTwo,
        state: State,
    }
    enum State {
        AwaitingFutOne,
        AwaitingFutTwo,
        Done,
    }
Then it implements Future for the anonymous type and provides a poll method:
    impl Future for AnonAsyncFuture {
        type Output = ();
        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
            loop {
                match self.state {
                    State::AwaitingFutOne => match self.fut_one.poll(..) {
                        Poll::Ready(()) => self.state = State::AwaitingFutTwo,
                        Poll::Pending => return Poll::Pending,
                    }
                    State::AwaitingFutTwo => match self.fut_two.poll(..) {
                        Poll::Ready(()) => self.state = State::Done,
                        Poll::Pending => return Poll::Pending,
                    }
                    State::Done => return Poll::Ready(()),
                }
            }
        }
    }
When poll is first called, it'll call fut_one's poll. If it's still pending, it'll return. Future calls to poll will pick up where the previous poll left off, based on state.
Rust Trivia
What are the 3 types used to represent a sequence of values, and what are their generic type annotations?
- Array [T; N]
- Vector Vec<T>
- Slice &[T]
How do you check the current size and capacity of any sequential-type value?
.len() works on all three. .capacity() exists only on growable types such as Vec<T>, since arrays and slices have a fixed length.
What makes a String and a &str unique? What are their effective underlying types?
- A String is just a Vec<u8> with the guarantee that the data is well-formed UTF-8.
- A &str is just a slice &[u8] of well-formed UTF-8, such as part of a String.
Given let x = y;, under what condition would it be true that y did not become uninitialized?
If the type of y implements the Copy trait.
Given let s = "hello!";, what is the type of s?
s is a &'static str whose pointer refers to preallocated, read-only memory that is part of the compiled program (not the stack or the heap).
How do you get the size of any data type?
std::mem::size_of::<T>();
What's a fat pointer?
A fat pointer is a pointer to a slice (a region of an array or vector). It's a two-word value on the stack comprised of:
- A pointer to the slice's first element
- The number of elements in the slice
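The two-word layout can be checked with std::mem::size_of. This is a small sketch; the variable names are arbitrary.

```rust
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // A reference to a sized value is a single machine word...
    assert_eq!(size_of::<&i32>(), word);
    // ...while slice references carry a pointer plus a length.
    assert_eq!(size_of::<&[i32]>(), 2 * word);
    assert_eq!(size_of::<&str>(), 2 * word);

    let v = vec![10, 20, 30, 40];
    let s = &v[1..3];
    // The pointer half points at the first element of the slice,
    // and the length half counts its elements.
    assert_eq!(s.as_ptr(), &v[1] as *const i32);
    assert_eq!(s.len(), 2);
}
```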
What's the difference between Arc and Rc types?
Arc (atomic reference count) is safe to share between threads directly, whereas an Rc uses faster non-thread-safe code to update its reference count.
When defining a type (struct), when is it required that a field's lifetime be specified?
Lifetimes must be specified when a field is a reference type. e.g.
    struct RefPoint<'a, 'b> { x: &'a f64, y: &'b f64, }
What risk is sometimes posed by using reference count types?
If two Rc values point to each other, they will keep each other's ref count above zero and neither will be freed. This is called a reference cycle.
Given two reference variables x and y, how do I check to see if they point to the same value in memory?
std::ptr::eq(x, y)
What are the two ways for closures to get data from enclosing scopes?
- Moves
- Borrowing
What are the three variants of IntoIterator implementations?
- Shared reference
- Mutable reference
- By value
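A minimal sketch of the three variants on a Vec<String>; the function name demo is made up, and each for loop picks a different IntoIterator impl based on how the vector is passed.

```rust
// `demo` is a hypothetical helper returning what each loop observed.
fn demo() -> (usize, Vec<String>) {
    let mut words = vec![String::from("ab"), String::from("cde")];

    // Shared reference: `&Vec<T>` yields `&T`.
    let mut total_len = 0;
    for w in &words {
        total_len += w.len();
    }

    // Mutable reference: `&mut Vec<T>` yields `&mut T`.
    for w in &mut words {
        w.push('!');
    }

    // By value: `Vec<T>` yields `T` and consumes the vector.
    let mut owned = Vec::new();
    for w in words {
        owned.push(w);
    }

    (total_len, owned)
}

fn main() {
    let (total_len, owned) = demo();
    assert_eq!(total_len, 5);
    assert_eq!(owned, vec!["ab!".to_string(), "cde!".to_string()]);
}
```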
When should you use either Path or OsStr?
For absolute and relative paths, use Path. For an individual component of a path, use OsStr.
How are bool values stored in memory, and why?
bool values are stored as a whole byte so that pointers to them may be created.
What mechanism should you reach for to allow for shared ownership of a value?
Rc<T> or Arc<T> (if sharing across multiple threads)
What mechanism allows us to mutate the value inside of an Rc<T>? What about an Arc<T>?
For an Rc<T>, interior mutability can be facilitated by a RefCell<T>. For an Arc<T>, you'd reach for a Mutex<T>.
What special ability does a Pin'd object have?
Pinned (i.e. immovable) objects can have pointers to their own fields. e.g.
    struct MyFuture {
        a: i32,
        ptr_to_a: *const i32, // I point to my own `a`
    }
When would you use the ArcWake trait (from the futures crate)?
Use ArcWake when you need an easy way to construct a Waker.
What is the actual return type of this function?
    async fn get_five() -> u8 { 5 }
Returns a value of type impl Future<Output = u8>.
How does an async function behave in terms of lifetimes if one of its arguments is a reference or non-'static value?
Unlike regular functions, async functions whose parameters are references or non-'static return a Future which is bounded by the lifetime of the arguments. Meaning, the future returned from an async fn must be .awaited while its non-'static arguments are still valid.
What's the workaround for async functions' non-static lifetime rules?
An async function's Future return value can be made 'static by moving the non-static (or reference) values into an async block:
    fn work_around() -> impl Future<Output = u8> {
        async {
            let x = 5;
            borrow_x(&x).await
        }
    }
What's the formal fancy term to describe Rust's form of polymorphism?
What's the pattern used as a way to get around the orphan rule?
The newtype pattern, which involves creating a new type in a tuple struct.
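A minimal sketch of the pattern, assuming we want Display for Vec<String> (neither the trait nor the type is defined in our crate, so the orphan rule forbids a direct impl). The name Wrapper and the bracketed output format are choices made for this example.

```rust
use std::fmt;

// A local tuple struct wrapping the foreign type.
struct Wrapper(Vec<String>);

// Allowed: `Wrapper` is local to this crate, even though Display is not.
impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);
    assert_eq!(w.to_string(), "[hello, world]");
}
```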
I've implemented a newtype, Wrapper, that wraps a Vec<T>, but now I can't use the Vec<T>'s built-in methods! What can I do?
Implement the Deref trait for Wrapper, which would allow us to treat Wrapper exactly like a Vec.
Something about how passing by value cedes all ownership of a value. Use drop as point of reference.
How do I write an impl T function that consumes T (it will no longer be usable by others) and converts it to U?
In the function's signature, you'd have self passed by value, which will consume it. (Usually, impl functions receive &self.)
How do I get the address of a value (say, a String)?
    let txt = String::from("hello world");
    let txt_stack = &txt as *const _; // Address of the String value on the stack
    let txt_heap = &txt.as_bytes()[0] as *const _; // Address of first character in heap
    dbg!((txt_stack, txt_heap));
TODO: Go here and add stuff about the use of phantom/types/data
Patterns
| Pattern type | Example | Notes |
|---|---|---|
| Literal | 100, "name" | Matches an exact value; the name of a const is also allowed |
| Range | 0 ..= 100, 'a' ..= 'z' | Matches any value in range, including the end value (older code spells this with ...) |
| Wildcard | _ | Matches any value and ignores it |
| Variable | name, mut count | Like _ but moves or copies the value into a new local variable |
| ref variable | ref field, ref mut field | Borrows a reference to the matched value instead of moving or copying it |
| Binding with subpattern | val @ 0 ..= 99, ref circle @ Shape::Circle { .. } | Matches the pattern to the right of @, using the variable name to the left |
| Enum pattern | Some(value), None, Pet::Orca | |
| Tuple pattern | (key, value), (r, g, b) | |
| Struct pattern | Color(r, g, b), Point { x, y }, Card { suit: Clubs, range: n }, Account { id, name, .. } | |
| Reference | &value, &(k, v) | Matches only reference values |
| Multiple patterns | 'a' \| 'A' | In match only (not valid in let, etc.) |
| Guard expressions | x if x * x <= r2 | In match only (not valid in let, etc.) |
Operator Overloading
Unary Operators
| Trait | Operator | Equivalent |
|---|---|---|
| std::ops::Neg | -x | x.neg() |
| std::ops::Not | !x | x.not() |
Arithmetic Operators
| Trait | Operator | Equivalent |
|---|---|---|
| std::ops::Add | x + y | x.add(y) |
| std::ops::Sub | x - y | x.sub(y) |
| std::ops::Mul | x * y | x.mul(y) |
| std::ops::Div | x / y | x.div(y) |
| std::ops::Rem | x % y | x.rem(y) |
| std::ops::AddAssign | x += y | x.add_assign(y) |
| std::ops::SubAssign | x -= y | x.sub_assign(y) |
| std::ops::MulAssign | x *= y | x.mul_assign(y) |
| std::ops::DivAssign | x /= y | x.div_assign(y) |
| std::ops::RemAssign | x %= y | x.rem_assign(y) |
Bitwise Operators
| Trait | Operator | Equivalent |
|---|---|---|
| std::ops::BitAnd | x & y | x.bitand(y) |
| std::ops::BitOr | x \| y | x.bitor(y) |
| std::ops::BitXor | x ^ y | x.bitxor(y) |
| std::ops::Shl | x << y | x.shl(y) |
| std::ops::Shr | x >> y | x.shr(y) |
| std::ops::BitAndAssign | x &= y | x.bitand_assign(y) |
| std::ops::BitOrAssign | x \|= y | x.bitor_assign(y) |
| std::ops::BitXorAssign | x ^= y | x.bitxor_assign(y) |
| std::ops::ShlAssign | x <<= y | x.shl_assign(y) |
| std::ops::ShrAssign | x >>= y | x.shr_assign(y) |
Comparison Operators
| Trait | Operator | Equivalent |
|---|---|---|
| std::cmp::PartialEq | x == y | x.eq(&y) |
| std::cmp::PartialEq | x != y | x.ne(&y) |
| std::cmp::PartialOrd | x < y | x.lt(&y) |
| std::cmp::PartialOrd | x > y | x.gt(&y) |
| std::cmp::PartialOrd | x <= y | x.le(&y) |
| std::cmp::PartialOrd | x >= y | x.ge(&y) |
Indexing Operators
| Trait | Operator | Equivalent |
|---|---|---|
| std::ops::Index | x[y] | *x.index(y) |
| std::ops::Index | &x[y] | x.index(y) |
| std::ops::IndexMut | &mut x[y] | x.index_mut(y) |
Utility Traits
| Trait | Description |
|---|---|
| Drop | Destructors. Cleanup code that Rust runs automatically whenever a value is dropped. |
| Sized | Marker trait for types with a fixed size known at compile time, as opposed to types (such as slices) that are dynamically sized. |
| Clone | Types that support cloning values. |
| Copy | Marker trait for types that can be cloned simply by making a byte-for-byte copy of the memory containing the value. |
| Deref, DerefMut | Traits for smart pointer types. |
| Default | Types that have a sensible "default value". |
| AsRef, AsMut | Conversion traits for borrowing one type of reference from another. |
| Borrow, BorrowMut | Conversion traits like AsRef and AsMut that additionally guarantee consistent hashing, ordering, and equality. |
| From, Into | Conversion traits for transforming one type of value into another. |
| ToOwned | Conversion trait for converting a reference to an owned value. |
Common Standard Library Iterators
Free Functions
| Expression | Notes |
|---|---|
| std::iter::empty() | Returns None immediately. |
| std::iter::once(5) | Produces the given value, and then ends. |
| std::iter::repeat("#9") | Produces the given value forever. |
std::ops::Range
| Expression | Notes |
|---|---|
| 1..10 | Endpoints must be an integer type to be iterable. Range includes start value, and excludes end value. |
std::ops::RangeFrom
| Expression | Notes |
|---|---|
| 1.. | Unbounded iteration. Start must be an integer. May panic or overflow if the value reaches the limit of the type. |
Option<T>
| Expression | Notes |
|---|---|
| Some(10).iter() | Behaves like a vector whose length is either 0 (None) or 1 (Some(v)). |
Result<T, E>
| Expression | Notes |
|---|---|
| Ok("blah").iter() | Similar to Option, producing Ok values. |
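A short sketch of the Option and Result rows: both behave like collections of zero or one items. The variable names are arbitrary.

```rust
fn main() {
    let some = Some(10);
    let none: Option<i32> = None;
    assert_eq!(some.iter().count(), 1);
    assert_eq!(none.iter().count(), 0);

    let ok: Result<&str, String> = Ok("blah");
    let err: Result<&str, String> = Err(String::from("nope"));
    // Only the Ok value is produced; an Err yields nothing.
    assert_eq!(ok.iter().next(), Some(&"blah"));
    assert_eq!(err.iter().count(), 0);

    // Because Option is iterable, `flatten` drops the None entries.
    let present: Vec<i32> = vec![Some(1), None, Some(3)].into_iter().flatten().collect();
    assert_eq!(present, vec![1, 3]);
}
```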
Vec<T> and &[T]
Expression Notes `` TODO
String and &str
Expression Notes `` TODO
std::collections::{HashMap, BTreeMap}
Expression Notes `` TODO
std::collections::{HashSet, BTreeSet}
Expression Notes `` TODO
std::sync::mpsc::Receiver
Expression Notes `` TODO
std::io::Read
Expression Notes `` TODO
std::io::BufRead
Expression Notes `` TODO
std::fs::ReadDir
| Expression | Notes |
|---|---|
| std::fs::read_dir(path) | Produces directory entries. |
std::net::TcpListener
| Expression | Notes |
|---|---|
| listener.incoming() | Produces incoming network connections. |
Filesystem Access Functions
The following are some of the functions in std::fs and their approximate Unix equivalents. All of these functions return io::Result values. All of these functions call out directly to the operating system.
Creating and deleting
| Unix | Function | Returns |
|---|---|---|
| mkdir | create_dir(path) | Result<()> |
| mkdir -p | create_dir_all(path) | Result<()> |
| rmdir | remove_dir(path) | Result<()> |
| rm -r | remove_dir_all(path) | Result<()> |
| unlink | remove_file(path) | Result<()> |
Copying, moving, and linking
| Unix | Function | Returns |
|---|---|---|
| cp -p | copy(src_path, dest_path) | Result<u64> |
| rename | rename(src_path, dest_path) | Result<()> |
| link | hard_link(src_path, dest_path) | Result<()> |
Inspecting
| Unix | Function | Returns |
|---|---|---|
| realpath | canonicalize(path) | Result<PathBuf> |
| stat | metadata(path) | Result<Metadata> |
| lstat | symlink_metadata(path) | Result<Metadata> |
| ls | read_dir(path) | Result<ReadDir> |
| readlink | read_link(path) | Result<PathBuf> |
Permissions
| Unix | Function | Returns |
|---|---|---|
| chmod | set_permissions(path, perm) | Result<()> |
Random Musings and Stuff I Should Remember
Important Crates
syn - parsing library
Usually used for writing procedural macros, typically in conjunction with quote and proc-macro.
tokio
Plus all of these crates that fall under the umbrella of tokio:
hyper: A fast and correct HTTP/1.1 and HTTP/2 implementation for Rust.
tonic: A gRPC over HTTP/2 implementation focused on high performance, interoperability, and flexibility.
warp: A super-easy, composable, web server framework for warp speeds.
tower: A library of modular and reusable components for building robust networking clients and servers.
tracing(formerly tokio-trace): A framework for application-level tracing and async-aware diagnostics.
rdbc: A Rust database connectivity library for MySQL, Postgres and SQLite.
mio: A low-level, cross-platform abstraction over OS I/O APIs that powers tokio.
bytes: Utilities for working with bytes, including efficient byte buffers.
loom: A testing tool for concurrent Rust code
Other Thoughts
= does not mean assignment!
Stop thinking of let x = y; as "assign the value of y to x". Rather, think of it as "Move the value in memory owned by y to x (thereby giving x ownership)".
Simple Operations and How-Tos
How do I...
use a range as a match subpattern?
    let age = 27;
    match age {
        0 => println!("I'm not born yet I guess"),
        n @ 1..=12 => println!("I'm a child of age {:?}", n),
        n @ 13..=19 => println!("I'm a teen of age {:?}", n),
        n => println!("I'm an old person of age {:?}", n),
    }
convert a number to a string?
The std::string::ToString trait is automatically implemented for any type which implements the Display trait. This includes all machine (including number) types.
    let i = 5;
    let five = i.to_string();
    assert_eq!(five, "5");
convert a string to a number?
    let num = "10".parse::<i32>().unwrap();
    assert_eq!(10, num);
combine two strings?
    fn greet(name: &str) {
        function_with_str_arg(&format!("Hello, {}!", name));
    }
create a buffered reader from a File or other unbuffered type that implements Read?
To create a buffered reader for a File, do this:
    BufReader::new(reader)
If you need to also set the size of the buffer, use this instead:
    BufReader::with_capacity(size, reader)
create a buffered writer from a File or other unbuffered type that implements Write?
To create a buffered writer for a File, do this:
    BufWriter::new(file)
If you need to also set the size of the buffer, use this instead:
    BufWriter::with_capacity(size, writer)
convert an iterator over Result<T> into an iterator over T?
Assume we're reading lines from a reader and want to collect the lines into a vector of strings. We can do so like this, which will create a value of type Vec<T>:
    let lines = reader.lines().collect::<io::Result<Vec<String>>>()?;
define a generic function whose argument is any filename type?
All three string types implement a common trait, AsRef<Path>, which makes it easy to declare a generic function that accepts "any filename type":
    use std::path::Path;
    use std::io;
    fn open_file<P>(path_arg: P) -> io::Result<()>
        where P: AsRef<Path>
    {
        let path = path_arg.as_ref();
        // ...
    }
list the contents of a directory?
    for entry_result in path.read_dir()? {
        let entry = entry_result?;
        println!("{}", entry.file_name().to_string_lossy());
    }
Using type parameter... as runtime code?
    pub trait DeviceCommunicationManagerCreator: Send {
        fn new(sender: Sender<DeviceCommunicationEvent>) -> Self;
    }
    fn add_comm_manager<T>(&self) -> Result<(), ButtplugServerStartupError>
    where
        T: 'static + DeviceCommunicationManager + DeviceCommunicationManagerCreator,
    {
        let mgr = T::new(self.sender.clone());
        ...
    }
Notes taken from the OG of Rust learning resources.
Chapter 16: Fearless Concurrency
Using Threads to Run Code Simultaneously
Waiting for all threads to finish
Spawning a thread returns a JoinHandle, which is an owned value that exposes the join method. When called, join blocks the calling (current) thread until the handled thread completes.
Using move Closures with threads
A move closure allows data to be used from one thread in another thread. It moves ownership of the data used to the thread's environment.
Using Message Passing to Transfer Data Between Threads
Rust's major abstraction for accomplishing message-sending concurrency is the channel.
A channel has two halves: a transmitter and a receiver. Rust's implementation allows for multiple producers and a single receiver/consumer, hence mpsc.
A channel is said to be closed if either half (sender or receiver) of a channel is dropped.
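A small sketch of what "closed" looks like from each side, using only std::sync::mpsc. It runs on a single thread to keep the behaviour easy to follow.

```rust
use std::sync::mpsc;

fn main() {
    // Receiver dropped first: sending now fails.
    let (tx, rx) = mpsc::channel::<i32>();
    drop(rx);
    assert!(tx.send(1).is_err());

    // Sender dropped first: values already sent are still delivered,
    // and only then does `recv` report that the channel is closed.
    let (tx, rx) = mpsc::channel::<i32>();
    tx.send(7).unwrap();
    drop(tx);
    assert_eq!(rx.recv(), Ok(7));
    assert!(rx.recv().is_err());
}
```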
In the below code, we'll spawn a new thread that says hello to the main thread:
    use std::thread;
    use std::sync::mpsc;
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let value = String::from("hello");
        tx.send(value).unwrap();
    });
    let received = rx.recv().unwrap();
    println!("Transmitter said {}", received);
Some things to note from the example:
- tx.send() returns a Result because it's possible that the receiving end has already been dropped.
- tx.send(value) steals ownership! value will no longer be usable.
- Once the sender thread has finished and its sender has been dropped, calling recv() in the main thread will return an Err result, which indicates that no more values will be coming from the sender.
Tip: Instead of recv, you can also use try_recv, which does not block the receiving thread.
Sending Multiple Values (Proof of Concurrency!)
Tweaking the example above, the sender will send multiple values:
    use std::thread;
    use std::sync::mpsc;
    use std::time::Duration;
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let vals = vec![
            String::from("hello"),
            String::from("from"),
            String::from("the"),
            String::from("thread"),
        ];
        for val in vals {
            tx.send(val).unwrap();
            thread::sleep(Duration::from_millis(100));
        }
    });
    // A receiver is an Iterator!
    for recvd in rx {
        println!("Sender said {}", recvd);
    }
Sending Multiple Values from Multiple Transmitters
Senders can be cloned such that we can listen to messages from multiple threads:
    use std::thread;
    use std::sync::mpsc;
    use std::time::Duration;
    let (tx1, rx) = mpsc::channel();
    let tx2 = mpsc::Sender::clone(&tx1);
    thread::spawn(move || {
        let vals = vec![
            String::from("Thread 1: hello"),
            String::from("Thread 1: from"),
            String::from("Thread 1: thread"),
            String::from("Thread 1: UNO"),
        ];
        for val in vals {
            tx1.send(val).unwrap();
            thread::sleep(Duration::from_millis(100));
        }
    });
    thread::spawn(move || {
        let vals = vec![
            String::from("Thread 2: hello"),
            String::from("Thread 2: from"),
            String::from("Thread 2: thread"),
            String::from("Thread 2: DOS"),
        ];
        for val in vals {
            tx2.send(val).unwrap();
            thread::sleep(Duration::from_millis(50));
        }
    });
    // A receiver is an Iterator!
    for recvd in rx {
        println!("Sender said {}", recvd);
    }
Shared-State Concurrency
Above we saw concurrency via message communication. Now we'll look at concurrency via shared memory.
Using Mutexes to Allow Access to Data from One Thread at a Time
Concept: A mutex is a mechanism that allows only one thread to access data at any given time. To access data, a thread has to signal that it wants the mutex's lock. When it's done, it has to give the lock back to allow other threads to access the data.
Warning: A mutex cannot protect you from deadlocks! A deadlock occurs when an operation needs to lock two resources and two threads have each acquired one of the locks, causing them to wait for each other forever.
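One common way to sidestep that particular deadlock is to have every thread acquire the locks in the same order. The following is a minimal sketch of that idea, not something the notes above prescribe; the function name transfer_all and the amounts are made up.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// `transfer_all` is a hypothetical helper: two threads each move 10 units
// from `a` to `b`, always locking `a` before `b`.
fn transfer_all() -> (i32, i32) {
    let a = Arc::new(Mutex::new(100));
    let b = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let (a, b) = (Arc::clone(&a), Arc::clone(&b));
            thread::spawn(move || {
                // Same order in every thread, so no thread can hold `b`
                // while waiting for `a`.
                let mut from = a.lock().unwrap();
                let mut to = b.lock().unwrap();
                *from -= 10;
                *to += 10;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let final_a = *a.lock().unwrap();
    let final_b = *b.lock().unwrap();
    (final_a, final_b)
}

fn main() {
    assert_eq!(transfer_all(), (80, 20));
}
```

If one thread locked a then b while the other locked b then a, each could end up holding one lock and waiting forever for the other.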
In the below example, we'll use a super simple mutex and comment the different aspects of its use:
    use std::sync::Mutex;
    let m = Mutex::new(5);
    // We'll wrap this in an inner scope so that the lock will be dropped,
    // allowing others to use it
    {
        let mut val = m
            // Get the lock. NOTE: This method blocks!
            .lock()
            // In Rust, the Result returned by lock contains the actual data,
            // wrapped in a MutexGuard
            .unwrap();
        // The data itself is a smart pointer!
        *val += 1;
    }
    println!("m = {:?}", m);
Concept: A call to lock() will fail if another thread panicked while holding the lock. When a mutex is in such a state, it's said that the mutex is poisoned.
Sharing a Mutex<T> Between Multiple Threads
In this example, we share a number behind a mutex among 10 threads. Each thread will increment the number.
    use std::sync::{Mutex, Arc};
    use std::thread;
    fn main() {
        let counter = Arc::new(Mutex::new(0));
        let mut handles = vec![];
        for i in 0..10 {
            let counter = Arc::clone(&counter);
            handles.push(thread::spawn(move || {
                println!("Handle {} is running!", i);
                let mut val = counter.lock().unwrap();
                *val += 1;
            }));
        }
        handles.into_iter().for_each(|h| h.join().unwrap());
        println!("Result: {}", counter.lock().unwrap());
    }
Notes taken from the book "Microservices with Rust".
3 - Logging & Configuring Microservices
Almost the entire ecosystem of logging in Rust is based on the log crate.
Hint: For logging something that requires an otherwise-expensive operation, wrap it using the log_enabled!(<LogLevel>) macro.
Come back to this chapter: page 54
Come back to this chapter. Especially as a reference!
Go look at the workspace member /fun-with-futures.
5 - Understanding Asynchronous Operations with Futures
Warning: The book uses the term reactor, which is now referred to as an executor in the modern futures crates.
Pattern Reactor + Promises: A reactor allows a developer to run multiple activities in the same thread, while a promise represents a delayed result that will be available later. A reactor keeps a set of promises and continues to poll each one until it is completed and the result is returned.
The Basic Types of futures
- Future
- Stream
- Sink
Background Tasks and Thread Pools in Microservices
(skipping earlier sections of the chapter)
Actix
The main types & traits of actix:
- System Type: Maintains the actors system. Must be created before any other actors are spawned. System is itself an actor.
- Actor Trait: Anything that implements Actor can be spawned.
- Arbiter Type: An Arbiter is an event loop controller. Can only have one per thread.
- Context Type: Every Actor works in a Context, which, to the runtime, represents an actor's environment. Can be used to spawn other tasks.
- Address Type: Every spawned Actor has an Address, which can be used to send things to and from a targeted actor.
- Message Trait: Types that implement Message can be sent through the send method of an Address. Message has an associated type, Result, which is the type of value that will be returned after the message is processed.
- Handler Trait: Implemented on Actors and enables/facilitates the actor's message-handling functionality.
11 - Involving Concurrency with Actors and the Actix Crate
Notes taken from the official Rust async book.
Why Async?
Pros
Asynchronous code allows concurrent operations on the same thread. Multi-threaded code requires a lot of overhead and resources, even with minimal implementations.
Cons
Threads are natively supported and managed by the operating system, whereas async code is a language-specific implementation. Using async code also involves more complexity.
async/.await Primer
async transforms a block of code into a state machine that implements the Future trait. A blocked Future will yield control of the thread.
Concept: Async code can only run via the use of an executor. Invoking an async function will do nothing if its Future is not given to an executor like block_on.
block_onis the simplest executor. Others have more complex behavior, like scheduling multiple futures. Note it does block the current thread.use futures::executor::block_on; async fn hello_async() { println!("Hello, async!"); } fn main() { let future = hello_async(); // this will do nothing but return the Future block_on(future); }Under the Hood
The
FutureTraitA
Futurerepresents an asynchronous computation that can produce a value, with thepollfunction being at the heart of its mechanics. Thepollfunction drives the future as far towards completion as possible.A simplified version might look like this:
#![allow(unused)] fn main() { trait SimpleFuture { type Output; fn poll(&mut self, wake: fn()) -> Poll<Self::Output>; } enum Poll<T> { Ready(T), // Returned when the SimpleFuture has completed Pending, // Otherwise this } }If
poll returns Pending, it arranges for the wake function to be called when the Future is ready to make more progress. When wake is called, the executor driving the Future will call poll again so that the Future can make more progress.
The purpose of the wake callback is to tell the executor when a future can make progress. Without it, the executor would have to be constantly polling.
The real
Futuretrait is slightly different:#![allow(unused)] fn main() { trait Future { type Output; fn poll( self: Pin<&mut Self>, // Stuck here forever cx: &mut Context<'_>, ) -> Poll<Self::Output>; } }There are two key differences:
- The future is
Pin'd.- The
wakefunction pointer is nowContext. Using just a function pointer as before means we couldn't tell an executor which Future calledwake. Context fixes that by providing access to aWaker, which wakes a specific task.Pinned objects can store pointers to their own fields.
Task Wakeups with Waker
It's the role of a Waker to tell an executor that its future is ready to make more progress, via the wake function that it provides.
When wake is called, the task executor knows to poll the future again at the next available opportunity.
Wakers implement Clone and can be copied around and stored.
Build a Timer
To get started, we'll need these imports:
#![allow(unused)] fn main() { use std::{ future::Future, pin::Pin, sync::{Arc, Mutex}, task::{Context, Poll, Waker}, thread, time::Duration, }; }We start by just defining the future type, which needs a way for the thread to communicate that the timer has elapsed and the future should complete, for which we'll use a shared
Arc<Mutex<..>>.#![allow(unused)] fn main() { pub struct TimerFuture { shared_state: Arc<Mutex<SharedState>>, // ^ the arc + mutex enables communication between thread and future } // This is the state shared by the waiting thread and future struct SharedState { // Whether the sleep time has elapsed completed: bool, // This is the waker for the task that `TimerFuture` is running on. // The thread can use this after setting `completed = true` to tell // `TimerFuture`'s task to wake up and move forward. waker: Option<Waker>, } }Now the implementation:
#![allow(unused)] fn main() { impl Future for TimerFuture { type Output = (); fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> { // Check the state to see if we've already completed let mut shared_state = self.shared_state.lock().unwrap(); if shared_state.completed { Poll::Ready(()) } else { // Set waker so that the thread can wake up the current task when the // timer has completed, ensuring that the future is polled again and sees // that completed is true. // // We have to set the waker of shared_state on each poll because TimerFuture // can move between tasks on the executor // TODO: Figure out what that sentence actually means shared_state.waker = Some(cx.waker().clone()); Poll::Pending } } } }And finally the API for constructing a
TimerFutureand starting the thread:#![allow(unused)] fn main() { // Now we actually implement the timer thread impl TimerFuture { pub fn new(duration: Duration) -> Self { let shared_state = Arc::new(Mutex::new(SharedState { completed: false, waker: None, })); // Spawn it let thread_shared_state = shared_state.clone(); thread::spawn(move || { thread::sleep(duration); let mut shared_state = thread_shared_state.lock().unwrap(); // Signal that the timer has finished and wake up the last task // on which the future was polled, if there is one. // Remember, the `shared_state.waker` is being set inside the `poll` // function of `TimerFuture`. shared_state.completed = true; if let Some(waker) = shared_state.waker.take() { waker.wake() } }); TimerFuture { shared_state } } } }Figure out what "However, the
TimerFuturecan move between tasks on the executor, which could cause a stale waker pointing to the wrong task" actually means.Applied: Build an Executor
Concept: A future executor takes a set of top-level Futures and runs them to completion by calling their poll functions whenever the future is able to make progress.
Term: A task is just a future that can reschedule itself, usually paired with a sender so that it can requeue itself in the executor.
The process looks a bit like this:
- Tasks that need to be run are sent to the executor over a channel
- The executor will poll its futures once to get things started
- A task will then call wake(), which schedules itself to be polled again by putting itself back onto the channel
- The executor picks the woken-up future off the queue, and poll is called again
In this process, the executor itself only needs the receiving end of the task channel. The user of the executor will get a sending end so that new futures can be spawned.
Let's create an executor for our timer. We'll need to use the
ArcWaketrait, which provides an easy way to construct aWaker. These are the imports we'll need, in addition to those we used with the timer future implementation section:#![allow(unused)] fn main() { use { futures::{ future::{BoxFuture}, task::{waker_ref, ArcWake}, }, std::sync::mpsc::{sync_channel, Receiver, SyncSender}, }; }The executor will work by sending tasks to run over a channel. It'll pull events off of the channel and run them.
#![allow(unused)] fn main() { /// Task executor that receives tasks from a channel and runs them struct Executor { ready_queue: Receiver<Arc<Task>>, } /// This spawns new futures onto the task channel #[derive(Clone)] struct Spawner { task_sender: SyncSender<Arc<Task>>, } /// A Task is a future that can reschedule itself to be polled by an Executor struct Task { // Contains and in-progress future that needs to be pushed to completion. // The `Mutex` is here to prove to Rust that this is thread-safe. future: Mutex<Option<BoxFuture<'static, ()>>>, // Handle to place the task itself back onto the task queue. task_sender: SyncSender<Arc<Task>>, } fn new_executor_and_spawner() -> (Executor, Spawner) { let (task_sender, ready_queue) = sync_channel(10_000); (Executor { ready_queue }, Spawner { task_sender }) } }Let's also create add a method to
Spawnerthat makes it easy to spawn new futures.#![allow(unused)] fn main() { impl Spawner { fn spawn(&self, future: impl Future<Output = ()> + 'static + Send) { let task = Arc::new(Task { future: Mutex::new(Some(Box::pin(future))), task_sender: self.task_sender.clone(), }); self .task_sender .send(task) .expect("Too many tasks are queued!"); } } }Now we need to implement a
Waker(usingArcWake) for ourTask, which will be responsible for scheduling a task to be polled again after wake is called.Remember:
Wakers have to specify which task has become ready.#![allow(unused)] fn main() { impl ArcWake for Task { fn wake_by_ref(arc_self: &Arc<Self>) { // Implement `wake` by sending this task back onto the task channel // so that it'll be polled again by the executor. let cloned = arc_self.clone(); arc_self .task_sender .send(cloned) .expect("Too many tasks are queued!"); } } }So now, when we create a waker, calling
wake on it will send a copy of the Arc into the task channel.
Last step is to tell our
Executorhow to pick up the task and poll it.#![allow(unused)] fn main() { impl Executor { fn run(&self) { while let Ok(task) = self.ready_queue.recv() { // Take the future, and if it has not completed yet (is still Some), // poll it in an attempt to complete it. let mut future_slot = task.future.lock().unwrap(); if let Some(mut future) = future_slot.take() { // Create a `LocalWaker` from the task itself let waker = waker_ref(&task); let context = &mut Context::from_waker(&*waker); if let Poll::Pending = future.as_mut().poll(context) { // This future isn't done yet, so put it back in its task to be run again later *future_slot = Some(future); } } } } } }FINALLY, we can run it:
fn main() { let (executor, spawner) = new_executor_and_spawner(); // Spawn a task to print before and after waiting on a timer spawner.spawn(async { println!("Wait for itttt...."); TimerFuture::new(Duration::new(2, 0)).await; println!("NOW!"); }); // Drop the spawner so that our executor knows it is finished // and won't receive anymore tasks to run. drop(spawner); // Run the executor until the task queue is empty executor.run(); }
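async/.await
There are only two ways to use async:
- async functions
- async blocks
Both return a value that implements the Future trait. The following two functions both return a Future whose Output is u8:
    use std::future::Future;
    // An async fn returns an anonymous Future
    async fn get5() -> u8 { 5 }
    // The same thing written with an async block (renamed so both can coexist)
    fn get5_desugared() -> impl Future<Output = u8> { async { 5 } }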
async Lifetimes
Unlike regular functions, async functions whose parameters are references or otherwise non-'static return a Future which is bounded by the lifetime of the arguments. Meaning, the future returned from an async fn must be .awaited while its non-'static arguments are still valid.
async move
An async move block works just like a move closure: it takes ownership of the variables it captures.
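The lifetime rule and async move can be sketched without any external crate. This is a minimal illustration, not the book's code: block_on here is a hypothetical std-only stand-in for futures::executor::block_on, and ThreadWaker, borrowed_len and owned_len are names invented for the example.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker that unparks the thread running `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal stand-in for `futures::executor::block_on` (an assumption of this sketch).
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            // Sleep until the waker unparks this thread.
            Poll::Pending => thread::park(),
        }
    }
}

/// The returned future borrows `text`, so it must be awaited while `text` is still alive.
async fn borrowed_len(text: &str) -> usize {
    text.len()
}

/// `async move` takes ownership of `text`, so the returned future is `'static`.
fn owned_len(text: String) -> impl Future<Output = usize> + 'static {
    async move { text.len() }
}

fn main() {
    let owned = String::from("hello");
    // `owned` outlives the future that borrows it, so this is fine.
    println!("{}", block_on(borrowed_len(&owned)));
    // Ownership moves into the future; `owned` cannot be used afterwards.
    println!("{}", block_on(owned_len(owned)));
}
```

Moving the data into the future with async move is the usual way to get a 'static future when the borrowed form is too restrictive.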
.awaiting on a Multithreaded Executor
On a multithreaded executor, a Future may move between threads, so any value used in an async body must be of a type that is also able to travel between threads (i.e. the type must implement Send).
Pinning
Why Pinning
Pin works in tandem with its BFF, Unpin.
Concept: Pinning makes it possible to guarantee that an object which is !Unpin (does not implement Unpin) won't ever be moved.
See the Pinning block of the hidden code page for a deeper explanation.
Streams
The
StreamTraitThe
Streamtrait is basically the love-child ofFutureandIterator:#![allow(unused)] fn main() { trait Stream { // Yielded type type Item; // Attempt to resolve the next item in the stream. fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>; } }Iteration and Concurrency
Book Notes
These pages correspond to the chapters of notes I've taken while reading this book. So far it's been an excellent resource and I'd recommend it to anyone learning Rust.
Basic Types
Integer Types
u8 is used to represent single-byte values.
Characters are distinct from the numeric types (unlike C++); a char is neither a u8, nor an i8.
Values used as array access indices must be usize. The same applies to values that represent the size of arrays or vectors.
Integer literals can take a suffix indicating their type. The suffix can optionally be separated by an underscore. eg:
- 42u8 is a u8 value
- 1729isize and 1729_isize are both isize
Compiler behavior: When inferring a numeric type, the compiler will tend to favor inferring i32.
The following prefixes can be used with numeric literals to specify their radix:
- 0x hexadecimal
- 0o octal
- 0b binary
Long numeric literals may be segmented by underscores for readability, eg: 4_295_923_000_010 or 0xffff_0f0f.
Rust provides byte literals, which are character-like literals for u8 values: b'X' represents the ASCII code for the character X, but as a u8 value.
You can convert from one integer type to another using the as (type-cast) operator: 65535_u16 as i32
Floating-Point Types
The fraction part of a floating-point literal may consist of a lone decimal point: 5. is a valid float constant.
Compiler behavior: Given a floating-point number, the compiler will infer a type of f64 by default.
The bool Type
bool values can be converted to integer types using the as operator:
    assert_eq!(false as i32, 0);
    assert_eq!(true as i32, 1);
But the inverse is not true. The as operator can't convert numeric types to bool. You have to be more explicit by using a comparison: x != 0
Rust uses an entire byte for a bool value in memory, so you can create a pointer to it.
Characters
Rust's character type char represents a single Unicode character, as a 32-bit value.
A char represents a single character in isolation, whereas strings and streams of text use UTF-8 encoded bytes. This means the String type represents a sequence of UTF-8 bytes, not chars.
A char literal is just a single Unicode character wrapped in single quotes, e.g. '©'.
The as operator can be used to convert a char to an integer type (i32, u16, etc), but in the opposite direction it only works for u8. For a u32, use std::char::from_u32, which returns an Option<char>.
Tuples
Tuple elements cannot be accessed using dynamic indices. That is to say, given tuple t, I can't use variable i to access the ith element.
Term: The type () is called the unit type.
Rust uses the unit type where there's no meaningful value to carry, but the context still requires a type. e.g. a function that returns no value has a return type of ().
Shorthand: A function declaration whose return type is omitted is shorthand for returning the unit type. e.g. fn my_fn(); is shorthand for fn my_fn() -> ();.
Trailing commas are acceptable in tuples. They're acceptable pretty much anywhere in Rust.
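The literal, cast and char rules from the Basic Types notes above can be checked with a handful of assertions. This is a small sketch; code_to_char is a hypothetical helper name that just wraps std::char::from_u32.

```rust
/// Converts a u32 code point to a char, if it is a valid Unicode scalar value.
/// (`code_to_char` is a name made up for this example.)
fn code_to_char(code: u32) -> Option<char> {
    std::char::from_u32(code)
}

fn main() {
    // Type suffixes, with or without an underscore
    assert_eq!(42u8, 42_u8);
    // Radix prefixes
    assert_eq!(0xff, 255);
    assert_eq!(0o17, 15);
    assert_eq!(0b1010, 10);
    // Byte literal: the ASCII code of 'A' as a u8
    assert_eq!(b'A', 65u8);

    // `as` casts between integers, from bool, and from char
    assert_eq!(65535_u16 as i32, 65535);
    assert_eq!(true as i32, 1);
    assert_eq!('A' as u32, 65);
    // Only u8 can be cast to char with `as`
    assert_eq!(97u8 as char, 'a');
    // Other integers go through a checked conversion
    assert_eq!(code_to_char(0x00A9), Some('©'));
    assert_eq!(code_to_char(0xD800), None); // surrogates are not valid chars

    // Tuple elements are accessed with constant indices only
    let t = (1, "two", 3.0);
    assert_eq!(t.1, "two");
}
```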
Pointer Types
Pointers in Rust are much more performant and memory-efficient than they are in GCed languages.
References
& is the shared (immutable) reference operator. It creates the reference.
&mut is the mutable reference operator.
* is the dereference operator. It accesses the value being referred to.
The type &T is pronounced "ref T", meaning "reference to a value of type T".
The expression &x creates a reference to value x. In words, we'd say that it "borrows a reference to x".
The expression *x (given that x is of type &T) refers to the value that x is a reference to.
References are immutable by default. For a reference to be mutable, it must have type &mut T.
References in Rust can never be null. There are no null pointer exceptions.
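A short sketch of the three operators in use. The function names increment and double are made up for the example.

```rust
/// Adds one to the referent through a mutable reference.
fn increment(value: &mut i32) {
    *value += 1;
}

/// Reads the referent through a shared reference.
fn double(value: &i32) -> i32 {
    *value * 2
}

fn main() {
    let mut x = 10;

    let r = &x; // borrow a shared reference to x
    assert_eq!(*r, 10); // dereference to read the value
    assert_eq!(double(r), 20);

    // The shared borrow `r` is no longer used, so a mutable borrow is allowed.
    increment(&mut x);
    assert_eq!(x, 11);
}
```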
Boxes
A Box is an owning pointer to a value allocated on the heap.
When a Box is created, enough memory is allocated on the heap to contain its value:
    let v = vec![1, 2, 3, 4];
    let b = Box::new(v); // allocates space on the heap to hold v
When a Box goes out of scope, both the Box itself and the value it points to on the heap are freed.
Raw Pointers
Raw pointers can only be dereferenced in unsafe code.
Arrays, Vectors, and Slices
Rust has 3 types for representing a sequence of values.
| Name | Type | Description | Size | Memory |
|---|---|---|---|---|
| Array | [T; N] | Array of N values, each of type T | Fixed | Stack |
| Vector | Vec<T> | Vector of Ts | Dynamic | Heap |
| Slice | &[T] | Shared slice of Ts | Fixed | Stack (as pointer to heap value) |
Given any of the above types as value v, the expression v.len() gives the number of elements in v, and v[i] refers to the i'th element of v. i must be of type usize; no other integer types will work as an index.
Arrays
An array's length is built into its type and is fixed at compile time.
Implicit behavior: When working with an array value and accessing its methods, Rust implicitly converts a reference to an array to a slice. So if you need to know the methods for an array, go look at the methods for slices.
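As an illustration of that implicit conversion, the methods called on the array below are all defined on slices. The helper name sorted is made up for this sketch.

```rust
/// Returns a sorted copy of a fixed-size array.
/// `sort` is a slice method; the array is borrowed as a slice implicitly.
fn sorted(mut values: [i32; 4]) -> [i32; 4] {
    values.sort();
    values
}

fn main() {
    let a = [3, 1, 4, 1];

    // All of these are slice methods, reached through the implicit conversion.
    assert_eq!(a.len(), 4);
    assert!(a.contains(&4));
    assert_eq!(a.iter().sum::<i32>(), 9);

    // [i32; 4] is Copy, so `a` is still usable after being passed by value.
    assert_eq!(sorted(a), [1, 1, 3, 4]);
    assert_eq!(a, [3, 1, 4, 1]);
}
```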
Vectors
A vector is allocated on the heap.
There are 5 main ways to create a vector:
- Use the
vec!macro (simplest)- Build a vector by repeating a given value a certain number of times using a syntax that imitates array literals:
#![allow(unused)] fn main() { let rows = 100; let cols = 100; let pixel_buffer = vec![0; rows * cols]; println!("Buffer is {} bytes long.", pixel_buffer.len()) }
- Using
Vec::newto create a new, empty vector, and pushing elements onto it.#![allow(unused)] fn main() { let mut v = Vec::new(); v.push("hello"); v.push("vector"); println!("{:?}", v); println!("capacity: {}", v.capacity()); }
- Iterators produce vectors when executed (using their
.collect()method):#![allow(unused)] fn main() { let v: Vec<i32> = (1..4).collect(); assert_eq!(v, [1, 2, 3]); }
- If you know the size of the vector in advance, you can use Vec::with_capacity to create the vector, instead of new:
    let mut v = Vec::with_capacity(2); // room for two elements up front
    v.push("hello");
    v.push("vector");
    println!("{:?}", v);
Using Vec::with_capacity instead of Vec::new is more performant because it can prevent costly heap reallocations when a vector grows beyond its current capacity.
A vector's capacity() method returns the number of elements the vector could hold without reallocation.
    // Track the length and capacity of a vector as values are added to it
    let mut v: Vec<i32> = Vec::with_capacity(2);
    println!("length/capacity: {}/{}", v.len(), v.capacity());
    v.push(1);
    v.push(2);
    println!("length/capacity: {}/{}", v.len(), v.capacity());
    v.push(3);
    println!("length/capacity: {}/{}", v.len(), v.capacity());
As with arrays, slice methods can be used on vectors.
In stack memory, a Vec<T> consists of three values:
| Stack cell | Stack cell | Stack cell |
|---|---|---|
| Pointer to heap-allocated buffer | The capacity of the buffer | The current occupied size of the buffer |
Inserting and removing elements anywhere but at the end of a vector is expensive.
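The three-word layout and the cost difference between end and non-end operations can be sketched as follows. vec_handle_words is a hypothetical helper for this example.

```rust
use std::mem::size_of;

/// Size of a `Vec<T>` handle (pointer, capacity, length) measured in machine words.
fn vec_handle_words<T>() -> usize {
    size_of::<Vec<T>>() / size_of::<usize>()
}

fn main() {
    // The handle is three words regardless of the element type.
    assert_eq!(vec_handle_words::<u8>(), 3);
    assert_eq!(vec_handle_words::<[u64; 16]>(), 3);

    let mut v = vec![1, 2, 3, 4];
    v.insert(0, 0); // has to shift all four existing elements
    v.push(5); // appends at the end, nothing is shifted
    assert_eq!(v, [0, 1, 2, 3, 4, 5]);

    assert_eq!(v.remove(0), 0); // shifts everything after index 0
    assert_eq!(v.pop(), Some(5)); // removes from the end, nothing is shifted
    assert_eq!(v, [1, 2, 3, 4]);
}
```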
Slices
A slice, written [T] (without specifying the length), is a region of an array or vector.
Since a slice can be any length, slices can't be stored directly in variables or passed as function arguments; they are always passed by reference.
A reference to a slice is a fat pointer.
Term: A fat pointer is a two-word value on the stack comprised of
- A pointer to the slice's first element
- The number of elements in the slice
Whereas an ordinary reference is a non-owning pointer to a single value, a reference to a slice is a non-owning pointer to several values.
A slice is (maybe?) a pseudo-generic for any sequential data type.
You can get a reference to a slice of an array, vector, or another slice by indexing it with a range:
    let v: Vec<f64> = vec![1., 2., 3.];
    let first_two: &[f64] = &v[0..2]; // borrows elements 0 and 1
    println!("{:?}", first_two);
The term slice is often used for reference types like &[T] or &str, but that's just shorthand. Those types are called references to slices.
String Types
String Literals
String literals are enclosed in double quotes.
Term: Rust offers raw strings (written r"..." or r#"..."#), in which backslashes are not treated as escapes and line breaks and whitespace are included verbatim. They're similar to template strings in JavaScript.
#![allow(unused)] fn main() { let paragraph = r#" I'm just a regular paragraph with the appropriate spacing. "#; println!("{}", paragraph); }Byte Strings
A string literal with the b prefix is a byte string. A byte string is a slice of u8 values (rather than Unicode text).
Strings in Memory
Rust strings are stored in memory using UTF-8 (not as arrays of chars).
A String is stored on the heap as a resizable buffer of UTF-8 text. You can think of a String as a Vec<u8> that is guaranteed to hold well-formed UTF-8.
Pronunciation: A &str is called a "stir" or "string slice".
A &str is a reference to a sequence of UTF-8 text owned by someone else.
A &str is a reference to a slice, so it is a fat pointer. You can think of a &str as being nothing more than a &[u8] that is guaranteed to hold well-formed UTF-8.
A string literal is a &str that refers to preallocated text stored in read-only memory.
Any string type's length (returned by .len()) is measured in bytes, not characters.
It is impossible to modify a
&str:#![allow(unused)] fn main() { let mut s = "hello"; s[0] = 'c'; // &strs cannot be mutably indexed }
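Two of the points above, lengths measured in bytes and &str being read-only, can be sketched like this. lengths is a hypothetical helper; building a new String with replacen is just one way to get a changed copy.

```rust
/// Returns (length in bytes, length in chars) for a string slice.
fn lengths(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    // ASCII text: one byte per character
    assert_eq!(lengths("hello"), (5, 5));
    // '©' is U+00A9, which takes two bytes in UTF-8
    assert_eq!(lengths("©"), (2, 1));

    // A &str cannot be modified in place; build a new String instead.
    let s = "hello";
    let changed: String = s.replacen('h', "c", 1);
    assert_eq!(changed, "cello");
    assert_eq!(s, "hello"); // the original text is untouched
}
```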
String
Ways to create a String:
- Given a &str, the .to_string() method will copy it into a String.
- The format!() macro works just like println!(), except that it returns a new String instead of writing text to stdout, and it doesn't automatically add a newline at the end.
- Arrays, slices, and vectors of strings have two methods that form a new String from many strings: .concat() and .join(sep)
    let elves = vec!["snap", "crackle", "pop"];
    println!("{:?}", elves.concat());
    println!("{:?}", elves.join(", "));
A &str can refer to either a string literal or a String, so it's the most appropriate for function arguments when the caller should be allowed to pass either kind of string.
Unlike other languages, Rust strings are strictly Unicode only. This means that they're not always the appropriate choice for string-like data. Here is what to use in each situation:
| When you have | Use |
|---|---|
| Unicode text | String or &str |
| Filename | std::path::PathBuf and &Path |
| Binary data | Vec<u8> and &[u8] |
| Environment variables | OsString and &OsStr |
| Strings from a FFI | std::ffi::CString and &CStr |
Ownership
In Rust, every value has a single owner that determines its lifetime. When the owner is freed (aka dropped), the value it owns is dropped too.
A variable owns its value. When control leaves the block in which the variable is declared, the variable is dropped, and its value is dropped along with it.
Owners and their owned values form trees. Every value in a Rust program is a member of some tree, rooted in some variable.
In its purest form, Rust's ownership model is too rigid to be usable. But the language provides several mechanisms to make it work:
- Values can be moved from one owner to another
- The std library provides reference-counted pointer types (Rc and Arc), which allow a value to have multiple owners, with some restrictions
- References can be borrowed from values; references are non-owning pointers with limited lifetimes
Moves
For values of most types, operations like assignment to variables, passing to functions, or returning from functions don't copy the value: they move it.
In Rust, assignments of most types move the value from the source to the destination, leaving the source uninitialized.
    let name_1 = vec!["alex", "eden"];
    let name_2 = name_1; // moves name_1's heap memory to name_2
    println!("{:?}", name_1); // fails - name_1 has become uninitialized
For the above to work, we have to explicitly ask for copies of the values using .clone(), which is built into most types.
    let name_1 = vec!["alex", "eden"];
    let name_2 = name_1.clone();
    println!("{:?}", name_1);
More Operations that Move
If you move a value into a variable (via assignment) that was already initialized, Rust drops the variable's prior value.
Passing arguments to functions moves ownership to the function's parameters; returning a value from a function moves ownership to the caller.
Building a tuple moves ownership of the values into the tuple structure itself. The same applies to other complex types.
Keep in mind that a transfer of ownership does not imply a change in the owned heap storage. Moves apply to the value proper, which for types like vectors and strings is the three-word header stored on the stack that represents the variable.
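One way to see that a move only transfers the three-word header is to compare the heap buffer's address before and after. This is a sketch; take_and_locate is a hypothetical helper name.

```rust
/// Takes ownership of the vector, records where its heap buffer lives,
/// and hands ownership back to the caller.
fn take_and_locate(v: Vec<i32>) -> (*const i32, Vec<i32>) {
    let address = v.as_ptr();
    (address, v)
}

fn main() {
    let original = vec![1, 2, 3];
    let before = original.as_ptr();

    // `original` is moved into the function, then moved back out in a tuple.
    let (during, returned) = take_and_locate(original);
    let after = returned.as_ptr();

    // The heap buffer never moved; only the stack header changed hands.
    assert_eq!(before, during);
    assert_eq!(before, after);
    assert_eq!(returned, [1, 2, 3]);
}
```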
Moves and Control Flow
As a general principle, if it's possible for a variable to have had its value moved away, and it hasn't definitely been given a new value since, it's considered uninitialized.
Moves and Indexed Content
Copy Types: The Exception to Moves
In general, most types are moved. The exceptions are types that implement the Copy trait. In these cases, the value is copied, rather than moved. This applies to all kinds of moves, including passing Copy types to functions and constructors.
The standard Copy types include all the machine integer and floating-point numeric types, the char and bool types, and a few others. A tuple or fixed-size array of Copy types is itself a Copy type.
As a rule of thumb, any type that needs to do something special when a value is dropped cannot be Copy. Vectors, files, mutexes, etc. cannot be Copy.
By default, struct and enum types are not Copy, but they can be, if their fields are themselves Copy.
To make a type Copy, add the attribute #[derive(Copy, Clone)] above its definition.
In Rust, every move is a byte-for-byte, shallow copy that leaves the source uninitialized. Copies are the same, except that the source remains initialized.
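A minimal sketch of the difference between a Copy struct and a move-only one. Point, Label and shift_right are names invented for the example.

```rust
/// All fields are Copy, so the struct can derive Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Contains a String, so it can be Clone but not Copy.
#[derive(Debug, Clone, PartialEq)]
struct Label {
    text: String,
}

/// Takes a Point by value; the caller's Point is copied, not moved.
fn shift_right(mut p: Point) -> Point {
    p.x += 1;
    p
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = shift_right(a);
    // `a` is still initialized because Point is Copy.
    assert_eq!(a, Point { x: 1, y: 2 });
    assert_eq!(b, Point { x: 2, y: 2 });

    let first = Label { text: String::from("origin") };
    let second = first.clone(); // explicit deep copy
    let moved = first; // move: `first` is uninitialized from here on
    assert_eq!(moved, second);
}
```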
Rc and Arc: Shared Ownership
Arc stands for atomic reference count. Rc stands for reference count.
The difference between Arc and Rc is that an Arc is safe to share between threads directly, whereas an Rc uses faster non-thread-safe code to update its reference count.
If you don't need to share pointers between threads, use Rc, rather than suffer the performance penalty of using Arc.
For any type T, an Rc<T> value is a pointer to a heap-allocated T that has had a reference count affixed to it. Cloning an Rc<T> value does not copy the T; rather, it creates another pointer to it and increments the reference count.
A value owned by an Rc pointer is immutable.
The main risk with using Rc pointers to manage memory is that if there are ever two Rc values that point to each other, each will keep the other's reference count always above 0, and neither will ever be freed. This is called a reference cycle.
[Figure: Example of a reference cycle. Two Rc allocations, each with a reference count of 1, each holding a pointer to the other.]
Because a value owned by an Rc is immutable, a cycle can only be built with interior mutability (for example a RefCell inside the Rc). The usual workaround for avoiding reference cycles is to make one of the links a weak pointer, std::rc::Weak, which does not keep its referent alive.
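A sketch of a parent/child link that avoids a cycle by holding the parent through a weak pointer. Node, new_node and adopt are hypothetical names, and the RefCell fields are what makes the links mutable after construction.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Tree node: owns its children strongly, refers to its parent weakly.
struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}

fn new_node(name: &str) -> Rc<Node> {
    Rc::new(Node {
        name: name.to_string(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    })
}

/// Links `child` under `parent` without creating a strong cycle.
fn adopt(parent: &Rc<Node>, child: &Rc<Node>) {
    *child.parent.borrow_mut() = Rc::downgrade(parent);
    parent.children.borrow_mut().push(Rc::clone(child));
}

fn main() {
    let parent = new_node("parent");
    let child = new_node("child");
    adopt(&parent, &child);

    // The child's back-pointer is weak, so it does not count as an owner.
    assert_eq!(Rc::strong_count(&parent), 1);
    assert_eq!(Rc::weak_count(&parent), 1);
    // The child has two owners: the local variable and the parent's list.
    assert_eq!(Rc::strong_count(&child), 2);

    // A weak pointer must be upgraded before use; this fails once the parent is gone.
    let seen = child.parent.borrow().upgrade().map(|p| p.name.clone());
    assert_eq!(seen.as_deref(), Some("parent"));
}
```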
References
Pointers can be categorized into two types:
- Owning
- Nonowning
With owning pointers (Box<T>s, Vecs, Strings, etc), when the owner is dropped, the referent goes with it.
Nonowning pointers on the other hand have no effect on their referents' lifetimes.
Terminology: Nonowning pointer types are called references.
References must never outlive their referents. Rust refers to creating a reference to some value as borrowing the value: what gets borrowed must eventually be returned to the owner.
References let you access values without affecting their ownership.
There are two kinds of references:
- Shared: &T
- Mutable: &mut T
A shared reference lets you read but not modify its referent. There is no limit to the number of shared references that can refer to the same value. Shared references are Copy.
A mutable reference lets you read and modify its referent. But if a value is the referent of a mutable reference, you may not have any other references of any sort to the value active at the same time. Mutable references are not Copy.
The distinction between the two can be thought of as a "multiple readers vs. single writer" rule.
When one or more shared references to a value exist, not even its owner can modify it. The value is locked.
When a mutable reference to a value exists, only the reference itself may be used to access it; not even its owner can.
Concept: When a value is passed to a function in a way that moves ownership of the value to the function, we say that it's passed by value. When a function is passed a reference to a value, we say that it's passed by reference.
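The three ways of handing a value to a function can be put side by side. This is a sketch; total_len, shout and consume are made-up names.

```rust
/// Pass by shared reference: may read, may not modify.
fn total_len(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

/// Pass by mutable reference: may read and modify, with exclusive access.
fn shout(words: &mut Vec<String>) {
    for w in words.iter_mut() {
        *w = w.to_uppercase();
    }
}

/// Pass by value: ownership moves into the function.
fn consume(words: Vec<String>) -> usize {
    words.len()
}

fn main() {
    let mut words = vec![String::from("snap"), String::from("pop")];

    // Any number of shared references may coexist (multiple readers).
    let r1 = &words;
    let r2 = &words;
    assert_eq!(total_len(r1) + total_len(r2), 14);

    // The shared borrows have ended, so one mutable borrow is allowed (single writer).
    shout(&mut words);
    assert_eq!(words, ["SNAP", "POP"]);

    // After this call `words` has been moved and can no longer be used.
    assert_eq!(consume(words), 2);
}
```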
References as Values
Rust References vs. C++ References
Implicit behavior: Since references are so widely used in Rust, the . operator implicitly dereferences its left operand, if needed.
Shorthand: Provided the above implicit behavior, for a reference named some_ref of type &T, where T has a field named x, the following two expressions are equivalent:
- some_ref.x
- (*some_ref).x
Implicit behavior: The . operator will also implicitly borrow a reference to its left operand, if needed for a method call.
Shorthand: Provided the above implicit behavior, given a mutable value named v of type Vec<u64>, the following two calls to Vec's sort method are equivalent:
- v.sort()
- (&mut v).sort()
Assigning References
Assigning to a Rust reference makes it point at a new value:
#![allow(unused)] fn main() { let x = 10; let y = 20; let mut r = &x; println!("r equals {}", *r); r = &y; // assign to r println!("r equals {}", *r); }References to References
Rust allows references to references, and the
.operator follows as many references it needs to find the target value:#![allow(unused)] fn main() { struct Point { x: usize, y: usize } let point = Point { x: 1000, y: 750 }; let r: &Point = &point; let rr: &&Point = &r; let rrr: &&&Point = &rr; println!("x equals {}", rrr.x); }Comparing References
Much like the . operator, Rust's comparison operators will also "see through" any number of references as are necessary, as long as both operands have the same type.
If you actually want to know whether two references point to the same address in memory, use std::ptr::eq, which compares the references as addresses.
    let x = 10;
    let y = 10;
    let rx = &x;
    let ry = &y;
    let rrx = &rx;
    let rry = &ry;
    println!("rrx and rry are equal? {}", rrx == rry);
    println!("addresses are equal? {}", std::ptr::eq(rrx, rry));
References are Never Null
In Rust, if you need a value that is either a reference to something or not, use the type Option<&T>.
At the machine level, Rust represents None as a null pointer, and Some(r), where r is a &T value, as the nonzero address.
Borrowing References to Arbitrary Expressions
References to Slices and Trait Objects
Term: A fat pointer is a two-word (2 * usize) value on the stack that carries the address of its referent, along with some further information necessary to put the value to use.
There are two kinds of fat pointers:
- Slice references
- Trait objects
A reference to a slice is a fat pointer:
- 1st word: The starting address of the slice
- 2nd word: The slice's length
Term: A trait object is a fat pointer referencing a value that implements a certain trait. A trait object carries:
- 1st word: A value's address
- 2nd word: A pointer to the trait's implementation appropriate to the pointed-to value for invoking the trait's methods
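The two-word layout can be observed with size_of. This is a sketch that reflects how current compilers lay out these pointers; pointer_words is a hypothetical helper.

```rust
use std::fmt::Display;
use std::mem::size_of;

/// Size of a pointer-like type `P`, measured in machine words.
fn pointer_words<P>() -> usize {
    size_of::<P>() / size_of::<usize>()
}

fn main() {
    // An ordinary reference is a single word: just the address.
    assert_eq!(pointer_words::<&i32>(), 1);

    // Slice references carry an address plus a length.
    assert_eq!(pointer_words::<&[i32]>(), 2);
    assert_eq!(pointer_words::<&str>(), 2);

    // Trait objects carry an address plus a pointer to the trait implementation.
    assert_eq!(pointer_words::<&dyn Display>(), 2);
    assert_eq!(pointer_words::<Box<dyn Display>>(), 2);

    // Method calls on a trait object are dispatched through that second word.
    let shown: &dyn Display = &42;
    assert_eq!(shown.to_string(), "42");
}
```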
Reference Safety
The following sections pertain to Rust's reference rules and how it foils any attempt to break them.
Borrowing a Local Variable
Rust tries to assign each reference type in your program a lifetime that meets the constraints imposed by how it's used.
Term: A lifetime is some stretch of a program for which a reference could be safe to use; eg: a lexical block, a statement, an expression, the scope of some variable, etc.
Lifetimes are figments of Rust's imagination; they only exist as part of the compilation process and have no runtime representation.
Receiving References as Parameters
Term: A static is Rust's equivalent of a global variable (global in lifetime, not visibility). It's a value that's created when the program starts and lasts until the program terminates.
Some rules for statics (there are more):
- Every static must be initialized at the time of declaration
- Mutable statics are not thread-safe and may only be accessed within an unsafe {} block
Syntax: The following code is a general syntax for specifying a function parameter's lifetime:
    fn f<'a>(p: &'a i32) { ... }
Here, we'd say that the lifetime 'a is a lifetime parameter of f. We can read <'a> as "for any lifetime 'a", so in the above declaration, we're defining f as a function that takes a reference to an i32 with any given lifetime 'a.
Passing References as Arguments
You only need to worry about lifetime parameters when defining functions and types; when using them, Rust infers the lifetimes for you.
Returning References
Implicit behavior: When a function takes a single reference as an argument, and returns a single reference, Rust assumes that the two must have the same lifetime. This means that the following two signatures are equivalent:
    fn smallest<'a>(v: &'a [i32]) -> &'a i32 { ... }
    fn smallest(v: &[i32]) -> &i32 { ... }
Structs Containing References
Whenever a reference type appears inside another type's definition, you must write out its lifetime.
Given the above statement, we know that the following will fail to compile:
struct S { r: &i32 } // error: missing lifetime specifier
let x = 10;
let s = S { r: &x };
println!("{}", s.r);
The fix here is to provide the lifetime parameter of r in the definition of S:
struct S<'a> { r: &'a i32 }
let x = 10;
let s = S { r: &x };
println!("{}", s.r);
A type's lifetime parameters always reveal whether it contains references with interesting (that is, non-'static) lifetimes, and what those lifetimes can be.
Distinct Lifetime Parameters
When defining types or functions that hold or receive multiple references, a distinct lifetime parameter should be defined for each.
// Types
struct S<'a, 'b> { x: &'a i32, y: &'b i32 }
// Functions
fn f<'a, 'b>(x: &'a i32, y: &'b i32) -> &'a i32 { x }
Omitting Lifetime Parameters
Shorthand: If your function doesn't return any references (or other types that require lifetime parameters), then you never need to write out lifetimes for the parameters.
struct S<'a, 'b> { x: &'a i32, y: &'b i32 }
fn sum_r_xy(r: &i32, s: S) -> i32 { r + s.x + s.y }
The above is shorthand for:
fn sum_r_xy<'a, 'b, 'c>(r: &'a i32, s: S<'b, 'c>) -> i32 { ... }
Shorthand: If there's only a single lifetime that appears among your function's parameters, then Rust assumes any lifetimes in the return type must be that one.
fn first_third(point: &[i32; 3]) -> (&i32, &i32) { (&point[0], &point[2]) }
The above is shorthand for:
fn first_third<'a>(point: &'a [i32; 3]) -> (&'a i32, &'a i32) { ... }
Shorthand: If your function is a method on some type and takes its self parameter by reference, Rust assumes that self's lifetime is the one to give any references in the return value.
struct StringTable {
    elements: Vec<String>,
}
impl StringTable {
    fn find_by_prefix(&self, prefix: &str) -> Option<&String> {
        for i in 0 .. self.elements.len() {
            if self.elements[i].starts_with(prefix) {
                return Some(&self.elements[i]);
            }
        }
        None
    }
}
The above method's signature is shorthand for:
fn find_by_prefix<'a, 'b>(&'a self, prefix: &'b str) -> Option<&'a String>
Sharing vs. Mutation
Shared access
A value borrowed by shared references is read-only.
Across the lifetime of a shared reference, neither its referent, nor anything reachable from that referent, can be changed by anything.
Mutable access
A value borrowed by a mutable reference is reachable exclusively via that reference.
Across the lifetime of a mutable reference, there is no other usable path to its referent, or to any value reachable from there.
The only references whose lifetimes may overlap with a mutable reference are those you borrow from the mutable reference itself.
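The rules above can be sketched with a small example. The helper names bump_all and total are made up for illustration; the point is only which borrows may overlap.

```rust
/// Adds one to every element through an exclusive (mutable) borrow.
fn bump_all(values: &mut Vec<i32>) {
    for v in values.iter_mut() {
        *v += 1;
    }
}

/// Reads through a shared borrow; any number of these may coexist.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let mut data = vec![1, 2, 3];

    // Two shared references can overlap, because neither can mutate `data`.
    let a = &data;
    let b = &data;
    assert_eq!(total(a) + total(b), 12);

    // `a` and `b` are not used past this point, so a mutable borrow is allowed.
    bump_all(&mut data);
    assert_eq!(data, vec![2, 3, 4]);

    // A reference borrowed from a mutable reference may overlap with it.
    let m = &mut data;
    let first = &mut m[0];
    *first = 10;
    m.push(5);
    assert_eq!(data, vec![10, 3, 4, 5]);
}
```

Moving the use of a or b below the call to bump_all should make the compiler reject the program, since a shared and a mutable borrow would then overlap.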
Expressions
Blocks and Semicolons
When you see expected type '()', look for a missing semicolon first.
Empty statements are allowed in blocks. They consist of a stray semicolon all by itself.
Declarations
Syntax: The simplest kind of declaration is a let declaration, which declares local variables:
let name: type = expr;
The type and initializer are optional. The semicolon is required.
Term: An item declaration is a declaration that could appear globally in a program or module, such as a fn, struct, or use.
When a fn is declared inside a block, its scope is the entire block (no TDZ); that is, it can be used throughout the enclosing block. But a nested fn cannot access local variables or arguments that happen to be in scope. (The alternative to nested functions is closures, which do have access to the enclosing scope.)
if and match
Expressions used as conditions in if expressions must be of type bool.
An if expression with no else block behaves exactly as though it had an empty else block.
Syntax: The general form of a match expression is:
match value { pattern => expr, ... }
if let
Syntax: The last if form is the if let expression:
if let pattern = expr { block1 } else { block2 }
It's never strictly necessary to use if let, because match can do everything if let can do.
Shorthand: An if let expression is shorthand for a match with just one pattern:
match expr { pattern => { block1 } _ => { block2 } }
Loops
Loops are expressions in Rust, but while and for loops don't produce useful values.
The value of a while or for loop is ().
Operator: The .. operator produces a range of type std::ops::Range. A range is a simple struct with two fields: start and end.
Ranges can be used with for loops because Range is an iterable type.
Term: An iterable type is a type that implements the std::iter::IntoIterator trait.
Iterating over a mut reference provides a mut reference to each element:
let mut strings: Vec<String> = vec![
    "what's".to_string(),
    "my".to_string(),
    "line?".to_string(),
];
for rs in &mut strings {
    rs.push('\n');
}
println!("{}", strings.join(""));
A loop can be labeled with a label that is written like a lifetime. In the below example, 'search: is a label for the outer for loop. Thus break 'search exits that loop, not the inner loop (labels can also be used with continue):
'search: for room in apartment {
    for spot in room.hiding_spots() {
        if spot.contains(keys) {
            println!("Your keys are {} in the {}.", spot, room);
            break 'search;
        }
    }
}
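Here is a self-contained variant of the labeled loop, plus the one kind of loop that does yield a value: a plain loop exited with break and a value. The function names and the grid data are made up for illustration.

```rust
/// Returns the (row, column) of the first cell equal to `target`, if any.
fn find_cell(grid: &[Vec<i32>], target: i32) -> Option<(usize, usize)> {
    let mut found = None;
    'search: for (r, row) in grid.iter().enumerate() {
        for (c, &cell) in row.iter().enumerate() {
            if cell == target {
                found = Some((r, c));
                // Leaves both loops at once, not just the inner one.
                break 'search;
            }
        }
    }
    found
}

/// Doubles `start` until it reaches at least `limit`.
/// Assumes `start` is nonzero; otherwise the loop would never end.
fn first_double_at_least(start: u32, limit: u32) -> u32 {
    let mut n = start;
    // `loop` is an expression: its value is whatever `break` carries out.
    loop {
        if n >= limit {
            break n;
        }
        n *= 2;
    }
}

fn main() {
    let grid = vec![vec![1, 2], vec![3, 4]];
    assert_eq!(find_cell(&grid, 3), Some((1, 0)));
    assert_eq!(find_cell(&grid, 9), None);
    assert_eq!(first_double_at_least(1, 10), 16);
}
```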
return Expressions
Shorthand: A return without a value is shorthand for return ().
Why Rust Has Loop
Expressions that don't finish normally are assigned the special type !, and they're exempt from the rules about types having to match.
Term: A function that never returns (that is, returns !) is called a divergent function.
An example of a divergent function is std::process::exit, which has the following type signature:
fn exit(code: i32) -> !;
Function and Method Calls
The difference between static and nonstatic methods is the same as in OO languages: nonstatic methods are called on values (like my_vec.len()), and static methods are called on types themselves (like Vec::new()).
It's considered good style to omit types whenever they can be inferred.
Fields and Elements
Fields of a struct are accessed using the familiar . operator. Tuples are the same, except that their fields have numbers rather than names.
Square brackets access the elements of an array, a slice, or a vector.
Reference Operators
Operator: The unary * operator is used to access the referent of a reference.
The * operator is only necessary when we want to read or write the entire value that the reference points to.
Arithmetic, Bitwise, Comparison, and Logical Operators
Warning: Dividing an integer by zero triggers a panic, even in release builds.
Integers have a method a.checked_div(b) that returns Option<I> (None when b is zero) and never panics. If ever there's the slightest possibility that an integer will be divided by zero, use checked_div.
There is no unary + operator.
Assignment
Syntax: Rust does not support chained assignment. a = b = c will not work.
Syntax: Rust does not have increment and decrement operators: ++ and --.
Type Casts
Casting an integer to another integer type is always well-defined. Converting to a narrower type results in truncation.
Casting a large floating-point value to an integer type that is too small to represent it used to be undefined behavior. Since Rust 1.45, such casts saturate to the integer type's minimum or maximum value, and NaN becomes 0.
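A few concrete casts make the rules easier to remember. This is a small sketch that assumes Rust 1.45 or later, where float-to-integer casts saturate.

```rust
fn main() {
    // Integer-to-integer casts to a narrower type keep only the low bits.
    assert_eq!(300_i32 as u8, 44); // 300 mod 256
    assert_eq!(-1_i32 as u8, 255);

    // Widening a signed value sign-extends it.
    assert_eq!(-1_i8 as i32, -1);

    // Float-to-integer casts truncate toward zero and saturate at the bounds.
    assert_eq!(2.9_f64 as i32, 2);
    assert_eq!(1e10_f64 as i32, i32::MAX);
    assert_eq!(-1e10_f64 as i32, i32::MIN);
    assert_eq!(f64::NAN as i32, 0);
}
```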
Implicit behavior: Values of type &String auto-convert to type &str without a cast.
Implicit behavior: Values of type &Vec<i32> auto-convert to &[i32].
Implicit behavior: Values of type &Box<T> auto-convert to &T.
Term: The above implicit behaviors are called deref coercions, because they apply to types that implement the Deref built-in trait. The purpose of Deref coercion is to make smart pointer types, like Box, behave as much like the underlying value as possible.
Error Handling
There are two types of error-handling in Rust:
- panic
- Results
Ordinary errors are handled using Results.
Panic is the bad kind; it's for errors that should never happen.
Panic
Term: A program panics when it encounters something so messed up that there must be a bug in the program itself.
These are some things that can cause a panic:
- Out-of-bounds array access
- Integer division by zero
- Calling .unwrap() on an Option that happens to be None
- Assertion failure
Behavior: When a program panics, you can choose one of two ways that it'll be handled:
- Unwind the stack (this is the default)
- Abort the process
Unwinding
Process of a panic-triggered unwinding
- An error message is printed to the terminal.
- The stack is unwound.
- Any temporary values, local variables, or arguments that the current function was using are dropped, in the reverse of the order they were created. This propagates upwards through the unwound stack.
- The thread exits. If the panicking thread was the main thread, then the whole process exits (with a nonzero exit code).
A panic is not a crash, nor undefined behavior. A panic's behavior is well-defined and safe.
Behavior: Panics occur per thread. One thread can be panicking while other threads are going on about their normal business.
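The per-thread behavior can be observed directly. The sketch below assumes the default unwind panic strategy; ran_cleanly is a made-up helper name.

```rust
use std::thread;

/// Runs `job` on a new thread and reports whether it finished without panicking.
fn ran_cleanly<F>(job: F) -> bool
where
    F: FnOnce() + Send + 'static,
{
    // `join` returns `Err` if the spawned thread panicked.
    thread::spawn(job).join().is_ok()
}

fn main() {
    let ok = ran_cleanly(|| {
        let v = vec![1, 2, 3];
        assert_eq!(v.len(), 3);
    });
    let failed = ran_cleanly(|| {
        let v: Vec<i32> = Vec::new();
        // Out-of-bounds access panics, but only on this thread.
        let _ = v[0];
    });

    // The main thread keeps running after the other thread has panicked.
    assert!(ok);
    assert!(!failed);
}
```

The panicking thread still prints its error message to stderr; only the outcome reported by join tells the main thread what happened.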
It's possible to catch stack unwinding using the standard library function std::panic::catch_unwind(), which allows the thread to survive and continue running.
Aborting
Behavior: Stack unwinding is the default panic behavior, but there are two circumstances in which Rust does not try to unwind the stack:
- If a .drop() method triggers a second panic while Rust is still trying to clean up after the first, the process will be aborted.
- If you compile with the -C panic=abort flag, the first panic in the program immediately aborts the process. (This can be used to reduce compiled code size.)
Result
Rust doesn't have exceptions.
Catching Errors
The most thorough way of dealing with errors via Result is using a match expression:
match get_weather(hometown) {
    Ok(report) => {
        display_weather(hometown, &report);
    }
    Err(err) => {
        println!("error querying the weather: {}", err);
        schedule_weather_retry();
    }
}
Add notes about the `Result` methods starting on page 148.
match expressions can be a bit verbose, but Result comes with a ton of useful methods for more concise handling.
Result Type Aliases
Sometimes you'll see Rust documentation that seems to omit the error type of a Result. In such cases, a Result type alias (a type alias is a shorthand for type names) is being used:
fn remove_file(path: &Path) -> Result<()>
Printing Errors
The standard library's error types all implement a common trait: std::error::Error.
Warning: Printing an error value does not also print out its cause. If you want to print all available information for an error, use the print_error function defined below.
use std::error::Error;
use std::io::{Write, stderr};

/// Dump an error message to `stderr`.
/// If another error occurs in the process, ignore it.
fn print_error(mut err: &dyn Error) {
    let _ = writeln!(stderr(), "error: {}", err);
    // source() replaces the deprecated cause() method
    while let Some(source) = err.source() {
        let _ = writeln!(stderr(), "caused by: {}", source);
        err = source;
    }
}
Crate: The standard library's error types do not include a stack trace, but the error-chain crate makes it easy to define your own custom error type that supports grabbing a stack trace when it's created. It uses the backtrace crate to capture the stack.
Propagating Errors
Operator: You can add a ? to any expression that produces a Result. The behavior of ? depends on the state (Ok or Err) of the Result:
- If Ok, it unwraps the Result to get the success value inside
- If Err, it immediately returns from the enclosing function, passing the error result up the call chain (see the rule below)
Rule: The ? can only be used in functions whose return type is Result (or another type that supports ?, such as Option).
Working with Multiple Error Types
Some functions have the potential to return Errs of many different types (depending on the operation that triggered the error).
There are several approaches to dealing with multiple error types:
- Conversion: Define a custom error type (say, CustomError) and implement conversions from io::Error to the custom error type.
- Box 'em up: The simpler approach is to use pointers. All error types can be converted to the type Box<dyn std::error::Error>, which represents "any error", so we can define a set of generic type aliases for all possible errors. This is the most idiomatic approach.
For generalizing all errors and results, define these type aliases:
type GenError = Box<dyn std::error::Error>;
type GenResult<T> = Result<T, GenError>;
Tip: To convert any error to the GenError type, call GenError::from().
The downside of the GenError approach is that the return type no longer communicates precisely what kinds of errors the caller can expect.
Tip: If you want to handle one particular kind of error, but let all others propagate out, use the generic method error.downcast_ref::<ErrorType>(). This is called error downcasting.
Dealing with Errors That "Can't Happen"
Operator: Instead of the ? operator, which requires implementing error-handling, we can use the .unwrap() method of a Result to get the Ok value.
Warning: The difference between ? and .unwrap() is that if .unwrap() is used on a Result that's in its Err state, the process will panic. In other words, only use .unwrap() when you're damn sure the Result is Ok.
Ignoring Errors
Idiom: If we really don't care about the contents of a Result, we can use the following idiomatic statement to silence warnings about unused results:
let _ = writeln!(stderr(), "error: {}", err);
Handling Errors in main()
If you propagate an error long enough, eventually it'll reach the root main() function, at which point, it can no longer be ignored.
Info: The ? operator cannot be used in main when main's return type is not a Result. Either handle the error there (for example with .expect()), or, since Rust 1.26, declare main as returning a Result.
Behavior: Panicking in the main thread prints an error message then exits with a nonzero exit code.
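To tie the error-handling pieces together, here is a minimal sketch that uses the GenError and GenResult aliases, propagation with ?, error downcasting, and a main that returns a Result (allowed since Rust 1.26). NegativeError, parse_count, and count_or_zero are hypothetical names invented for this example.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Catch-all aliases, following the "box 'em up" approach.
type GenError = Box<dyn Error>;
type GenResult<T> = Result<T, GenError>;

/// Hypothetical error type, defined here only for illustration.
#[derive(Debug)]
struct NegativeError(i64);

impl fmt::Display for NegativeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} is negative", self.0)
    }
}

impl Error for NegativeError {}

/// Parses a non-negative integer; two different error types can come out.
fn parse_count(text: &str) -> GenResult<i64> {
    // `?` converts the ParseIntError into a GenError on the way out.
    let n: i64 = text.trim().parse()?;
    if n < 0 {
        return Err(GenError::from(NegativeError(n)));
    }
    Ok(n)
}

/// Treats one specific kind of error as a default and propagates the rest.
fn count_or_zero(text: &str) -> GenResult<i64> {
    match parse_count(text) {
        Ok(n) => Ok(n),
        Err(err) => {
            if err.downcast_ref::<ParseIntError>().is_some() {
                Ok(0)
            } else {
                Err(err)
            }
        }
    }
}

// If `main` returns an `Err`, it is printed and the exit code is nonzero.
fn main() -> GenResult<()> {
    assert_eq!(parse_count(" 42 ")?, 42);
    assert_eq!(count_or_zero("not a number")?, 0);
    assert!(count_or_zero("-5").is_err());
    Ok(())
}
```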
Crates and Modules
Crates
The easiest way to see what crates are and how they work is to use cargo build with the --verbose flag to build an existing project that has some dependencies.
When compiling libraries, Cargo uses the --crate-type lib option. This tells rustc not to look for a main() function but instead to produce a .rlib file containing compiled code in a form that later rustc commands can use as input.
When compiling a program, Cargo uses --crate-type bin, and the result is a binary executable for the target platform.
With each rustc command, Cargo passes --extern options giving the filename of each library the crate will use. This ties directly into the extern crate some_crate; statements in source code.
Command: The command cargo build --release will produce an optimized release build.
Qualities of a release build:
- Run faster
- Compile slower
- Don't check for integer overflow
- Skip debug_assert!() assertions
- Less reliable and verbose stack traces
Build Profiles
The following CLI commands are used to select a rustc profile:
| Command | Cargo.toml section used |
|---|---|
| cargo build | [profile.dev] |
| cargo build --release | [profile.release] |
| cargo test | [profile.test] |
Behavior: If no profile is specified, [profile.dev] is selected by default.
Tip: To get the best data from a profiler, you need both optimizations and debug symbols to be enabled. To do so, add this to your Cargo.toml:
[profile.release]
debug = true # enable debug symbols in release builds
Modules
Concept: Modules are Rust's namespaces. Whereas crates are about code sharing between projects, modules are about code organization within a project.
Term: A module is a collection of items.
Behavior: Any module item not marked pub is private.
Behavior: Modules can be nested. It's common to see a module that's a collection of submodules:
mod life {
    pub mod animalia {
        pub mod mammalia {}
    }
    pub mod plantae {}
    pub mod fungi {}
    pub mod protista {
        pub mod archaea {}
        pub mod bacteria {}
    }
}
It's generally advised not to keep all source code in a single massive file of nested modules. For obvious reasons.
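To see privacy and nesting in action, here is a small sketch built on the life module. The functions inside it are made up for illustration.

```rust
mod life {
    pub mod animalia {
        // Private: visible inside `animalia` and its submodules only.
        fn legs_per_insect() -> u32 {
            6
        }

        pub fn insect_legs(insects: u32) -> u32 {
            insects * legs_per_insect()
        }

        pub mod mammalia {
            pub fn legs_per_dog() -> u32 {
                4
            }
        }
    }
}

fn main() {
    // Public items are reachable through their full path.
    assert_eq!(life::animalia::insect_legs(3), 18);
    assert_eq!(life::animalia::mammalia::legs_per_dog(), 4);

    // life::animalia::legs_per_insect(); // would not compile: the function is private
}
```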
Modules in Separate Files
Behavior: Writing a module declaration like mod life; tells the compiler that the life module lives in a separate file called life.rs.
When you build a Rust crate, you're recompiling all of its modules, regardless of where those modules live.
Behavior: A module can have its own directory. When Rust sees mod life;, it checks for both life.rs and life/mod.rs.
Concept: A mod.rs file is exactly like a barrel (index.js) in JS/TS.
Paths and Imports
Operator: The :: operator is used to access the items of a module, e.g. life::animalia::mammalia::....
Paths to items can be either relative or absolute. An absolute path is prefixed by :: and can be used to access "global" items, e.g. ::std::mem::swap is an absolute path.
Concept: Accessing an absolute path is a lot like accessing the global object in JS, i.e. window, in a browser.
Operator: The use declaration creates aliases to modules and items throughout the enclosing block or module, e.g. use std::mem; creates a local alias to ::std::mem.
It's generally considered best style to import types, traits, and modules, then use relative paths to access the items within them.
Several items from the same module can be imported at once, as can all items:
use std::collections::{HashMap, HashSet}; // import just two items
use std::io::prelude::*; // import all items
Operator: Modules do not automatically inherit items from their parent modules. The super keyword can be used as an alias for the parent module, and self is an alias for the current module.
Submodules can access private items in their parent modules, but they have to import them. In older versions of Rust, use super::*; only imported the pub items; in current Rust it imports the parent's private items as well.
Modules aren't the same thing as files, but there are some analogies between module paths and file paths:
| Module path | File path | Description |
|---|---|---|
| self | "." | Accesses the current module |
| super | ".." | Accesses the parent module |
| extern crate | | Similar to mounting a filesystem |
The Standard Prelude
Implicit Behavior: The standard library std is automatically linked with every project, and some items from the standard prelude, like Vec and Result, are automatically imported. It's as though the following imports are invisibly added to all files:
extern crate std;
use std::prelude::v1::*;
Convention: Naming a module prelude tells users that it's meant to be imported using *.
Items: The Building Blocks of Rust
Items make up the composition of modules. The list of items is really a list of Rust's features as a language:
| Items | Keywords |
|---|---|
| Functions | fn |
| Types | struct, enum, trait |
| Type aliases | type |
| Methods | impl |
| Constants | const, static |
| Modules | mod |
| Imports | use, extern crate |
| FFI blocks | extern |
Item: Types
User-defined types are introduced using the struct, enum, and trait keywords.
A struct's fields, even private fields, are accessible throughout the module where the struct is declared. Outside of the module, only pub fields are visible.
Item: Methods
An impl block can't be marked pub; rather its methods can be marked pub individually.
Private methods, like private struct fields, are visible throughout the module where they're declared.
Item: Constants
The const keyword introduces a constant. const syntax is just like let, except that the type must be defined, and it may or may not be marked pub.
Convention: UPPER_CASE_NAMES are conventional for naming constants.
Concept: A const is a bit like the #define preprocessor directive in C++, and as such they should be used for specifying magic numbers and strings.
Item: Imports
Even though use and extern crate declarations are just aliases, they can also be marked pub. In fact, the standard prelude is written as a big series of pub imports.
Item: FFI blocks
extern blocks declare a collection of functions written in some other language so that they can be called from Rust.
Turning a Program into a Library
These are roughly the steps to convert a program into a library:
- Change the name of src/main.rs to src/lib.rs
- Add the pub keyword to public features of the library.
- Move the main function to a temporary file somewhere.
Term: The code in src/lib.rs forms the root module of the library. Other crates that use your library can only access the public items of this root module.
The src/bin Directory
Cargo has built-in support for small programs that live in the same codebase as a library.
The main() function that we stowed away in the above steps for converting our code to a library should be moved to a file named src/bin/my_program.rs. The file needs to then import the library as it would any other crate:
extern crate my_library;
use my_library::{feature_1, feature_2};
That's it!
Attributes
Tests and Documentation
Tests are just ordinary functions marked with the #[test] attribute.
To test error cases, add the #[should_panic] attribute to your test. This tells the test harness that we expect this test to panic:
#[test]
#[allow(unconditional_panic, unused_must_use)] // newer compilers reject a literal 1 / 0 otherwise
#[should_panic(expected="divide by zero")]
fn test_divide_by_zero_error() {
    1 / 0; // should panic!
}
Convention: When your tests get substantial enough to require support code, the convention is to put them in a tests module and declare the whole module to be testing-only using the #[cfg(test)] attribute.
Integration Tests
Term: Integration tests are .rs files that live in a tests directory alongside your project's src directory. When you run cargo test, Cargo compiles each integration test as a separate, standalone crate, linked with your library and the Rust test harness.
Since integration tests use your program as if it were a separate crate, you must add extern crate my_library; to them (since the 2018 edition, a use declaration is enough).
Documentation
Come back to this
Doc-Tests
Come back to this
Specifying Dependencies
Generally in a Cargo.toml, you're used to seeing items in the [dependencies] section that are specified by version number and look like this:
[dependencies]
num = "0.1.42"
The above convention is fine, but it only allows use of crates published on crates.io.
Remote Git Dependencies
To use a dependency by referencing a git repo, specify it like this:
my_crate = { git = "https://github.com/Me/my_crate.git", rev = "093f84c" }
Local Dependencies
To use a dependency by referencing a local crate, specify it like this:
my_crate = { path = "../path/to/my_crate" }
Versions
Come back to this.
Cargo.lock
Cargo upgrades dependencies to newer versions only when you tell it to using cargo update, in which case it only upgrades to the latest dependency versions that are compatible with what's specified in Cargo.toml.
Publishing Crates to crates.io
The command cargo package creates a file containing all your library's source files, including Cargo.toml, which is what will be uploaded to crates.io.
Before publishing, you have to log in locally using cargo login <API key>. Go here to get an API key.
Workspaces
Given a root directory that contains a collection of crates, you can save compilation time and disk space by creating a workspace.
All that's needed is a Cargo.toml file in the root directory:
[workspace]
members = ["my_first_crate", "my_second_crate"]
With that you're free to delete any Cargo.lock and target directories that exist in the subdirectories. The Cargo.lock file and compiled resources will all be grouped at a single location in the root directory.
With workspaces, cargo build --all in any crate will build all crates in the root directory. The same goes for cargo test and cargo doc.
Structs
Rust has three kinds of structures:
- Named-field
- Tuple-like
- Unit-like
Term: The values contained within a struct, regardless of struct type, are called components.
A named-field struct gives a name to each component. A tuple-like struct identifies them by the order in which they appear. Unit-like structs have no components at all.
Structs are private by default, visible only in the module where they're declared. The same goes for their fields.
Named-Field Structs
Convention: All types, structs included, should have names in PascalCase.
Term: A struct expression is an expression that constructs a value of a struct type:
// struct expression (similar to a constructor)
let r = std::ops::Range {
    start: 0,
    end: 9,
};
println!("Range length: {}", r.len());
Operator: If in a struct expression, the named fields are followed by ..EXPR, then any fields not mentioned take their values from EXPR, which must be another value of the same struct type.
let range_1 = 0..100;
println!("First range length: {}", range_1.len());
let range_2 = std::ops::Range { start: 50, ..range_1 };
println!("Second range length: {}", range_2.len());
Tuple-Like Structs
Term: The values held by a tuple-like struct are called elements.
Implicit Behavior: When you define a tuple-like struct, you implicitly create a function that constructs it:
struct Bounds(usize, usize);
This implicitly creates this function:
fn Bounds(el0: usize, el1: usize) -> Bounds { ... }
Tuple-like structs are good for newtypes.
Term: Structs with a single component that you define to get stricter type checking are called newtypes.
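As a rough illustration of why newtypes give stricter type checking, here is a sketch with two made-up newtypes, Meters and Feet. It also uses the implicitly created constructor function mentioned above.

```rust
/// Hypothetical newtypes: both wrap an f64, but they are distinct types.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Feet(f64);

impl Feet {
    fn to_meters(self) -> Meters {
        Meters(self.0 * 0.3048)
    }
}

/// Accepts only `Meters`; passing a `Feet` or a bare f64 is a type error.
fn is_tall(height: Meters) -> bool {
    height.0 > 1.9
}

fn main() {
    // The struct name doubles as a constructor function.
    let heights: Vec<Meters> = vec![1.7, 2.0].into_iter().map(Meters).collect();
    assert!(!is_tall(heights[0]));
    assert!(is_tall(heights[1]));

    assert!(is_tall(Feet(7.0).to_meters()));
    // is_tall(Feet(7.0)); // would not compile: expected `Meters`, found `Feet`
}
```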
Unit-Like Structs
A unit-like struct occupies no memory, much like the unit type (). They're generally helpful when defining traits.
Struct Layout
In memory, both named-field and tuple-like structs are the same thing.
Defining Methods with impl
Concept: Rather than appearing inside the struct definition, as in C++ or Java, Rust methods appear in a separate impl block.
An impl block is a collection of fn definitions, each of which becomes a method on the struct type named at the top of the block.
#[derive(Debug)]
struct Point { x: i64, y: i64 }

impl Point {
    // Static method that creates a new point from x and y values
    fn of(x: i64, y: i64) -> Self { Point { x, y } }
    // Method that returns a new point with the x and y values swapped
    fn inverse(self: &Self) -> Self { Point { x: self.y, y: self.x } }
    // Method that adds another point to this one in place
    fn add(self: &mut Self, add: &Self) {
        self.x = self.x + add.x;
        self.y = self.y + add.y;
    }
}

let mut p1 = Point::of(1, 2);
println!("{:?}", p1);
let p2 = p1.inverse();
println!("{:?}", p2);
p1.add(&p2);
println!("{:?}", p1);
Term: Methods defined in impl blocks are called associated functions, since they're associated with a specific type. The opposite of an associated function (one not associated with any type) is called a free function.
Shorthand: Inside impl blocks, Rust automatically creates a type alias called Self for the type with which the impl block is associated.
A Rust method must explicitly use self to refer to the value it was called on.
Implicit Behavior: When you call a method, you don't need to borrow a mutable reference yourself; the ordinary method call syntax takes care of that implicitly. For example, in the above code, we call p1.add(&p2). This is the same as if we had called (&mut p1).add(&p2).
Term: Methods in impl blocks that don't take self as an argument become functions associated with the struct type itself, rather than a specific value of the type. These methods are called static methods. In the above code, Point::of is a static method of struct type Point.
Convention: It's conventional in Rust for static constructor functions to be named new.
Although you can have many separate impl blocks for a single type, they must all be in the same crate that defines that type.
Generic Structs
Term: In generic struct definitions, the type names used in <angle brackets> are called type parameters.
Pronunciation: You can read the line impl<T> Queue<T> as something like "for any type T, here are some methods available on Queue<T>".
Operator: For static method calls whose generic type parameter cannot be inferred, you can use the turbofish ::<> operator to specify the type:
let mut q = Queue::<char>::new();
Structs with Lifetime Parameters
Just as structs can have generic type parameters, they can have lifetime parameters as well.
Pronunciation: You can read the line struct Extrema<'elt> as something like, "given any specific lifetime 'elt, you can make an Extrema<'elt> that holds references with that lifetime."
Rust always infers lifetime parameters for calls.
Shorthand: Because it's so common for the return type to use the same lifetime as an argument, Rust lets us omit the lifetimes when there's one obvious candidate.
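A sketch of what such a struct might look like follows. The field names least and greatest and the function find_extrema are assumptions made for this example; the notes only mention the Extrema<'elt> declaration.

```rust
/// Holds references to the smallest and largest elements of some slice.
struct Extrema<'elt> {
    least: &'elt i32,
    greatest: &'elt i32,
}

/// The lifetime is elided: with one reference parameter, the result borrows from it.
/// Assumes `slice` is non-empty; indexing panics otherwise.
fn find_extrema(slice: &[i32]) -> Extrema<'_> {
    let mut least = &slice[0];
    let mut greatest = &slice[0];
    for item in slice {
        if item < least {
            least = item;
        }
        if item > greatest {
            greatest = item;
        }
    }
    Extrema { least, greatest }
}

fn main() {
    let a = [0, -3, 0, 15, 48];
    // No lifetime is written at the call site; Rust infers it.
    let e = find_extrema(&a);
    assert_eq!(*e.least, -3);
    assert_eq!(*e.greatest, 48);
}
```

Written out in full, the signature would be fn find_extrema<'s>(slice: &'s [i32]) -> Extrema<'s>.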
Interior Mutability
Interior mutability is the principle of making a bit of data mutable inside an otherwise immutable value.
The two most straightforward mechanisms for implementing interior mutability are Cell<T> and RefCell<T>.
A Cell<T> is a struct that contains a single private value of type T. The only special thing about a Cell is that you can get and set the field even if you don't have mut access to the Cell itself.
Warning: Cells, and any types that contain them, are not thread-safe.
Come back to this. Probably won't need it for a while.
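In the meantime, a minimal sketch of both mechanisms may help. The Lookup type and its fields are invented for illustration; the point is that contains takes &self yet still updates a counter and a log.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical type for illustration: records lookups without needing `&mut self`.
struct Lookup {
    words: Vec<String>,
    hits: Cell<u32>,
    log: RefCell<Vec<String>>,
}

impl Lookup {
    fn new(words: &[&str]) -> Lookup {
        Lookup {
            words: words.iter().map(|w| w.to_string()).collect(),
            hits: Cell::new(0),
            log: RefCell::new(Vec::new()),
        }
    }

    /// Takes `&self`, yet updates both the counter and the log.
    fn contains(&self, word: &str) -> bool {
        // Cell: copy the value out, then set a new one.
        self.hits.set(self.hits.get() + 1);
        // RefCell: borrow mutably at run time; panics if already borrowed.
        self.log.borrow_mut().push(word.to_string());
        self.words.iter().any(|w| w.as_str() == word)
    }
}

fn main() {
    let lookup = Lookup::new(&["alpha", "beta"]); // note: not `mut`
    assert!(lookup.contains("beta"));
    assert!(!lookup.contains("gamma"));
    assert_eq!(lookup.hits.get(), 2);
    assert_eq!(lookup.log.borrow().len(), 2);
}
```

Cell suits small Copy values like the counter; RefCell is needed for the Vec because it hands out references and checks the borrow rules at run time instead of at compile time.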
Enums and Patterns
Enums
Term: The values that comprise enums are called variants or constructors.
As with structs, the compiler will implement features like the == operator for you, but you have to ask. Also as with structs, enums can have methods within impl blocks.
#[derive(Copy, Clone, Debug, PartialEq)]
enum TimeUnit { Seconds, Minutes, Hours, Days, Months, Years }

impl TimeUnit {
    // Return the plural noun for this time unit
    fn plural(self) -> &'static str {
        match self {
            TimeUnit::Seconds => "seconds",
            TimeUnit::Minutes => "minutes",
            TimeUnit::Hours => "hours",
            TimeUnit::Days => "days",
            TimeUnit::Months => "months",
            TimeUnit::Years => "years",
        }
    }
    // Return the singular noun for this time unit
    fn singular(self) -> &'static str {
        // trim_end_matches replaces the deprecated trim_right_matches
        self.plural().trim_end_matches('s')
    }
}
Enums with Data
Term: Enum constructors that take arguments that resemble tuples are called tuple variants. The constructors that take struct arguments are called struct variants. Constructors that take no arguments are called unit-like variants.
// Enum with tuple variants
enum RoughTime {
    InThePast(TimeUnit, u32),
    JustNow,
    InTheFuture(TimeUnit, u32),
}
// Enum with struct variants
enum Shape {
    Sphere { center: Point3d, radius: f32 },
    Cuboid { corner1: Point3d, corner2: Point3d },
}
A single enum can have variants of all three kinds:
// Enum with unit-like, tuple, and struct variants
enum RelationshipStatus {
    Single,
    InARelationship,
    ItsComplicated(Option<String>),
    ItsExtremelyComplicated {
        car: DifferentialEquation,
        cdr: EarlyModernistPoem,
    },
}
All constructors and fields of a public enum are automatically public.
Enums in Memory
In memory, enums with data are stored as a small integer tag, plus enough memory to hold all of the fields of the largest variant. The tag tells Rust which constructor created the value, and therefore which fields it has.
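The layout can be poked at with std::mem::size_of. The exact sizes are an implementation detail, so this sketch only asserts relationships that should hold in general; the RoughTime here is a simplified stand-in whose field types are assumptions.

```rust
use std::mem::size_of;

// Simplified stand-in for the enum above; the field types are assumptions.
#[allow(dead_code)]
enum RoughTime {
    InThePast(u8, u32),
    JustNow,
    InTheFuture(u8, u32),
}

fn main() {
    // Room for the largest variant's fields is always needed.
    assert!(size_of::<RoughTime>() >= size_of::<u32>() + size_of::<u8>());

    // A u32 uses every bit pattern, so Option<u32> needs extra space for its tag.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());

    // A Box is never null, so Option<Box<T>> can use the null pointer to mean None.
    assert_eq!(size_of::<Option<Box<i32>>>(), size_of::<Box<i32>>());
}
```

The last assertion shows that the tag is not always a separate integer: when a type has an unused bit pattern, Rust can use it to tell the variants apart.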
Generic Enums
Enums can be generic, and generic data structures can be built with a few lines of code:
// An ordered collection of T's
#[derive(Debug)]
enum BinaryTree<T> {
    Empty,
    NonEmpty(Box<TreeNode<T>>),
}

// A node within the binary tree
#[derive(Debug)]
struct TreeNode<T> {
    element: T,
    left: BinaryTree<T>,
    right: BinaryTree<T>,
}

let tree = BinaryTree::NonEmpty(Box::new(TreeNode {
    element: "I'm a single-node tree",
    left: BinaryTree::Empty,
    right: BinaryTree::Empty,
}));
println!("{:?}", tree);
Patterns
match performs pattern matching. Think of it this way:
- Expressions produce values
- Patterns consume values
When a pattern contains identifiers, those become local variables in the code following the pattern.
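Because identifiers in a pattern always introduce new variables, comparing against a variable that already exists takes a match guard. A small sketch follows; the describe function is made up for illustration.

```rust
/// Describes `guess` relative to `secret`.
fn describe(guess: Option<u32>, secret: u32) -> String {
    match guess {
        None => String::from("no guess"),
        // `n` is a new variable bound by the pattern; the guard compares it
        // with the existing variable `secret`.
        Some(n) if n == secret => format!("{} is correct", n),
        // Writing `Some(secret)` here would NOT compare: it would bind a new
        // variable that shadows `secret` and matches any value.
        Some(n) => format!("{} is wrong", n),
    }
}

fn main() {
    assert_eq!(describe(None, 7), "no guess");
    assert_eq!(describe(Some(7), 7), "7 is correct");
    assert_eq!(describe(Some(3), 7), "3 is wrong");
}
```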
Literals, Variables, and Wildcards in Patterns
Term: If you need a catch-all pattern, but don't care about the matched value, you can use a single underscore _ as a pattern, called the wildcard pattern.
Rust requires that every single possible value is handled in a match block. So even if you're certain that the remaining cases can't occur, you must at least add a fallback arm that panics.
Warning: Existing variables can't be used in patterns. This is because identifiers in patterns may only introduce new variables.
// This will fail because current_hex is an existing variable
fn check_move(current_hex: Hex, click: Point) -> game::Result<Hex> {
    match point_to_hex(click) {
        None => Err("That's not a game space."),
        Some(current_hex) => Err("You're already there! Click somewhere else."),
        Some(other_hex) => Ok(other_hex),
    }
}
Tuple and Struct Patterns
Tuple patterns match tuples.
#![allow(unused)] fn main() { // Describe the location of a point on a Cartesian plane fn describe_point(x: i32, y: i32) -> &'static str { use std::cmp::Ordering::*; match (x.cmp(&0), y.cmp(&0)) { (Equal, Equal) => "at the origin", (_, Equal) => "on the x axis", (Equal, _) => "on the y axis", (Greater, Greater) => "in the first quadrant", (Less, Greater) => "in the second quadrant", _ => "somewhere else", } } }Struct patterns match structs.
    match balloon.location {
        // `y: height` binds the y field to a new variable named `height`
        Point { x: 0, y: height } => println!("straight up {} meters", height),
        Point { x, y } => println!("at ({}m, {}m)", x, y),
    }
Reference Patterns
For very large struct types, it'd be too cumbersome to write out every single field in the pattern. Fortunately, you can use .. to ignore the fields you don't care about:
    match account {
        Account { name, language, .. } => {
            ui.greet(&name, &language);
            ui.show_settings(&account); // ERROR! use of moved value 'account'
        }
    }
Keyword: The above code fails because the pattern moves the name and language fields out of account (assuming they are non-Copy types such as String), so account can no longer be used as a whole. What we need is a pattern that borrows the matched values instead of moving them. For that, we have ref (or ref mut, when a mutable borrow is needed):
    match account {
        Account { ref name, ref language, .. } => {
            ui.greet(name, language); // name and language are already references
            ui.show_settings(&account); // OK!
        }
    }
Concept: The opposite of a ref pattern is a pattern that starts with &. If a pattern starts with &, that means it matches a reference:
    match sphere.center() {
        &Point3d { x, y, z } => { ... }
    }
You should remember that patterns and expressions are natural opposites:
- The expression (x, y) makes two values into a new tuple
- The pattern (x, y) matches a tuple and breaks out the two values
The same principle applies to references:
- In an expression, & creates a reference
- In a pattern, & matches a reference
Matching Multiple Possibilities
Operator: The vertical bar | can be used to combine several patterns in a single match arm:
    let at_end = match chars.peek() {
        Some(&'\r') | Some(&'\n') | None => true,
        _ => false,
    };
You can also use ..= to match an inclusive range of values (older code wrote this as ..., which recent editions of Rust no longer accept in patterns):
    match next_char {
        '0'..='9' => self.read_number(),
        'a'..='z' | 'A'..='Z' => self.read_word(),
        ' ' | '\t' | '\n' | '\r' => self.skip_whitespace(),
        _ => self.handle_punctuation(),
    }
Pattern Guards
Use the if keyword to add a guard to a match arm. Older compilers rejected a guard on a pattern that moves any values, which is why the guarded arm below borrows with ref:
    match robot.last_known_location() {
        Some(ref point) if self.distance_to(point) < 10 => short_distance_strategy(point),
        Some(point) => long_distance_strategy(point),
        None => searching_strategy(),
    }
@ Patterns
The pattern x @ pattern matches exactly like the given pattern, but on success, instead of creating variables for parts of the matched value, it creates a single variable x and moves or copies the whole value into it.
    match self.get_selection() {
        rect @ Shape::Rect(..) => optimized_paint(&rect),
        other_shape => paint_outline(other_shape.get_outline()),
    }
The @ pattern is also useful for ranges:
    match chars.next() {
        Some(digit @ '0'..='9') => read_number(digit, chars),
        ...
    }
Where Patterns are Allowed
Come back to this.
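Until this section is filled in, here is a minimal sketch of some common places besides match where patterns appear: function parameters, let bindings, for loops, if let, and closure parameters. The `Point` type and all function names are invented for the illustration.

```rust
// Hypothetical type used only for this illustration.
struct Point {
    x: i32,
    y: i32,
}

// Pattern in a function parameter: destructure a tuple argument.
fn sum_pair((a, b): (i32, i32)) -> i32 {
    a + b
}

// Pattern in a `let`: destructure a struct into two locals.
fn manhattan(p: &Point) -> i32 {
    let Point { x, y } = p; // x and y are references into `p`
    x.abs() + y.abs()
}

// Pattern in a `for` loop: each item is an (index, value) tuple.
fn first_even_index(values: &[i32]) -> Option<usize> {
    for (i, v) in values.iter().enumerate() {
        if v % 2 == 0 {
            return Some(i);
        }
    }
    None
}

// Pattern in `if let`: runs the first branch only when the pattern matches.
fn describe(opt: Option<i32>) -> String {
    if let Some(n) = opt {
        format!("got {}", n)
    } else {
        String::from("nothing")
    }
}

fn main() {
    println!("{}", sum_pair((2, 3)));
    println!("{}", manhattan(&Point { x: -1, y: 4 }));
    println!("{:?}", first_even_index(&[1, 3, 4]));
    println!("{}", describe(Some(7)));
    // Pattern in closure parameters.
    let swap = |(a, b): (i32, i32)| (b, a);
    println!("{:?}", swap((1, 2)));
}
```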
Populating a Binary Tree
Finally, we'll go back to the BinaryTree enum written earlier and write an add method for it that allows us to easily build a binary tree.
    // An ordered collection of T's
    #[derive(Debug)]
    enum BinaryTree<T> {
        Empty,
        NonEmpty(Box<TreeNode<T>>),
    }

    // A node within the binary tree
    #[derive(Debug)]
    struct TreeNode<T> {
        element: T,
        left: BinaryTree<T>,
        right: BinaryTree<T>,
    }

    impl<T: Ord> BinaryTree<T> {
        fn add(&mut self, value: T) {
            // *self inside a match represents the existing tree
            match *self {
                BinaryTree::Empty => *self = BinaryTree::NonEmpty(Box::new(TreeNode {
                    element: value,
                    left: BinaryTree::Empty,
                    right: BinaryTree::Empty,
                })),
                BinaryTree::NonEmpty(ref mut node) => if value <= node.element {
                    node.left.add(value);
                } else {
                    node.right.add(value);
                }
            }
        }
    }

    fn main() {
        let mut tree = BinaryTree::Empty;
        for num in 0 .. 10 {
            tree.add(num);
        }
        println!("{:?}", tree);
    }
Traits and Generics
Intro to Traits
Rust's implementation of polymorphism comes from two mechanisms:
- Traits
- Generics
Traits are Rust's take on the interfaces or abstract base classes found in OOP-world.
Here's a condensed version of the std::io::Write trait:
    // std::io::Write
    trait Write {
        fn write(&mut self, buf: &[u8]) -> Result<usize>;
        fn flush(&mut self) -> Result<()>;
        fn write_all(&mut self, buf: &[u8]) -> Result<()>;
        // There's lots more
    }
Assume we want to write a function whose parameter is a value of any type that can write to a stream. It'd look something like this:
    use std::io::Write;

    fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
        out.write_all(b"hello!\n")?;
        out.flush() // no semicolon: the Result from flush is the return value
    }
Pronunciation: The out parameter of the above function is of type &mut dyn Write, meaning "a mutable reference to any value that implements the Write trait". (Older code wrote this type as &mut Write; current Rust requires the dyn keyword.)
Intro to Generics
A generic function or type can be used with values of many different types.
#![allow(unused)] fn main() { // Given two values, pick whichever one is less fn min<T: Ord>(value1: T, value2: T) -> T { if value1 <= value2 { value1 } else { value2 } } println!("Minimum of two integers: {}", min(1, 2)); println!("Minimum of two strings: {}", min("a", "b")); }Pronunciation: The type parameter of the above
minfunction is written<T: Ord>, meaning "this function can be used with arguments of any typeTthat implements theOrdtrait". Or, more simply, "any ordered type".Term: The
T: Ordrequirement of the aboveminfunction is called a bound.Using Traits
A trait is a feature that any given type may or may not support. Think of a trait as a type capability.
Rule: For trait methods to be accessible, the trait itself must be in scope! Otherwise, all of its methods are hidden.
#![allow(unused)] fn main() { let mut buf: Vec<u8> = vec![]; buf.write_all(b"hello!")?; // ERR: no method named write_all }Adding
use std::io::Write;to the top of the above file will bring theWritetrait into scope and fix the issue.Trait Objects
There are two ways to use traits:
- Trait objects
- Generics
Rust doesn't allow variables of type dyn Write (the bare trait type, written simply Write in older code) because a variable's size must be known at compile time, and types that implement Write can be of any size.
    use std::io::Write;

    let mut buf: Vec<u8> = vec![];
    let writer: dyn Write = buf; // ERR: `dyn Write` does not have a constant size
However, what we can do is create a value that's a reference to a trait.
    use std::io::Write;

    let mut buf: Vec<u8> = vec![];
    let writer: &mut dyn Write = &mut buf; // OK!
Term: A reference to a trait type, like writer in the above code, is called a trait object.
Trait Object Layout
In memory, a trait object is a fat pointer (two words on the stack) consisting of a pointer to the value, plus a pointer to a table representing that value's type. That table, as is the case with C++, is called a virtual table (vtable).
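The two-word layout can be observed with `std::mem::size_of`. The sketch below compares an ordinary reference with a trait object reference; the helper function names are made up for the example.

```rust
use std::io::Write;
use std::mem::size_of;

// Size of an ordinary (thin) reference to a concrete, sized type.
fn thin_pointer_size() -> usize {
    size_of::<&mut Vec<u8>>()
}

// Size of a trait object reference: data pointer plus vtable pointer.
fn trait_object_size() -> usize {
    size_of::<&mut dyn Write>()
}

fn main() {
    let mut buf: Vec<u8> = Vec::new();
    {
        // The ordinary reference is converted into a trait object here.
        let writer: &mut dyn Write = &mut buf;
        writer.write_all(b"hi").expect("writing to a Vec should not fail");
    }
    println!("wrote {} bytes", buf.len());
    println!("thin reference: {} bytes", thin_pointer_size());
    println!("trait object:   {} bytes", trait_object_size());
}
```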
Implicit Behavior: Rust automatically converts ordinary references into trait objects when needed. This was the case with the writer variable in the above code.
Generic Functions
Earlier we created a function that accepted, as a trait object, any value whose type implements the Write trait:
    use std::io::Write;

    fn say_hello(out: &mut dyn Write) -> std::io::Result<()> {
        out.write_all(b"hello!\n")?;
        out.flush()
    }
We can make that function generic by tweaking the type signature:
    use std::io::Write;

    fn say_hello<W: Write>(out: &mut W) -> std::io::Result<()> {
        out.write_all(b"hello!\n")?;
        out.flush()
    }
Term: In the above say_hello function, the phrase <W: Write> is what makes the function generic. W is called a type parameter. And : Write, as mentioned earlier, is the bound.
Convention: Type parameters are usually single uppercase letters.
If the generic function you're calling doesn't have any arguments that provide useful clues about the type parameter's type, you might have to spell it out using the turbofish
::<>.Operator: If your type parameter needs to support several traits, you can chain the needed traits together using the
+operator.#![allow(unused)] fn main() { fn top_ten<T: Debug + Hash + Eq>(values: &Vec<T>) { ... } }Generic functions can have multiple type parameters:
#![allow(unused)] fn main() { fn run_query<M: Mapper + Serialize, R: Reducer + Serialize>( data: &DataSet, map: M, reduce: R, ) -> Results { ... } }Keyword: The type parameter bounds in the above
run_queryfunction are way too long and it makes it less readable. Thewherekeyword allows us to move the bounds outside of the<>:#![allow(unused)] fn main() { fn run_query<M, R>(data: &DataSet, map: M, reduce: R) -> Results where M: Mapper + Serialize, R: Reducer + Serialize { ... } }Shorthand: The
whereclause can be used anywhere bounds are permitted: generic structs, enums, type aliases, methods, etc.A generic function can have both lifetime parameters and type parameters. Lifetime parameters come first:
#![allow(unused)] fn main() { // Return a ref to the point in `candidates` that's closest to `target` fn nearest<'t, 'c, P>(target: &'t P, candidates: &'c [P]) -> &'c P where P: MeasureDistance { ... } }Which to Use
Tip: Trait objects are the right choice whenever you need a collection of values of mixed types, all together. (think salad)
Generics have two major advantages over trait objects:
- Speed. When the compiler generates machine code for a generic function, it knows which types it's working with, so it knows at that time which write method to call. No need for dynamic dispatch. Whereas with trait objects, Rust never knows what type of value a trait object points to until runtime.
- Not every trait can support trait objects.
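To make the trade-off concrete, here is a small sketch that follows the salad analogy. The `Vegetable` trait and the two types are invented for the example: the trait-object version can mix concrete types in one collection, while the generic version requires a single concrete type and is dispatched statically.

```rust
// Hypothetical trait and types, made up for this illustration.
trait Vegetable {
    fn name(&self) -> String;
}

struct Lettuce;
struct Tomato;

impl Vegetable for Lettuce {
    fn name(&self) -> String {
        "lettuce".to_string()
    }
}

impl Vegetable for Tomato {
    fn name(&self) -> String {
        "tomato".to_string()
    }
}

// Trait objects: one slice can hold values of different concrete types.
// Each call to `name` goes through the vtable (dynamic dispatch).
fn salad_names(salad: &[Box<dyn Vegetable>]) -> Vec<String> {
    salad.iter().map(|v| v.name()).collect()
}

// Generics: every element has the same concrete type V, so the compiler
// knows exactly which `name` to call (static dispatch).
fn uniform_names<V: Vegetable>(items: &[V]) -> Vec<String> {
    items.iter().map(|v| v.name()).collect()
}

fn main() {
    let salad: Vec<Box<dyn Vegetable>> = vec![Box::new(Lettuce), Box::new(Tomato)];
    println!("{:?}", salad_names(&salad));
    println!("{:?}", uniform_names(&[Tomato, Tomato]));
}
```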
Defining and Implementing Traits
Defining a trait is just a matter of giving it a name and a list of type signatures of the trait's methods.
#![allow(unused)] fn main() { /// A trait for entities in a videogame's world that are displayed on a screen trait Visible { /// Render the object on the given canvas fn draw(&self, canvas: &mut Canvas); /// Return true if clicking at (x, y) should select this object fn hit_test(&self, x: i32, y: i32) -> bool; } }Syntax: The syntax for implementing a trait is the following:
impl TraitName for TypeImplementing the
Visibletrait for theBroomtype might look like this:#![allow(unused)] fn main() { impl Visible for Broom { fn draw(&self, canvas: &mut Canvas) { for y in self.y - self.height - 1 .. self.y { canvas.write_at(self.x, y, '|'); } canvas.write_at(self.x, self.y, 'M'); } fn hit_test(&self, x: i32, y: i32) -> bool { self.x == x && self.y - self.height - 1 <= y && y <= self.y } } }Default Methods
Term: Methods listed within traits can have default implementations. In such cases, it's not required that a type implementing the trait explicitly define the method.
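A minimal sketch of a default method, using a made-up `Greeter` trait: one implementor relies on the default `greet`, the other overrides it.

```rust
// Hypothetical trait, invented for this illustration.
trait Greeter {
    // Required method: every implementor must define it.
    fn name(&self) -> String;

    // Default method: implementors get this unless they override it.
    fn greet(&self) -> String {
        format!("Hello, {}!", self.name())
    }
}

struct English;
struct Pirate;

impl Greeter for English {
    // Only the required method is defined; `greet` uses the default.
    fn name(&self) -> String {
        "world".to_string()
    }
}

impl Greeter for Pirate {
    fn name(&self) -> String {
        "matey".to_string()
    }

    // Overrides the default implementation.
    fn greet(&self) -> String {
        format!("Ahoy, {}!", self.name())
    }
}

fn main() {
    println!("{}", English.greet());
    println!("{}", Pirate.greet());
}
```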
Traits and Other People's Types
Rule: Rust lets you implement any trait on any type, as long as either the trait or the type is introduced in the current crate. This is called the coherence rule (also commonly known as the orphan rule). It helps Rust ensure that trait implementations are unique.
Term: A trait that adds a single method to a type is called an extension trait.
Generic impl blocks can be used to add an extension trait to a whole family of types at once.
    // Add the `write_html` method to all types that implement `Write`
    use std::io::{self, Write};

    // Trait for values to which you can send HTML
    trait WriteHtml {
        fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()>;
    }

    // Add the HTML write capability to any std::io writer
    impl<W: Write> WriteHtml for W {
        fn write_html(&mut self, html: &HtmlDocument) -> io::Result<()> { ... }
    }
Self in Traits
Traits can use the keyword
Selfas a type.#![allow(unused)] fn main() { pub trait Clone { fn clone(&self) -> Self; } }A trait that uses the
Selftype is incompatible with trait objects.#![allow(unused)] fn main() { // ERR: the trait `Spliceable` cannot be made into an object fn splice_anything(left: &Spliceable, right: &Spliceable) { let combo = left.splice(right); ... } }Subtraits
We can declare that a trait is an extension of another trait.
#![allow(unused)] fn main() { // A living item in our videogame world trait Creature: Visible { fn position(&self) -> (i32, i32); fn facing(&self) -> Direction; ... } }Static Methods
Traits can include static methods and constructors.
#![allow(unused)] fn main() { trait StringSet { // constructor fn new() -> Self; // static method fn from_slice(strings: &[&str]) -> Self; } }Trait objects don't support static methods.
Fully Qualified Method Calls
Term: A qualified method call is one that specifies the type or trait that a method is associated with. A fully qualified method call is one that specifies both type and trait.
| Method Call | Qualification |
|---|---|
| "hello".to_string() | unqualified |
| str::to_string("hello") | qualified |
| ToString::to_string("hello") | qualified |
| <str as ToString>::to_string("hello") | fully qualified |
When You Need Them
Generally, you'll use
value.method()to call a method, but occasionally you'll need a qualified method call:
- When two methods have the same name:
#![allow(unused)] fn main() { // Outlaw is a type that implements Visible and HasPistol, both of which have a `draw` method let mut outlaw = Outlaw::new(); outlaw.draw(); // ERR: draw on the screen or draw pistol? Visible::draw(&outlaw); // OK! HasPistol::draw(&outlaw); // OK! }
- When the type of the
selfargument can't be inferred:#![allow(unused)] fn main() { let zero = 0; // all we know so far is this could be i8, u8, i32, etc zero.abs(); // ERR: which `.abs()` should be called? i64::abs(zero); // OK! }
- When using the function itself as the function value:
#![allow(unused)] fn main() { let words: Vec<String> = line.split_whitespace() .map(<str as ToString>::to_string) // OK! .collect(); }
- When calling trait methods in macros.
Traits That Define Relationships Between Types
Traits can be used in situations where there are multiple types that have to work together. They can describe relationships between types.
Associated Types (or How Iterators Work)
Rust's standard iterator trait looks a little like this:
#![allow(unused)] fn main() { trait Iterator { type Item; fn next(&mut self) -> Option<Self::Item>; } }Term: In the
Iterator trait, Item is called an associated type. Each type that implements Iterator must specify what type of item it produces.
The implementation of Iterator for std::env::Args looks a bit like this:
    impl Iterator for Args {
        // the associated `Item` type for `Args` is a `String`
        type Item = String;
        fn next(&mut self) -> Option<String> { ... }
    }
Bounds can be placed on a trait's associated type.
#![allow(unused)] fn main() { fn dump<I>(iter: I) where I: Iterator, I::Item: Debug { ... } }Or, we can place bounds on an associated type as if it were a generic type parameter of the trait:
#![allow(unused)] fn main() { fn dump<I>(iter: I) where I: Iterator<Item=String> { ... } }Use Case: Associated types are perfect for cases where each implementation has one specific related type.
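As a small worked example, the sketch below implements Iterator for a made-up `Countdown` type, then consumes it through a function that bounds the associated type with `Iterator<Item = u32>`. Both names are hypothetical.

```rust
// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // Each implementation picks exactly one Item type.
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

// Accepts any iterator whose associated Item type is u32.
fn total<I>(iter: I) -> u32
where
    I: Iterator<Item = u32>,
{
    iter.sum()
}

fn main() {
    let values: Vec<u32> = Countdown { remaining: 3 }.collect();
    println!("{:?}", values);
    println!("{}", total(Countdown { remaining: 4 }));
}
```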
Generic Traits (or How Operator Overloading Works)
The trait signature for Rust's multiplication method looks a bit like this:
#![allow(unused)] fn main() { pub trait Mul<RHS=Self> { ... } }The syntax
RHS=Selfmeans that the type parameterRHSdefaults toSelf.Buddy Traits (or How
rand::random()Works)Term: Traits that are designed to work together are called buddy traits.
A good example of buddy trait use is in the rand crate, particularly its random() function, which returns a random value:
    let x = rand::random();
Rust wouldn't be able to infer the type of x, so we'd need to specify it with the turbofish:
    let x = rand::random::<f64>(); // float between 0.0 and 1.0
    let b = rand::random::<bool>(); // true or false
But rand has many different kinds of random number generators (RNGs). They all implement the same trait, Rng:
    // An Rng is just a value that can spit out integers on demand
    pub trait Rng {
        fn next_u32(&mut self) -> u32;
        ...
    }
There are lots of implementations of Rng: XorShiftRng, OsRng, etc.
The Rng trait has a buddy trait called Rand:
    // A type that can be randomly generated using an `Rng`
    pub trait Rand: Sized {
        fn rand<R: Rng>(rng: &mut R) -> Self;
    }
Randis implemented by the types that are produced byRng:u64,bool, etc.Ultimately,
rand::random()is just a thin wrapper that passes a globally allocatedRngtoRand::rand():#![allow(unused)] fn main() { pub fn random<T: Rand>() -> T { T::rand(&mut global_rng()) } }Concept: When you see traits that use other traits as bounds, the way
Rand::rand()usesRng, you know those two traits are mix-and-match (buddy traits). AnyRngcan generate values of everyRandtype.Operator Overloading
Go here for references.
Arithmetic and Bitwise Operators
Here's the definition of
std::ops::Add:#![allow(unused)] fn main() { trait Add<RHS=Self> { type Output; fn add(self, rhs: RHS) -> Self::Output; } }In other words, the trait
Add<T>is the ability to add aTvalue to yourself.We could implement
Addgenerically for theComplexnumber type like this:#![allow(unused)] fn main() { use std::ops::Add; impl<T> Add for Complex<T> where T: Add<Output=T> { type Output = Self; fn add(self, rhs: Self) -> Self { Complex { re: self.re + rhs.re, im: self.im + rhs.im } } } }Unary Operators
The two overloadable unary operators (
!and-) are defined like this:#![allow(unused)] fn main() { trait Not { type Output; fn not(self) -> Self::Output; } trait Neg { type Output; fn neg(self) -> Self::Output; } }An implementation of
NegforComplexvalues might look like this:#![allow(unused)] fn main() { impl<T, O> Neg for Complex<T> where T: Neg<Output=O> { type Output = Complex<O>; fn neg(self) -> Complex<O> { Complex { re: -self.re, im: -self.im } } } }Binary Operators
The definition of
std::ops::BitXorlooks like this:#![allow(unused)] fn main() { trait BitXor<RHS=Self> { type Output; fn bitxor(self, rhs: RHS) -> Self::Output; } }Compound Assignment Operators
Warning: Unlike in some other languages, the value of a compound assignment expression is always (). For example, x += y evaluates to ().
The definition of std::ops::AddAssign looks like this:
    trait AddAssign<RHS=Self> {
        fn add_assign(&mut self, rhs: RHS);
    }
An implementation of AddAssign for Complex values might look like this:
    impl<T> AddAssign for Complex<T>
        where T: AddAssign<T>
    {
        fn add_assign(&mut self, rhs: Complex<T>) {
            self.re += rhs.re;
            self.im += rhs.im;
        }
    }
Warning: Overloading an arithmetic operator like Add does not automatically provide an implementation of its corresponding compound assignment operator, AddAssign.
Equality Tests
Since the
nemethod of thePartialEqtrait already has a default implementation, you'll only ever need to implement theeqmethod.
PartialEqtakes its values by reference.Here's the definition of
std::cmp::PartialEq:#![allow(unused)] fn main() { trait PartialEq<Rhs: ?Sized = Self> { fn eq(&self, other: &Rhs) -> bool; // `ne` has a default implementation fn ne(&self, other: &Rhs) -> bool { !self.eq(other) } } }Syntax: The
Rhs: ?Sized bound relaxes Rust's usual requirement that type parameters must be sized types, which lets us write traits like PartialEq<str> or PartialEq<[T]>.
Tip: In most cases, Rust can automatically implement PartialEq for your type if you add #[derive(PartialEq)].
Ordered Comparisons
Ordered comparison operators all stem from the
std::cmp::PartialOrdtrait, which is defined as:#![allow(unused)] fn main() { trait PartialOrd<Rhs = Self>: PartialEq<Rhs> where Rhs: ?Sized { fn partial_cmp(&self, other: &Rhs) -> Option<Ordering>; fn lt(&self, other: &Rhs) -> bool { ... } fn le(&self, other: &Rhs) -> bool { ... } fn gt(&self, other: &Rhs) -> bool { ... } fn ge(&self, other: &Rhs) -> bool { ... } } }Note that
PartialOrdis a subtrait ofPartialEq. Meaning you can perform ordered comparison only on types that can also be checked for equality.Also note that
partial_cmpis the only method of thePartialOrdtrait that doesn't have a default implementation. This means when you want to implementPartialOrd, you only need to definepartial_cmp.
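Here is a sketch of implementing only partial_cmp and getting the comparison operators for free. The `Interval` type is a made-up example of a partial order: one interval is less than another only if it lies entirely before it, and overlapping intervals are unordered.

```rust
use std::cmp::Ordering;

// Hypothetical half-open interval [lower, upper), used for illustration.
#[derive(Debug, PartialEq)]
struct Interval {
    lower: i32,
    upper: i32,
}

impl PartialOrd for Interval {
    // Only partial_cmp is defined; lt, le, gt, and ge use their defaults.
    fn partial_cmp(&self, other: &Interval) -> Option<Ordering> {
        if self == other {
            Some(Ordering::Equal)
        } else if self.upper <= other.lower {
            Some(Ordering::Less)
        } else if self.lower >= other.upper {
            Some(Ordering::Greater)
        } else {
            None // overlapping intervals have no defined order
        }
    }
}

fn main() {
    let a = Interval { lower: 10, upper: 20 };
    let b = Interval { lower: 20, upper: 40 };
    let c = Interval { lower: 15, upper: 30 };
    println!("a < b: {}", a < b);
    println!("a < c: {}", a < c);
    println!("a >= c: {}", a >= c);
}
```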
IndexandIndexMutHere are the definitions of the traits associated with the index operator:
#![allow(unused)] fn main() { trait Index<Idx> { type Output: ?Sized; fn index(&self, index: Idx) -> &Self::Output; } trait IndexMut<Idx>: Index<Idx> { fn index_mut(&mut self, index: Idx) -> &mut Self::Output; } }The associated type
Outputspecifies what an index expression returns.Use Case: The most common use case for indexing and overloading the index operators is for collections.
In the Mandelbrot program, we accessed pixels with lines like this:
#![allow(unused)] fn main() { // current implementation treats pixels as a single row pixels[row * bounds.0 + column] = ...; // UGLY // what we want is to be able to access pixels as if it were a 2D array image[row][column] = ...; // BETTER! }To achieve improved indexing in the above code, we could write something like this:
#![allow(unused)] fn main() { // declare a struct that holds the pixels and the image dimensions #[derive(Debug)] struct Image<P> { width: usize, pixels: Vec<P>, } // add a static constructor to the Image type // the type parameter P is the pixel type impl<P> Image<P> where P: Default + Copy { fn new(width: usize, height: usize) -> Image<P> { Image { width, pixels: vec![P::default(); width * height], } } fn height(&self) -> usize { self.pixels.len() / self.width } } // now we implement Index and IndexMut // when we index into an Image<P>, we expect to get back a slice of P // indexing the slice will give an individual pixel impl<P> std::ops::Index<usize> for Image<P> { type Output = [P]; fn index(&self, row: usize) -> &Self::Output { let start = row * self.width; &self.pixels[start .. start + self.width] } } impl<P> std::ops::IndexMut<usize> for Image<P> { fn index_mut(&mut self, row: usize) -> &mut [P] { let start = row * self.width; &mut self.pixels[start .. start + self.width] } } // Create an image 3 pixels wide and 3 pixels tall let mut image = Image::<u32>::new(3, 3); println!("image height {}", image.height()); // Draw a diagonal line through the image for i in 0 .. image.width { image[i][i] = 255; } println!("{:?}", image); }Other Operators
The dereferencing operator (
*val) and the dot operator for accessing fields and calling methods (val.fieldandval.method()), can be overloaded using theDerefandDerefMuttraits.Utility Traits
Drop
You can customize how Rust drops values of your type by implementing the std::ops::Drop trait:
    trait Drop {
        fn drop(&mut self);
    }
Implicit Behavior: The drop method of the Drop trait is called implicitly by Rust; if you try to call it yourself, it'll be flagged as an error.
You'll never need to implement Drop unless you're defining a type that owns resources Rust doesn't already know about.
Warning: If a type implements Drop, it cannot implement Copy.
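The sketch below shows drop being called implicitly when values go out of scope. The `Noisy` type and its shared log are invented for the example; the log just makes the implicit calls observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its own name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which the two values below were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Noisy { name: "a", log: Rc::clone(&log) };
        let _b = Noisy { name: "b", log: Rc::clone(&log) };
        // Rust calls drop here, at the end of the block.
        // Local variables are dropped in reverse order of declaration.
    }
    let result = log.borrow().clone();
    result
}

fn main() {
    println!("{:?}", drop_order());
}
```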
SizedTerm: A type whose values all have the same size in memory is called a sized type. In Rust, almost all types are sized types.
All sized types implement the
std::marker::Sizedtrait, which has no methods nor associated types. Rust implements it automatically for all types to which it applies; you can't implement it yourself.Use Case: The only use for the
Sizedtrait is as a bound for type variables: a bound likeT: SizedrequiresTto be a type whose size is known at compile time.Term: A trait that can only be used as a type parameter bound, and cannot be explicitly implemented (like
Sized), is called a marker trait.Implicit Behavior: Since unsized types are so limited, Rust implicitly assumes that generic type parameters have a
Sized bound. This means that when you write struct S<T>, Rust assumes you mean struct S<T: Sized>.
Syntax: Since Rust assumes all type parameters have a Sized bound, you have to explicitly opt out of it using the ?Sized syntax: struct S<T: ?Sized>.
CloneThe
std::clone::Clonetrait is for types that can make copies of themselves. It's a subtrait ofSizedand is defined like this:#![allow(unused)] fn main() { trait Clone: Sized { fn clone(&self) -> Self; fn clone_from(&mut self, source: &Self) { *self = source.clone() } } }Warning: Cloning values can be computationally expensive!
The
clone_frommethod modifiesselfinto a copy ofsource.Convention: In generic code, you should use
clone_fromwhenever possible.Tip: If your
Cloneimplementation simply appliescloneto each field of your type, then Rust can implement it for you by adding#[derive(Clone)]above your type definition.Warning: The
clonemethod of types that implementClonemust be infallible!
CopyA type is
Copyif it implements thestd::marker::Copymarker trait, a subtrait ofCloneand defined as:#![allow(unused)] fn main() { trait Copy: Clone {} }Tip: Like
Clone,Copycan be automatically implemented using#[derive(Copy)].
DerefandDerefMutYou can specify how dereferencing operators like
*and.behave on your types by implementing thestd::ops::Derefandstd::ops::DerefMuttraits:#![allow(unused)] fn main() { trait Deref { type Target: ?Sized; fn deref(&self) -> &Self::Target; } // DerefMut is a subtrait of Deref trait DerefMut: Deref { fn deref_mut(&mut self) -> &mut Self::Target; } }Term: If inserting a
deref call prevents a type mismatch, Rust inserts one for you. These are called deref coercions: one type is "coerced" into behaving as another.
Add notes about the implications of deref coercions.
Rust will apply several deref coercions in succession if necessary.
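For instance, in the sketch below a function that takes &str is called with a reference to an Rc<String>. Rust first coerces &Rc<String> to &String, then &String to &str. The `shout` function is a made-up helper.

```rust
use std::rc::Rc;

// Hypothetical helper that only accepts a string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let shared: Rc<String> = Rc::new(String::from("hello"));
    // Two coercions in a row: &Rc<String> -> &String -> &str
    println!("{}", shout(&shared));

    let boxed: Box<String> = Box::new(String::from("hi"));
    // Same idea with Box: &Box<String> -> &String -> &str
    println!("{}", shout(&boxed));
}
```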
Use Case: The
Deref and DerefMut traits are designed for implementing smart pointer types like Box, Rc, and Arc, and types that serve as owning versions of something you would frequently use by reference, the way Vec<T> and String serve as owning versions of [T] and str.
Anti-Pattern: Do not implement
DerefandDerefMutfor a type just to make theTargettype's methods appear on it automatically, the way a C++ base class's methods are visible on a subclass.Warning: Rust applies deref coercions to resolve type conflicts, but it does not apply them to satisfy bounds on type variables.
Example
Say we have a struct called
Selector<T>that has a fieldelements: Vec<T>and a field namedcurrent: usizethat behaves like a pointer to the current element.Given a value
sof typeSelector<T>, we want to be able to do these things:
- Use the expression
*sto get the value of the current element- Apply methods implemented by the type of the currently pointed to element
- Change the value of the currently pointed to element with
*s = '?'#![allow(unused)] fn main() { use std::ops::{Deref, DerefMut}; struct Selector<T> { elements: Vec<T>, current: usize, } // Implementing Deref allows us to use *s to get the current element impl<T> Deref for Selector<T> { type Target = T; fn deref(&self) -> &T { &self.elements[self.current] } } // Implementing DerefMut allows us to set the value of the current element impl<T> DerefMut for Selector<T> { fn deref_mut(&mut self) -> &mut T { &mut self.elements[self.current] } } let mut s = Selector { elements: vec!['x', 'y', 'z'], current: 2 }; println!("current element {}", *s); println!("is alphabetic? {}", s.is_alphabetic()); *s = 'w'; println!("current element {}", *s); }
DefaultTypes with a reasonably obvious default value can implement the
std::default::Defaulttrait:#![allow(unused)] fn main() { trait Default { fn default() -> Self; } }All of Rust's collection types (like
Vec,HashMap,BinaryHeap, etc) implementDefault, withdefaultmethods that return an empty collection.Use Case:
Defaultis commonly used to produce default values for structs that represent a large collection of parameters, most of which you won't usually want to change. (thinkoptionsobjects in JS)A perfect example of making good use of
Default is when using the OpenGL crate called glium. Drawing with OpenGL requires a ton of parameters, most of which you don't care about. So, we can use Default to provide those parameters for us:
    let params = glium::DrawParameters {
        line_width: Some(0.02),
        point_size: Some(0.02),
        .. Default::default()
    };
    target.draw(..., &params).unwrap();
Tip: Rust does not implicitly implement
Defaultfor struct types, but if all of a struct's fields implementDefault, you can implementDefaultfor the struct automatically using#[derive(Default)].
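A self-contained sketch of the same idea without any external crate: a made-up `RenderOptions` struct derives Default, and struct update syntax fills in every field that isn't set explicitly.

```rust
// Hypothetical options struct, invented for this illustration.
#[derive(Debug, Default, PartialEq)]
struct RenderOptions {
    width: u32,
    height: u32,
    title: String,
    fullscreen: bool,
}

// Set only the fields we care about; the rest come from Default.
fn small_window() -> RenderOptions {
    RenderOptions {
        width: 640,
        height: 480,
        ..Default::default()
    }
}

fn main() {
    println!("{:?}", RenderOptions::default());
    println!("{:?}", small_window());
}
```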
AsRefandAsMutWhen a type implements
AsRef<T>, that means you can borrow a &T from it efficiently; AsMut is the analogue for mutable references:
    trait AsRef<T: ?Sized> {
        fn as_ref(&self) -> &T;
    }
    trait AsMut<T: ?Sized> {
        fn as_mut(&mut self) -> &mut T;
    }
Use Case:
AsRefis typically used to make functions more flexible in the argument types they accept. For instance,std::fs::File::openis declared like this:#![allow(unused)] fn main() { fn open<P: AsRef<Path>>(path: P) -> Result<File>; }Concept: You can think of using
AsRefas a type bound as a bit like function overloading.In the above use case, what
openreally wants is a&Path, the type representing a filesystem path. But with the above declaration,openaccepts anything it can borrow a&Pathfrom--that is, anything that implementsAsRef<Path>.Use Case: It only makes sense for a type to implement
AsMut<T>if modifying the givenTcannot violate the type's invariants.Anti-Pattern: You should avoid defining your own
AsFootraits when you could just implementAsRef<Foo>instead.
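The sketch below uses the same bound as File::open on a made-up helper, `extension_of`, so that it accepts a &str, a String, or a PathBuf without the caller converting anything.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: accepts anything a &Path can be borrowed from.
fn extension_of<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref()
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}

fn main() {
    // All three argument types implement AsRef<Path>.
    println!("{:?}", extension_of("notes.txt"));
    println!("{:?}", extension_of(String::from("src/main.rs")));
    println!("{:?}", extension_of(PathBuf::from("README")));
}
```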
BorrowandBorrowMutThe
std::borrow::Borrowtrait is similar toAsRef: if a type implementsBorrow<T>, then itsborrowmethod efficiently borrows a&Tfrom it. ButBorrowimposes more restrictions.Use Case: A type should implement
Borrow<T> only when a &T hashes and compares the same way as the value it's borrowed from. Borrow is valuable in dealing with keys in hash tables and trees, or when dealing with values that will be hashed or compared for some other reason.
Come back to the example implementation related to
HashMap.
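In the meantime, here is a rough sketch of how Borrow shows up with HashMap. The `lookup` function is a made-up wrapper whose bounds mirror the shape of HashMap::get: because String implements Borrow<str>, a map keyed by String can be queried with a plain &str, with no need to allocate a String for the lookup.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical wrapper with the same kind of bounds as HashMap::get:
// the key type K must be borrowable as the query type Q.
fn lookup<'a, K, Q, V>(map: &'a HashMap<K, V>, key: &Q) -> Option<&'a V>
where
    K: Borrow<Q> + Eq + Hash,
    Q: Eq + Hash + ?Sized,
{
    map.get(key)
}

fn main() {
    let mut ages: HashMap<String, u32> = HashMap::new();
    ages.insert(String::from("ada"), 36);

    // K = String, Q = str: the &str is hashed and compared directly.
    println!("{:?}", lookup(&ages, "ada"));
    println!("{:?}", lookup(&ages, "bob"));
}
```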
FromandIntoThe
std::convert::From and std::convert::Into traits represent conversions that consume a value of one type, and return a value of another.
From and Into do not borrow; they take ownership of their argument, transform it, and then return ownership of the result back to the caller:
    trait Into<T>: Sized {
        fn into(self) -> T;
    }
    trait From<T>: Sized {
        fn from(value: T) -> Self;
    }
Use Case:
Intois generally used to make functions more flexible in the arguments they accept. For example, in this code, thepingfunction can accept any typeAthat can be converted into anIpv4Addr:#![allow(unused)] fn main() { fn ping<A>(address: A) -> std::io::Result<bool> where A: Into<Ipv4Addr> { let ipv4_address = address.into(); ... } }Use Case: The
frommethod ofFromserves as a generic constructor for producing an instance of a type from some other single value.Tip: Given an appropriate
From implementation, you get the corresponding Into implementation for free!
Warning:
fromandintooperations must be infallible!
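A small sketch tying these points together, using made-up `Celsius` and `Fahrenheit` newtypes: only From is implemented, yet into() works too, and a function bounded by Into<Celsius> accepts either type.

```rust
// Hypothetical newtypes, invented for this illustration.
#[derive(Debug, PartialEq)]
struct Celsius(f64);

struct Fahrenheit(f64);

// Only From is written by hand; Into<Celsius> for Fahrenheit comes for free.
impl From<Fahrenheit> for Celsius {
    fn from(f: Fahrenheit) -> Celsius {
        Celsius((f.0 - 32.0) * 5.0 / 9.0)
    }
}

// Accepts any type that can be converted into Celsius,
// including Celsius itself (every type converts into itself).
fn is_freezing<T: Into<Celsius>>(temp: T) -> bool {
    let celsius: Celsius = temp.into();
    celsius.0 <= 0.0
}

fn main() {
    let boiling = Celsius::from(Fahrenheit(212.0));
    println!("{:?}", boiling);
    println!("{}", is_freezing(Fahrenheit(14.0)));
    println!("{}", is_freezing(Celsius(20.0)));
}
```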
ToOwnedThe
ToOwnedtrait is an alternative toClone. Types that implementClonemust be sized. Thestd::borrow::ToOwnedtrait provides a slightly looser way to convert a reference to an owned value:#![allow(unused)] fn main() { trait ToOwned { type Owned: Borrow<Self>; fn to_owned(&self) -> Self::Owned; } }
BorrowandToOwnedat Work: The HumbleCowIn some cases, you cannot decide whether to borrow or own a value until the program is running. The
std::borrow::Cowtype ("clone on write") provides one way to do this:#![allow(unused)] fn main() { enum Cow<'a, B: ?Sized + 'a> where B: ToOwned { Borrowed(&'a B), Owned(<B as ToOwned>::Owned), } }Concept: A
Cow<B>either borrows a shared reference toB, or owns a value from which we could borrow such a reference.Use Case: One common use for
Cowis to return either a statically allocated string constant or a computed string.Example
Suppose you need to convert an error enum to a message via a function called
describe.Most variants can be handled with fixed strings, but others have additional data that should be included in the message. For such a case, you can return
Cow<'static, str>.Using
Cowhelpsdescribeand its callers put off allocation until the moment it becomes necessary.#![allow(unused)] fn main() { use std::borrow::Cow; fn describe(error: &Error) -> Cow<'static, str> { match *error { Error::OutOfMemory => "out of memory".into(), Error::StackOverflow => "stack overflow".into(), Error::MachineOnFire => "machine on fire".into(), Error::Unfathomable => "machine bewildered".into(), Error::FileNotFound(ref path) => { format!("file not found: {}", path.display()).into() } } } }Closures
Capturing Variables
Closures that Borrow
Simple example:
#![allow(unused)] fn main() { fn sort_by_statistic(cities: &mut Vec<City>, stat: Statistic) { cities.sort_by_key(|city| -city.get_statistic(stat)); } }Closures that Steal
Say we wanted to create a function that sorts a list of cities in a separate thread. It might look something like this:
#![allow(unused)] fn main() { fn start_sorting_thread(mut cities: Vec<City>, stat: Statistic) -> thread::JoinHandle<Vec<City>> { // take ownership of stat let key_fn = move |city: &City| -> i64 { -city.get_statistic(stat) }; // take ownership of cities and key_fn thread::spawn(move || { cities.sort_by_key(key_fn); cities }) } }In the above example, we had to add the
movekeyword before each closure.Keyword: The
movekeyword tells Rust that a closure doesn't borrow the variables it uses: it steals them.Rust therefore offers two ways for closures to get data from enclosing scopes:
- Moves
- Borrowing
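The two capture modes can be sketched side by side. The function names count_above and sum_in_thread are hypothetical; the first closure only borrows threshold, while the second must own values because the spawned thread may outlive the caller's stack frame.

```rust
use std::thread;

// Borrowing: the closure reads `threshold` through a shared reference.
fn count_above(values: &[i32], threshold: i32) -> usize {
    let is_above = |v: &&i32| **v > threshold;
    values.iter().filter(is_above).count()
}

// Moving: `move` transfers ownership of `values` into the closure,
// which is what thread::spawn requires.
fn sum_in_thread(values: Vec<i32>) -> i32 {
    let handle = thread::spawn(move || values.iter().sum::<i32>());
    handle.join().expect("worker thread panicked")
}
```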
Function and Closure Types
Structs may have function-typed fields.
In memory, function values are just the memory address of the function's machine code.
A function can take another function as an argument:
#![allow(unused)] fn main() { // Given a list of cities and a test function, // return the number of cities that returned true from the test fn count_selected_cities( cities: &Vec<City>, test_fn: fn(&City) -> bool, ) -> usize { let mut count = 0; for city in cities { if test_fn(city) { count += 1; } } count } }Concept: Closures do not have the same type as functions!
Term: A value of type
fn(&City) -> boolis called a function pointer.The
count_selected_cities's type signature must be changed iftest_fnshould be a closure instead of a function value:#![allow(unused)] fn main() { fn count_selected_cities<F>(cities: &Vec<City>, test_fn: F) -> usize where F: Fn(&City) -> bool { let mut count = 0; for city in cities { if test_fn(city) { count += 1; } } count } }We've now genericized
count_selected_cities. It'll accept test_fn of any type F, as long as F implements the special trait Fn(&City) -> bool.
Concept: Every closure has its own type, because a closure may contain data: values either borrowed or stolen from enclosing scope. So, every closure has an ad hoc type created by the compiler. But every closure implements at least one of the closure traits (Fn, FnMut, or FnOnce), and closures that only read their captured values implement Fn.
Closure Performance
Closures aren't allocated on the heap unless you put them in a
Box,Vec, or other container.Closures and Safety
Closures that Kill
Basically, double free errors are impossible in Rust.
FnOnce
Concept: Closures that drop values are not allowed to implement Fn. Instead, they implement FnOnce, the trait of closures that can only be called once. The first time you call a FnOnce closure, the closure itself is used up.
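A minimal sketch of a closure that can only be FnOnce, assuming two hypothetical helpers, call_once and consume_and_measure. The closure drops its captured vector, so calling it a second time would be a use of a moved value and is rejected at compile time.

```rust
// Hypothetical helper: accepts any closure callable at least once.
fn call_once<F: FnOnce() -> usize>(f: F) -> usize {
    f()
}

fn consume_and_measure(words: Vec<String>) -> usize {
    // This closure owns `words` and drops it, so it only implements FnOnce.
    let f = move || {
        let n = words.len();
        drop(words);
        n
    };
    call_once(f)
    // A second `call_once(f)` here would not compile: `f` has been moved.
}
```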
FnMut
Concept: Closures that require mut access to a value, but don't drop any values, are FnMut closures.
Summary
These are the three categories of closures, from broadest to narrowest:
| Trait | Description |
|---|---|
| FnOnce | Can only be called once, if the caller owns the closure. |
| FnMut | Can be called multiple times if the closure itself is declared mut. |
| Fn | Can be called multiple times without restriction and also encompasses all fn functions. |
Callbacks
Here's an example program that implements a basic router:
```rust
use std::collections::HashMap;

struct Request {
    method: String,
    url: String,
    headers: HashMap<String, String>,
    body: Vec<u8>,
}

struct Response {
    code: u32,
    headers: HashMap<String, String>,
    body: Vec<u8>,
}

// Closures have no common size, so each callback is boxed as a trait object.
type RouteCallback = Box<dyn Fn(&Request) -> Response>;

struct BasicRouter {
    routes: HashMap<String, RouteCallback>,
}

impl BasicRouter {
    fn new() -> BasicRouter {
        BasicRouter { routes: HashMap::new() }
    }

    fn add_route<C>(&mut self, url: &str, callback: C)
        where C: Fn(&Request) -> Response + 'static
    {
        self.routes.insert(url.to_string(), Box::new(callback));
    }

    fn handle_request(&self, request: &Request) -> Response {
        match self.routes.get(&request.url) {
            // not_found_response is a helper that isn't shown in these notes
            None => not_found_response(),
            Some(callback) => callback(request),
        }
    }
}
```
Iterators
The
IteratorandIntoIteratorTraitsTerm: An iterator is any value that implements the
std::iter::Iterator trait. Put simply, an iterator is a value that produces a sequence of values.
Term: The values an iterator produces are called items.
Term: The code that receives an iterator's items is called a consumer.
The heart of the
Iterator trait is defined as:
```rust
trait Iterator {
    type Item; // the type of value the iterator produces

    fn next(&mut self) -> Option<Self::Item>;
    // ... a whole bunch of default methods
}
```
If there's a natural way to iterate over some type, it can implement std::iter::IntoIterator, whose into_iter method takes a value and returns an iterator over it:
```rust
trait IntoIterator {
    type Item; // the type of value the iterator produces
    type IntoIter: Iterator<Item = Self::Item>; // the type of the iterator value itself

    fn into_iter(self) -> Self::IntoIter;
}
```
Term: Any type that implements
std::iter::IntoIteratoris called an iterable.Under the hood, every for loop is just shorthand for calls to
IntoIteratorandIteratormethods:#![allow(unused)] fn main() { let elems = vec!["antimony", "arsenic", "aluminum", "selenium"]; // Iteration using a for loop.. println!("There's:"); for element in &elems { println!("- {}", element); } // Is actually just... println!("Again! There's:"); let mut iterator = (&elems).into_iter(); while let Some(element) = iterator.next() { println!("- {}", element); } }Implicit Behavior: All iterators automatically implement
IntoIterator, with aninto_itermethod that simply returns the iterator.Creating Iterators
iteranditer_mutMethodsMost collection types provide
iteranditer_mutmethods that return the natural iterators over the type, producing a shared or mutable reference to each item. The same applies to slices like&[T]and&strtoo.
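A short sketch of the difference, using two hypothetical helpers: total only needs shared references from iter, while double_all needs the mutable references that iter_mut hands out.

```rust
// iter() yields &i32, which is enough for reading.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

// iter_mut() yields &mut i32, so each element can be updated in place.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}
```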
IntoIteratorImplementationsThere are three main implementations of
IntoIterator.1. Shared Reference
Idiom: Given a shared reference to a collection,
into_iterreturns an iterator that produces shared references to its items.#![allow(unused)] fn main() { for element in &collection { ... } }2. Mutable Reference
Idiom: Given a mutable reference to a collection,
into_iterreturns an iterator that produces mutable references to the items.#![allow(unused)] fn main() { for element in &mut collection { ... } }3. By Value
Idiom: When passed a collection by value,
into_iterreturns an iterator that takes ownership of the collection and returns items by value; the items' ownership moves from the collection to the consumer, and the original collection is consumed in the process.#![allow(unused)] fn main() { for element in collection { ... } }Not every type provides all three iterator implementations.
Slices implement two of the three
IntoIteratorvariants; since they don't own their elements, there is no "by value" case.Use Case:
IntoIterator can be useful in generic code: you can use a bound like T: IntoIterator to restrict the type variable T to types that can be iterated over. Or, you can write T: IntoIterator<Item=U> to further require the iteration to produce a particular type U. For instance, we can create a dump function that receives an iterable whose items implement the Debug trait:
```rust
use std::fmt::Debug;

fn dump<T, U>(t: T)
    where T: IntoIterator<Item=U>, U: Debug
{
    for u in t {
        println!("{:?}", u);
    }
}

dump(vec!["garbage", "rubbish", "waste"]);
```
drainMethodsA lot of collection types provide a
drainmethod that takes a mutable reference to the collection and returns an iterator that passes ownership of each element to the consumer.#![allow(unused)] fn main() { use std::iter::FromIterator; let mut outer = "Earth".to_string(); let inner = String::from_iter(outer.drain(1..4)); println!("outer: {}", outer); println!("inner: {}", inner); }If you need to drain an entire sequence, use the full range,
.., as the argument.Other Iterator Sources
Iterator Adapters
Term: Given an iterator, the
Iteratortrait provides a huge selection of methods called adapters that consume one iterator and build a new one.
mapandfilterA
mapiterator passes each item to its closure by value, and in turn, passes along ownership of the closure's result to its consumer.A
filteriterator passes each item to its closure by shared reference, retaining ownership in case the item is selected to be passed on to its consumer.Concept: Calling an adapter on an iterator doesn't consume any items; it just returns a new iterator. The only way to actually get values is to call
next(or some other indirect method, likecollect, in which case no work takes place untilcollectstarts callingnext) on the iterator.
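The laziness described above can be observed with a counter. This is only a sketch: the function name calls_before_and_after is hypothetical, and a Cell is used so that the map closure can count its own invocations.

```rust
use std::cell::Cell;

// Returns (closure calls before collect, closure calls after collect, result).
fn calls_before_and_after() -> (usize, usize, Vec<i32>) {
    let calls = Cell::new(0);

    // Building the pipeline runs no closure at all.
    let adapted = vec![1, 2, 3, 4]
        .into_iter()
        .map(|n| {
            calls.set(calls.get() + 1);
            n * 10
        })
        .filter(|n| *n > 10);
    let before = calls.get();

    // Only collect, which calls next repeatedly, drives the closures.
    let collected: Vec<i32> = adapted.collect();
    (before, calls.get(), collected)
}
```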
filter_mapandflat_mapThe
filter_mapadapter is similar tomap, except that it lets its closure either transform the item into a new item or drop the item from the iteration. Thus, it's a bit like a combination offilterandmap.Use Case: The
filter_mapadapter is best in situations when the best way to decide whether to include an item in the iteration is to actually try to process it.The
flat_mapiterator produces the concatenation of the sequences the closure returns.
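A brief sketch of both adapters, with hypothetical function names. parse_valid decides whether to keep a word by actually trying to parse it, and all_chars concatenates the character sequences of several words.

```rust
// filter_map: keep only the words that parse as i32, already converted.
fn parse_valid(input: &str) -> Vec<i32> {
    input
        .split_whitespace()
        .filter_map(|w| w.parse::<i32>().ok())
        .collect()
}

// flat_map: each word yields a sequence of chars; the results are concatenated.
fn all_chars(words: &[&str]) -> String {
    words.iter().flat_map(|w| w.chars()).collect()
}
```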
scanThe
scanadapter resemblesmap, except that the closure is given a mutable value it can consult, and has the option of terminating the iteration early. The closure must return anOption, which thescaniterator takes as its next item.
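For instance, scan can produce running totals and stop as soon as a limit is exceeded. The function name running_totals_under and its limit parameter are assumptions made for this sketch.

```rust
// The mutable state `total` starts at 0; returning None ends the iteration.
fn running_totals_under(values: &[i32], limit: i32) -> Vec<i32> {
    values
        .iter()
        .scan(0, |total, &v| {
            *total += v;
            if *total > limit {
                None
            } else {
                Some(*total)
            }
        })
        .collect()
}
```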
takeandtake_whileThe
Iteratortrait'stakeandtake_whileadapters let you end an iteration after a certain number of items, or when a closure decides to cut things off.Both
take and take_while take ownership of an iterator and return a new iterator that passes along items from the first one, possibly ending the sequence earlier.
skipandskip_whileThe
Iteratortrait'sskipandskip_whilemethods are the complement oftakeandtake_while: they drop a certain number of items from the beginning of an iteration, or drop items until a closure finds one acceptable, and then pass the remaining items through unchanged.Use Case: One common use for
skip is to skip the command name when iterating over a program's command-line arguments:
```rust
for arg in std::env::args().skip(1) {
    println!("arg: {}", arg);
}
```
peekableA peekable iterator lets you peek at the next item that will be produced without actually consuming it. Almost any iterator can be turned into a peekable iterator by calling the
Iteratortrait'speekablemethod.Calling
peektries to draw the next item from the underlying iterator, and if there is one, caches it until the next call tonext.Use Case: Peekable iterators are essential when you can't decide how many items to consume from an iterator until you've gone too far. For example, if you're parsing numbers from a stream of characters, you can't decide where the number ends until you've seen the first non-number character:
#![allow(unused)] fn main() { use std::iter::Peekable; fn parse_number<I>(tokens: &mut Peekable<I>) -> u32 where I: Iterator<Item=char> { let mut n = 0; loop { match tokens.peek() { Some(r) if r.is_digit(10) => { n = n * 10 + r.to_digit(10).unwrap(); } _ => return n } tokens.next(); } } let mut chars = "10212980".chars().peekable(); println!("{}", parse_number(&mut chars)); }
fuseThe
fuseadapter takes any iterator and turns it into one that will definitely continue to returnNoneonce it has done so the first time.Use Case: The
fuseadapter is most useful in generic code that needs to work with iterators of an uncertain origin.Reversible Iterators and
revSome iterators are able to draw items from both ends of the sequence. You can reverse these iterators by using the
revadapter.Most iterator adapters, if applied to a reversible iterator, return another reversible iterator.
inspectThe
inspectadapter is handy for debugging pipelines of iterator adapters, but is rarely used in production code. It applies a closure to a shared reference to each item, and then passes the item through. The closure can't affect the items, but it can do things like print them or make assertions about them.
chainThe
chain adapter appends one iterator to another (think of the concat operator in RxJS). A chain iterator keeps track of whether each of the two underlying iterators has returned None, and directs next and next_back calls to one or the other as appropriate.
```rust
let v: Vec<_> = (1..4).chain(4..6).collect();
println!("{:?}", v);
```
enumerateThe
Iteratortrait'senumerateadapter attaches a running index to the sequence, taking an iterator that produces itemsA, B, C, ...and returning an iterator that produces pairs(0, A), (1, B), (2, C), ....
zipThe
zipadapter combines two iterators into a single iterator that produces pairs holding one value from each iterator. The zipped iterator ends when either of the two underlying iterators ends.
by_refAn iterator's
by_ref method borrows a mutable reference to the iterator, so that you can apply adapters to the reference. When you're done consuming items from these adapters, you drop them, the borrow ends, and you regain access to the original iterator.
by_refessentially provides a mechanism for starting and stopping iterators as needed.Concept: When you call an adapter on a mutable reference to an iterator, the adapter takes ownership of the reference, not the iterator itself.
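A sketch of that start-and-stop behavior, assuming a hypothetical split_header_body function that treats the first blank line of a message as the separator between header lines and body lines.

```rust
fn split_header_body(message: &str) -> (Vec<&str>, Vec<&str>) {
    let mut lines = message.lines();

    // take_while borrows `lines` through by_ref and stops at the blank line,
    // which it consumes.
    let header: Vec<&str> = lines.by_ref().take_while(|l| !l.is_empty()).collect();

    // The borrow has ended, so `lines` resumes where take_while left off.
    let body: Vec<&str> = lines.collect();
    (header, body)
}
```

Without by_ref, take_while would take ownership of lines and the remaining body lines would be unreachable.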
clonedThe
clonedadapter takes an iterator that produces references, and returns an iterator that produces values cloned from those references.
cycleThe
cycleadapter returns an iterator that endlessly repeats the sequence produced by the underlying iterator. The underlying iterator must implementstd::clone::Clone, so thatcyclecan save its initial state and reuse it each time the cycle starts again.#![allow(unused)] fn main() { use std::iter::{once, repeat}; let fizzes = repeat("").take(2).chain(once("fizz")).cycle(); let buzzes = repeat("").take(4).chain(once("buzz")).cycle(); let fizzes_buzzes = fizzes.zip(buzzes); let fizz_buzz = (1..100).zip(fizzes_buzzes) .map(|tuple| match tuple { (i, ("", "")) => i.to_string(), (_, (fizz, buzz)) => format!("{}{}", fizz, buzz) } ); for line in fizz_buzz { println!("{}", line); } }Consuming Iterators
Simple Accumulation:
count,sum,productThe
countmethod draws items from an iterator until it returnsNone, and tells you how many it got.The
sumandproductmethods compute the sum or product of the iterator's items, which must be integers or floating-point numbers.
max,minThe
max and min methods on Iterator return the greatest or least item the iterator produces, respectively. The iterator's item type must implement std::cmp::Ord, so that items can be compared with each other.
An implication of the
Ordbound is that these methods can't be used with floating-point values.
max_by,min_byThe
max_byandmin_bymethods return the maximum or minimum item an iterator produces, as determined by a comparator function you provide.
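One place a comparator helps is with floating-point items, which only implement PartialOrd. The sketch below uses a hypothetical function named largest and makes the assumption, stated in the code, that incomparable values (NaN) are treated as equal.

```rust
use std::cmp::Ordering;

// Assumption for this sketch: NaN comparisons fall back to Ordering::Equal.
fn largest(values: &[f64]) -> Option<f64> {
    values
        .iter()
        .copied()
        .max_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal))
}
```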
max_by_key,min_by_keyThe
max_by_keyandmin_by_keymethods onIteratorlet you select the maximum or minimum item as determined by a closure applied to each item.Comparing Item Sequences
anyandallThe
anyandallmethods apply a closure to each item the iterator produces, and returntrueif the closure returnstruefor any or all items, respectively.These methods consume only as many items as they need to determine the answer.
position,rposition, andExactSizeIteratorThe
position method applies a closure to each item from the iterator and returns the index of the first item for which the closure returns true, as an Option<usize>; it returns None if no item matches.
The rposition method does the same thing, but searches from the right, so it requires a reversible, exact-size iterator.
Term: An exact-size iterator is one that implements the
std::iter::ExactSizeIteratortrait:#![allow(unused)] fn main() { trait ExactSizeIterator: Iterator { fn len(&self) -> usize { ... } fn is_empty(&self) -> bool { ... } } }
foldThe
foldmethod is a very general tool for accumulating some sort of result over the entire sequence of items an iterator produces.
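As a sketch of how general fold is, a single pass can compute two results at once. The function name sum_and_max is hypothetical; the accumulator is a tuple holding the running sum and the largest item seen so far.

```rust
fn sum_and_max(values: &[i32]) -> (i32, Option<i32>) {
    values.iter().fold((0, None), |(sum, max), &v| {
        // Keep the previous maximum only if it is at least as large as v.
        let new_max = match max {
            Some(m) if m >= v => Some(m),
            _ => Some(v),
        };
        (sum + v, new_max)
    })
}
```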
nthThe
nthmethod takes an indexn, skips that many items from the iterator, and returns the next item, orNoneif the sequence ends before that point.Calling
.nth(0)is equivalent to calling.next().It doesn't take ownership of the iterator the way an adapter does, so you can call it many times.
lastThe
lastmethod consumes items until the iterator returnsNone, and then returns the last item. If the iterator produces no items, thenlastreturnsNone.Tip: If you have a reversible iterator and just want the last item, use
iter.rev().next()instead.
findThe
findmethod draws items from an iterator, returning the first item for which the given closure returnstrue, orNoneif the sequence ends before a suitable item is found.Building Collections:
collectandFromIteratorAn iterator's
collect method can build any kind of collection from Rust's standard library, as long as the iterator produces a suitable item type. The return type of collect is its type parameter.
When some collection type like Vec or HashMap knows how to construct itself from an iterator, it implements the std::iter::FromIterator trait, which collect calls behind the scenes:
```rust
trait FromIterator<A>: Sized {
    fn from_iter<T: IntoIterator<Item=A>>(iter: T) -> Self;
}
```
Concept: If a collection type implements
FromIterator<A>, then its static methodfrom_iterbuilds a value of that type from an iterable producing items of typeA.The
size_hintmethod ofIteratorreturns a lower bound and optional upper bound on the number of items the iterator will produce.The
ExtendTraitIf a type implements the
std::iter::Extendtrait, then itsextendmethod adds an iterable's items to the collection.All of the standard collections implement
Extend. Arrays and slices do not have this method because they have a fixed length.
partitionThe
partitionmethod divides an iterator's items among two collections, using a closure to decide where each item belongs.Whereas
collect requires its result type to implement FromIterator, partition instead requires std::default::Default and std::iter::Extend.
Collections
Strings and Text
Input and Output
Concept: All I/O in Rust is organized around 4 traits, owned by std::io:
- Read: Defines methods for byte-oriented input. Implementers are called readers.
- BufRead: Includes Read methods, plus methods for reading lines of text and so forth. Implementers are called buffered readers.
- Write: Defines methods for both byte-oriented and UTF-8 text output. Implementers are called writers.
- Seek: Defines a method for moving the current position within a stream.
Shortcut: All 4 traits are so commonly used that there's a prelude module containing only them. Just add:
#![allow(unused)] fn main() { use std::io::prelude::*; }Readers and Writers
One of the simplest, lowest-level uses of both
ReadandWriteis a function that copies data from any reader to any writer:#![allow(unused)] fn main() { use std::io::{self, Read, Write, ErrorKind}; const DEFAULT_BUF_SIZE: usize = 8 * 1024; fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64> where R: Read, W: Write { let mut buf = [0; DEFAULT_BUF_SIZE]; let mut written = 0; loop { let len = match reader.read(&mut buf) { Ok(0) => return Ok(written), Ok(len) => len, Err(ref e) if e.kind() == ErrorKind::Interrupted => continue, Err(e) => return Err(e), }; writer.write_all(&buf[..len])?; written += len as u64; } } }Shortcut: The import statement
use std::io::{self};declaresioas an alias to thestd::iomodule, which means we can write things likestd::io::Resultas justio::Result.Readers
All main methods defined by
Readtake the reader itself bymutreference. There are also four adapter methods that take thereaderby value and transform it into an iterator or a different reader.Note that there is no method for closing a reader. Readers and writers implement
Drop, so they are closed automatically.
Reader Method
reader.read(buffer)#![allow(unused)] fn main() { fn read(&mut self, buf: &mut [u8]) -> Result<usize> }Reads an undefined number of bytes from the data source and stores them in the given buffer. The
usizesuccess value is the number of bytes read, which might be less than or equal tobuffer.len(), even if there's still more data to read.If
read returns Ok(0), there's no more input to read.
On error,
readreturnsErr(err), whereerris anio::Errorvalue.io::Errorsare printable for humans. For computers, you should use the.kind()method, which returns an error code of typeio::ErrorKind.
io::ErrorKind is an enum with lots of different types of errors. Most variants indicate actual problems and shouldn't be ignored, but not all. io::ErrorKind::Interrupted corresponds to the EINTR Unix error code, which means the read was interrupted by a signal; in almost all scenarios the right response is simply to retry the read.
Reader Method
reader.read_to_end(&mut byte_vec)#![allow(unused)] fn main() { fn read_to_end(&mut self, buf: &mut Vec<u8>) -> Result<usize> }Reads all remaining input from the reader into a vector.
There's no limit on the amount of data that
read_to_endwill return, so it's usually a good idea to impose a limit using.take().
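A minimal sketch of that advice, assuming a hypothetical helper named read_at_most whose limit parameter is the maximum number of bytes the caller is willing to buffer.

```rust
use std::io::{self, Read};

// Wraps the reader in take(limit) so read_to_end cannot grow without bound.
fn read_at_most<R: Read>(reader: R, limit: u64) -> io::Result<Vec<u8>> {
    let mut bytes = Vec::new();
    reader.take(limit).read_to_end(&mut bytes)?;
    Ok(bytes)
}
```

Note that this silently truncates longer inputs; a caller that needs to detect truncation would have to check whether more data remains.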
Reader Method
reader.read_to_string(&mut string)#![allow(unused)] fn main() { fn read_to_string(&mut self, buf: &mut String) -> Result<usize> }Reads all remaining input from the reader into a string. If the source provides data that isn't valid UTF-8,
read_to_stringwill return anErrorKind::InvalidDataerror.
Reader Method
reader.read_exact(&mut buf)#![allow(unused)] fn main() { fn read_exact(&mut self, buf: &mut [u8]) -> Result<()> }Reads exactly enough data to fill the given buffer. If the reader runs out of data before reading
buf.len()bytes,read_exactreturns anErrorKind::UnexpectedEoferror.
Adapter
reader.bytes()
```rust
fn bytes(self) -> Bytes<Self> where Self: Sized
```
Converts a reader into an iterator over the bytes of the input stream. The item type is
io::Result<u8>, so an error check is required for every byte. It callsreader.read()one byte at a time, so this method is super inefficient if the reader isn't buffered.
Adapter
reader.chars()
```rust
fn chars(self) -> Chars<Self> where Self: Sized
```
Converts a reader into an iterator over the input stream as UTF-8 characters. Note that this adapter was never stabilized and is no longer available in current versions of the standard library.
Adapter
reader.chain(reader2)#![allow(unused)] fn main() { fn chain<R: Read>(self, next: R) -> Chain<Self, R> where Self: Sized }Creates a new reader that produces all of the input from
reader, followed by all of the input fromreader2.
Adapter
reader.take(n)#![allow(unused)] fn main() { fn take(self, limit: u64) -> Take<Self> where Self: Sized }Creates a new reader that reads from the same source as
reader, but is limited tonbytes of input.Buffered Readers
Buffered readers implement both
ReadandBufRead, which provides three main methods.Come back and add the type signatures of the following methods.
Buffered Reader Method
reader.read_line(&mut line)Reads a line of text and appends it to
line, which is of typeString.The method returns an
io::Result<usize>, where usize is the number of bytes read, including the line ending, if any.
If the reader is at the end of the input,
linewill be unchanged and the method will returnOk(0).
☆ Buffered Reader Method
reader.lines()Returns an iterator over the lines of the input.
The item type is
io::Result<String>. Newline characters are not included in the strings.
Buffered Reader Methods
reader.read_until(stop_byte, &mut byte_vec)andreader.split(stop_byte)Byte-oriented versions of
.read_line() and .lines(). They produce Vec<u8>s instead of Strings.
Reading Lines
We can use
.lines() to create a function that implements a minimal version of the Unix grep utility. Our function receives a generic reader (i.e. anything that implements BufRead).
```rust
use std::io;
use std::io::prelude::*;

fn grep<R>(target: &str, reader: R) -> io::Result<()>
    where R: BufRead
{
    for line_result in reader.lines() {
        let line = line_result?;
        if line.contains(target) {
            println!("{}", line);
        }
    }
    Ok(())
}
```
In the case that we want to use stdin as our source of data, we have to convert it to a buffered reader using its
.lock()method like so:#![allow(unused)] fn main() { let stdin = io::stdin(); grep(&target, stdin.lock())?; // ok }If we wanted to use our function with the contents of a file, we could do so like this:
#![allow(unused)] fn main() { let f = File::open(file)?; grep(&target, BufReader::new(f))?; // also ok }Collecting Lines
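One way to gather every line into a vector, sketched here with a hypothetical helper named collect_lines: since each item from lines() is an io::Result<String>, collecting into io::Result<Vec<String>> stops at the first error and returns it, instead of producing a vector of results.

```rust
use std::io::{self, BufRead};

// Result implements FromIterator, so collect can build
// io::Result<Vec<String>> directly from an iterator of io::Result<String>.
fn collect_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}
```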
Writers
To send output to a writer, use the
write!() and writeln!() macros.
```rust
writeln!(io::stderr(), "error: world not helloable")?;
writeln!(&mut byte_vec, "The greatest common divisor of {:?} is {}", numbers, d)?;
```
The write macros are the same as the print macros, with two differences:
- The write macros take an extra first argument, a writer.
- The write macros return a Result, so errors must be handled.
The Write trait has these methods:
Writer Method
writer.write(&buf)Writes some of the bytes in the slice
bufto the underlying stream.Returns an
io::Result<usize>.
On success, gives the number of bytes written, which may be less than
buf.len(), depending on the stream's mood.This is the lowest-level method and is usually not used in practice.
Writer Method
writer.write_all(&buf)Writes all the bytes in the slice
buf.Returns
Result<(), io::Error>.
Writer Method
writer.flush()Flushes any buffered data to the underlying stream.
Returns
Result<(), io::Error>.
Warning: When a
BufWriteris dropped, all remaining buffered data is written to the underlying writer. However, if an error occurs during this write, the error is ignored. To make sure errors don't get swallowed, always call.flush()on all buffered writers before dropping them.Files
We've got two main ways to open a file:
File Method
File::open(filename)Opens an existing file for reading. It's an error if the file doesn't exist.
Returns an
io::Result<File>.
File Method
File::create(filename)Creates a new file for writing. If a file exists with the given filename, it gets truncated.
Returns an
io::Result<File>.
There is an alternative that uses OpenOptions to specify the exact open behavior we want.
```rust
use std::fs::OpenOptions;

// Create a file if none exists, or append to an existing one
let log = OpenOptions::new()
    .create(true)
    .append(true)
    .open("server.log")?;

// Create a file, or fail if one with the specified name already exists
let new_file = OpenOptions::new()
    .write(true)
    .create_new(true)
    .open("new_file.txt")?;
```
Just like with readers and writers, you can add a buffer to a
File if needed.
Term: The method-chaining pattern seen with OpenOptions is called a builder in Rust.
Seeking
Files also implement the Seek trait, which means you can hop around within a File rather than reading or writing in a single pass from the beginning to the end.
Seekis defined like this:#![allow(unused)] fn main() { pub trait Seek { fn seek(&mut self, pos: SeekFrom) -> io::Result<u64>; } pub enum SeekFrom { Start(u64), End(i64), Current(i64), } }Seeking within a file is slow.
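A small sketch of seeking relative to the end of a stream. The helper name read_tail is hypothetical; it is written against any Read + Seek value, so it works with a File as well as with an in-memory io::Cursor.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Jumps to n bytes before the end, then reads everything that remains.
// Seeking to a position before the start of the stream returns an error.
fn read_tail<S: Read + Seek>(source: &mut S, n: i64) -> io::Result<Vec<u8>> {
    source.seek(SeekFrom::End(-n))?;
    let mut tail = Vec::new();
    source.read_to_end(&mut tail)?;
    Ok(tail)
}
```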
Other Reader and Writer Types
Add notes about common types of readers and writers.
Handy Readers and Writers
The std::io module offers a few functions that return trivial readers and writers.
io::sink()No-op writer. All the write methods return
Okand the data is discarded.
io::empty()No-op reader. Reading always succeeds and returns end-of-input.
io::repeat(byte)Creates a reader that repeats the given byte endlessly.
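The three helpers can be exercised together in a short sketch. The function name trivial_io_demo is hypothetical, and the tuple it returns exists only to make the behavior visible.

```rust
use std::io::{self, Read, Write};

// Returns (bytes "written" to sink, bytes read from repeat, bytes read from empty).
fn trivial_io_demo() -> io::Result<(usize, Vec<u8>, usize)> {
    // sink accepts the data, reports success, and discards it.
    let written = io::sink().write(b"discarded")?;

    // repeat never runs out, so read_exact can always fill the buffer.
    let mut filler = vec![0u8; 4];
    io::repeat(b'x').read_exact(&mut filler)?;

    // empty is immediately at end-of-input.
    let mut nothing = Vec::new();
    let read = io::empty().read_to_end(&mut nothing)?;

    Ok((written, filler, read))
}
```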
Binary Data, Compression, and Serialization
Go here for some crate recommendations.
Files and Directories
OsStrandPathRust strings are always valid Unicode. Filenames are almost always Unicode.
To solve the Unicode issue, Rust provides
std::ffi::OsStrandstd::ffi::OsString.
std::ffi::OsStr
OsStr is a string type that's a superset of UTF-8. Its sole purpose is to represent all filenames, CLI arguments, and environment variables on all systems.
std::path::Path
Pathis exactly likeOsStr, but it provides a bunch of handy filename-related methods.When to use which?
For absolute and relative paths, use
Path. For an individual component of a path, useOsStr.
Owning types
For each string type, there's always a corresponding owning type that owns heap-allocated data.
| String type | Owning type | Conversion method |
|---|---|---|
| str | String | .to_string() |
| OsStr | OsString | .to_os_string() |
| Path | PathBuf | .to_path_buf() |
All three of these string types implement a common trait,
AsRef<Path>, which makes it easy to declare a generic function that accepts "any filename type" as an argument.#![allow(unused)] fn main() { use std::path::Path; use std::io; fn open_file<P>(path_arg: P) -> io::Result<()> where P: AsRef<Path> { let path = path_arg.as_ref(); // ... } }
PathandPathBufMethodsPath Method
Path::new(str)#![allow(unused)] fn main() { fn new<S: AsRef<OsStr> + ?Sized>(s: &S) -> &Path }Converts a
&stror&OsStrto a&Path. The string doesn't get copied; the new&Pathpoints to the same bytes as the original argument.
Path Method
path.parent()#![allow(unused)] fn main() { fn parent(&self) -> Option<&Path> }Returns the path's parent directory, if any. The path doesn't get copied; the parent directory of
pathis always a substring ofpath.
Path Method
path.file_name()#![allow(unused)] fn main() { fn file_name(&self) -> Option<&OsStr> }Returns the last component of
path, if any.
Path Methods
path.is_absolute()andpath.is_relative()#![allow(unused)] fn main() { fn is_absolute(&self) -> bool fn is_relative(&self) -> bool }Tells you whether the path is absolute or relative.
Path Method
path1.join(path2)#![allow(unused)] fn main() { fn join<P: AsRef<Path>>(&self, path: P) -> PathBuf }Joins two paths. If
path2is an absolute path, it just returns a copy ofpath2.Use Case: The path
joinmethod can be used to turn any path into an absolute path.#![allow(unused)] fn main() { let abs_path = std::env::current_dir()?.join(any_path); }
Path Method
path.components()#![allow(unused)] fn main() { fn components(&self) -> Components }Creates an iterator over the components of the given path, from left to right. The
Itemtype of the iterator isstd::path::Component, which is an enum:#![allow(unused)] fn main() { pub enum Component<'a> { Prefix(PrefixComponent<'a>), RootDir, CurDir, ParentDir, Normal(&'a OsStr), } }Converting
Paths to StringsPath Method
path.to_str()#![allow(unused)] fn main() { fn to_str(&self) -> Option<&str> }If
pathisn't valid UTF-8, this method returnsNone.
Path Method
path.to_string_lossy()#![allow(unused)] fn main() { fn to_string_lossy(&self) -> Cow<str> }Basically the same as
to_str, but it'll always return a string regardless of whether or not the path is valid UTF-8. If the path isn't valid UTF-8, each invalid byte sequence is replaced with the Unicode replacement character, �.
Path Method
path.display()#![allow(unused)] fn main() { fn display(&self) -> Display }Doesn't return a string, but it implements
Display so that it can be used with the print! macro and friends.
Filesystem Access Functions
Reading Directories
To list the contents of a directory, use
std::fs::read_dir, or the .read_dir() method of a Path:
```rust
use std::path::Path;

// "." is only a placeholder directory for this snippet.
let path = Path::new(".");
for entry_result in path.read_dir()? {
    let entry = entry_result?;
    println!("{}", entry.file_name().to_string_lossy());
}
```
The std::fs::read_dir function has the following type signature:
```rust
fn read_dir<P: AsRef<Path>>(path: P) -> Result<ReadDir>
```
A
DirEntryis a struct with a few methods that have the following signatures:#![allow(unused)] fn main() { struct DirEntry(_); fn path(&self) -> PathBuf fn metadata(&self) -> Result<Metadata> fn file_type(&self) -> Result<FileType> fn file_name(&self) -> OsString }Platform-Specific Features
The
std::osmodule contains a bunch of platform-specific features, likesymlink.If you want code to compile on all platforms, with support for symbolic links on Unix, for instance, you must use
#[cfg] in the program as well. In such cases, it's easiest to import symlink on Unix, while defining a symlink stub on other systems:
```rust
use std::io;
use std::path::Path;

#[cfg(unix)]
use std::os::unix::fs::symlink;

// Stub implementation of symlink for platforms that don't have it
#[cfg(not(unix))]
fn symlink<P: AsRef<Path>, Q: AsRef<Path>>(src: P, _dst: Q) -> io::Result<()> {
    Err(io::Error::new(
        io::ErrorKind::Other,
        format!("can't copy symbolic link {}", src.as_ref().display()),
    ))
}
```
There's a
preludemodule that can be used to enable all Unix extensions at once:#![allow(unused)] fn main() { use std::os::unix::prelude::*; }Networking
For low-level networking code, start with the
std::netmodule.Go here for networking crate recommendations.
Concurrency
Macros
Unsafe Code
