Basic Types

Integer Types

u8 is used to represent single-byte values.

Characters are distinct from the numeric types (unlike C++); a char is neither a u8, nor an i8.

Values used as array access indices must be usize. The same applies to values that represent the size of arrays or vectors.

Integer literals can take a suffix indicating their type. The suffix can optionally be separated from the digits by an underscore, e.g.:

  • 42u8 is a u8 value
  • 1729isize and 1729_isize are both isize

Compiler behavior: When inferring the type of an integer literal that nothing else constrains, the compiler defaults to i32.

The following prefixes can be used with numeric literals to specify their radix:

  • 0x hexadecimal
  • 0o octal
  • 0b binary

Long numeric literals may be segmented by underscores for readability, e.g. 4_295_923_000_010 or 0xffff_0f0f.

Rust provides byte literals, which are character-like literals for u8 values: b'X' represents the ASCII code for the character X, but as a u8 value.

You can convert from one integer type to another using the as (type-cast) operator: 65535_u16 as i32
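The literal forms and the cast described in this section can be tried out together. This is a minimal sketch; the helper name literal_forms is made up for illustration.

```rust
// Hypothetical helper that gathers a few literal forms so they can be inspected.
fn literal_forms() -> (u8, isize, u32, u8, i32) {
    let suffixed = 42u8;            // type suffix directly after the digits
    let separated = 1729_isize;     // suffix set off by an underscore
    let hex = 0xffff_0f0f_u32;      // radix prefix plus digit separators
    let byte = b'X';                // byte literal: a u8 holding ASCII code 88
    let widened = 65535_u16 as i32; // conversion with the `as` operator
    (suffixed, separated, hex, byte, widened)
}

fn main() {
    println!("{:?}", literal_forms());
}
```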

Floating-Point Types

The fraction part of a floating-point literal may consist of a lone decimal point: 5. is a valid float constant.

Compiler behavior: Given a floating-point literal with no other type constraints, the compiler infers f64.

The bool Type

bool values can be converted to integer types using the as operator:


#![allow(unused)]
fn main() {
assert_eq!(false as i32, 0);
assert_eq!(true as i32, 1);
}

But the inverse is not true: the as operator can't convert numeric types to bool. You have to be explicit and write a comparison such as x != 0.
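A small sketch of the comparison approach, assuming a made-up helper named is_nonzero:

```rust
// Hypothetical helper: `as` cannot turn an integer into a bool, so compare instead.
fn is_nonzero(x: i32) -> bool {
    x != 0
}

fn main() {
    assert!(is_nonzero(7));
    assert!(!is_nonzero(0));
    // let b = 1 as bool; // would not compile: only bool-to-integer casts are allowed
    println!("{}", is_nonzero(-3));
}
```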

Rust uses an entire byte for a bool value in memory, so you can create a pointer to it.

Characters

Rust's character type char represents a single Unicode character, as a 32-bit value.

A char represents a single character in isolation, whereas strings and streams of text use UTF-8 encoded bytes. This means the String type represents a sequence of UTF-8 bytes, not chars.

A char literal is just a single Unicode character wrapped in single quotes, e.g. '©'.

The as operator can convert a char to an integer type (i32, u16, etc.), but in the opposite direction it only works from u8. For wider integers, use std::char::from_u32, which returns an Option<char> because not every u32 is a valid character.
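The conversions in both directions might look like this. The wrapper char_from_code is a hypothetical name; std::char::from_u32 is the standard library's checked conversion.

```rust
// Hypothetical wrapper around the checked conversion from the standard library.
fn char_from_code(code: u32) -> Option<char> {
    std::char::from_u32(code)
}

fn main() {
    assert_eq!('*' as i32, 42);                  // char to integer
    assert_eq!(97u8 as char, 'a');               // u8 to char
    assert_eq!(char_from_code(0xA9), Some('©')); // valid code point
    assert_eq!(char_from_code(0xD800), None);    // surrogate, not a valid char
}
```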

Tuples

Tuple elements cannot be accessed using dynamic indices. That is to say, given tuple t, I can't use variable i to access the ith element.

Term: The type () is called the unit type.

Rust uses the unit type where there's no meaningful value to carry but the context still requires a type, e.g. a function that returns no value has a return type of ().

Shorthand: A function declaration whose return type is omitted is shorthand for returning the unit type, e.g. fn my_fn(); is shorthand for fn my_fn() -> ();.

Trailing commas are acceptable in tuples. They're acceptable pretty much anywhere in Rust.
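The tuple notes above can be sketched in a few lines. The function names log_pair and min_max are invented for this example.

```rust
// Hypothetical function with no meaningful return value; `-> ()` is implied.
fn log_pair(pair: (&str, i32)) {
    // Elements are read with constant indices: .0, .1, ...
    println!("{} = {}", pair.0, pair.1);
}

// Hypothetical helper returning a tuple; note the trailing commas in the literals.
fn min_max(a: i32, b: i32) -> (i32, i32) {
    if a <= b { (a, b,) } else { (b, a,) }
}

fn main() {
    let t = ("answer", 42);
    let unit: () = log_pair(t); // the call evaluates to the unit value
    assert_eq!(unit, ());
    let (lo, hi) = min_max(9, 3);
    println!("{} {}", lo, hi);
}
```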

Pointer Types

Pointers in Rust are much more performant and memory-efficient than they are in GCed languages.

References

& is the immutable reference operator. It creates the reference.

&mut is the mutable reference operator.

* is the dereference operator. It accesses the value being referred to.

The type &T is pronounced "ref T", meaning "reference to a value of type T".

The expression &x creates a reference to value x. In words, we'd say that it "borrows a reference to x".

The expression *x (given that x is of type &T) refers to the value that x is a reference to.

References are immutable by default. For a reference to be mutable, it must have type &mut T.

References in Rust can never be null, so there are no null pointer exceptions.
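A minimal sketch of the three operators working together; add_one is a hypothetical helper.

```rust
// Hypothetical helper that mutates a value through a mutable reference.
fn add_one(n: &mut i32) {
    *n += 1; // `*` reaches the value behind the reference
}

fn main() {
    let x = 10;
    let r = &x;         // shared (immutable) reference of type &i32
    assert_eq!(*r, 10); // explicit dereference

    let mut y = 32;
    let m = &mut y;     // mutable reference of type &mut i32
    *m += 9;
    add_one(&mut y);
    assert_eq!(y, 42);
}
```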

Boxes

Boxes are owning pointers whose referent is allocated on the heap.

When a Box is created, enough memory is allocated on the heap to contain its value:


#![allow(unused)]
fn main() {
let v = vec![1, 2, 3, 4];
let b = Box::new(v); // allocated space on the heap to hold v
}

When a Box goes out of scope, both the Box itself and the heap value it points to are freed.

Raw Pointers

Raw pointers (*const T and *mut T) can be created in safe code, but they can only be dereferenced in unsafe code.


Arrays, Vectors, and Slices

Rust has 3 types for representing a sequence of values.

| Name | Type | Description | Size | Memory |
|---|---|---|---|---|
| Array | [T; N] | Array of N values, each of type T | Fixed | Stack |
| Vector | Vec<T> | Vector of Ts | Dynamic | Heap |
| Slice | &[T] | Shared slice of Ts | Fixed | Stack (as pointer to heap value) |

Given any of the above types as value v, the expression v.len() gives the number of elements in v, and v[i] refers to the i'th element of v. i must be of type usize; no other integer types will work as an index.
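As a rough illustration, the same len() and indexing calls work across all three kinds of sequence. The helper last_element is a made-up name.

```rust
// Hypothetical helper: works on anything that can be viewed as a slice of i32.
fn last_element(v: &[i32]) -> Option<i32> {
    if v.is_empty() {
        None
    } else {
        let i: usize = v.len() - 1; // indices must be usize
        Some(v[i])
    }
}

fn main() {
    let array: [i32; 3] = [1, 2, 3];
    let vector: Vec<i32> = vec![4, 5, 6, 7];
    let slice: &[i32] = &vector[1..3];

    assert_eq!(array.len(), 3);
    assert_eq!(vector.len(), 4);
    assert_eq!(slice.len(), 2);
    assert_eq!(last_element(&array), Some(3));
    assert_eq!(last_element(&vector), Some(7));
    assert_eq!(last_element(slice), Some(6));

    let n: i32 = 1;
    // array[n] would not compile; convert to usize explicitly instead.
    assert_eq!(array[n as usize], 2);
}
```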

Arrays

An array's length is built into its type and is fixed at compile time.

Implicit behavior: When working with an array value and accessing its methods, Rust implicitly converts a reference to an array to a slice. So if you need to know the methods for an array, go look at the methods for slices.
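For instance, sort, len, and iter are all slice methods, yet they can be called directly on an array. A small sketch, with sorted as a hypothetical helper:

```rust
// Hypothetical helper: sorts a copy of a fixed-size array using the slice method `sort`.
fn sorted(mut a: [i32; 5]) -> [i32; 5] {
    a.sort(); // `sort` is defined on slices; the array is borrowed as &mut [i32]
    a
}

fn main() {
    let chaos = [3, 5, 4, 1, 2];
    assert_eq!(sorted(chaos), [1, 2, 3, 4, 5]);
    // `len` and `iter` are slice methods as well.
    let total: i32 = chaos.iter().sum();
    println!("{} elements, sum {}", chaos.len(), total);
}
```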

Vectors

A vector is allocated on the heap.

There are 5 main ways to create a vector:

  1. Use the vec! macro (simplest)
  2. Build a vector by repeating a given value a certain number of times using a syntax that imitates array literals:

#![allow(unused)]
fn main() {
let rows = 100;
let cols = 100;
let pixel_buffer = vec![0u8; rows * cols]; // u8 elements, so len() is a byte count
println!("Buffer is {} bytes long.", pixel_buffer.len())
}
  3. Using Vec::new to create a new, empty vector, and pushing elements onto it.

#![allow(unused)]
fn main() {
let mut v = Vec::new();
v.push("hello");
v.push("vector");
println!("{:?}", v);
println!("capacity: {}", v.capacity());
}
  4. Iterators produce vectors when executed (using their .collect() method):

#![allow(unused)]
fn main() {
let v: Vec<i32> = (1..4).collect();
assert_eq!(v, [1, 2, 3]);
}
  5. If you know the size of the vector in advance, you can use Vec::with_capacity to create the vector, instead of new:

#![allow(unused)]
fn main() {
let mut v = Vec::with_capacity(2); // with_capacity requires the initial capacity
v.push("hello");
v.push("vector");
println!("{:?}", v);
}

Using Vec::with_capacity instead of Vec::new is more performant because it can prevent costly heap reallocations when a vector grows beyond its current capacity.

A vector's capacity() method returns the number of elements the vector could hold without reallocation.


#![allow(unused)]
fn main() {
// Track the length and capacity of a vector as values are added to it
let mut v: Vec<i32> = Vec::with_capacity(2);
println!("length/capacity: {}/{}", v.len(), v.capacity());
v.push(1);
v.push(2);
println!("length/capacity: {}/{}", v.len(), v.capacity());
v.push(3);
println!("length/capacity: {}/{}", v.len(), v.capacity());
}

As with arrays, slice methods can be used on vectors.

In stack memory, a Vec<T> consists of three values:

| Stack cell | Stack cell | Stack cell |
|---|---|---|
| Pointer to heap-allocated buffer | The capacity of the buffer | The current occupied size of the buffer |

Inserting and removing elements anywhere but the end of a vector is expensive, because every element after that position has to be shifted.
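A short sketch contrasting operations at the end of a vector with those in the middle. The function edit_vector is invented for illustration.

```rust
// Hypothetical helper contrasting end operations with mid-vector ones.
fn edit_vector() -> Vec<i32> {
    let mut v = vec![10, 20, 30, 40, 50];
    v.push(60);                    // cheap: appends at the end
    assert_eq!(v.pop(), Some(60)); // cheap: removes from the end
    v.insert(3, 35);               // shifts 40 and 50 one slot to the right
    let removed = v.remove(1);     // shifts everything after index 1 to the left
    assert_eq!(removed, 20);
    v
}

fn main() {
    println!("{:?}", edit_vector());
}
```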

Slices

A slice, written [T] (without specifying the length), is a region of an array or vector.

Since a slice can be any length, it can't be stored directly in a variable or passed as a function argument; slices are always passed by reference.

A reference to a slice is a fat pointer.

Term: A fat pointer to a slice is a two-word value on the stack comprising:

  1. A pointer to the slice's first element
  2. The number of elements in the slice

Whereas an ordinary reference is a non-owning pointer to a single value, a reference to a slice is a non-owning pointer to several values.
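The sketch below passes an array, a vector, and part of a vector to the same function, then compares pointer sizes. The helper sum is a made-up name, and the two-word size is what current compilers produce, not something this example proves in general.

```rust
use std::mem::size_of;

// Hypothetical helper: sums any run of f64 values handed over as a slice reference.
fn sum(values: &[f64]) -> f64 {
    values.iter().sum()
}

fn main() {
    let array: [f64; 3] = [1.0, 2.0, 3.0];
    let vector: Vec<f64> = vec![4.0, 5.0];
    assert_eq!(sum(&array), 6.0);
    assert_eq!(sum(&vector), 9.0);
    assert_eq!(sum(&vector[..1]), 4.0);

    // On current compilers a slice reference is two words: pointer and length.
    assert_eq!(size_of::<&[f64]>(), 2 * size_of::<usize>());
    // An ordinary reference is a single word.
    assert_eq!(size_of::<&f64>(), size_of::<usize>());
}
```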

A slice reference acts as a common view over any contiguous sequence of values, so a function that takes &[T] can accept arrays and vectors alike.

You can get a reference to a slice of an array, vector, or another slice by indexing it with a range:


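#![allow(unused)]
fn main() {
let v: Vec<f64> = vec![1., 2., 3.];
let a: [f64; 3] = [4., 5., 6.];
// Index with a range and borrow the result to get a slice reference
let first_two: &[f64] = &v[0..2];
let tail: &[f64] = &a[1..];
println!("{:?} {:?}", first_two, tail);
}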

The term slice is often used for reference types like &[T] or &str, but that's just shorthand. Those types are called references to slices.

String Types

String Literals

String literals are enclosed in double quotes.

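Term: Rust offers raw strings (written r"..." or r#"..."#), in which backslashes are taken literally and no escape sequences are recognized. Like any Rust string literal they can span multiple lines, which makes them loosely similar to template strings in JavaScript.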


#![allow(unused)]
fn main() {
let paragraph = r#"
I'm just a regular paragraph
with the appropriate spacing.
"#;
println!("{}", paragraph);
}

Byte Strings

A string literal with the b prefix is a byte string. A byte string is a reference to an array of u8 values, which can be used as a slice of u8 (rather than Unicode text).
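A brief sketch of a byte string in use; method_bytes is a hypothetical helper name.

```rust
// Hypothetical helper: a byte string literal can be handed out as a slice of u8.
fn method_bytes() -> &'static [u8] {
    b"GET"
}

fn main() {
    let method = method_bytes();
    assert_eq!(method, &[b'G', b'E', b'T'][..]);
    assert_eq!(method.len(), 3);
    assert_eq!(method[0], 71); // ASCII code for 'G'
}
```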

Strings in Memory

Rust strings are stored in memory using UTF-8 (not as arrays of chars).

A String is stored on the heap as a resizable buffer of UTF-8 text. You can think of a String as a Vec<u8> that is guaranteed to hold well-formed UTF-8.

Pronunciation: A &str is called a "stir" or "string slice".

A &str is a reference to a sequence of UTF-8 text owned by someone else.

A &str is a reference to a slice, so it is a fat pointer. You can think of a &str as being nothing more than a &[u8] that is guaranteed to hold well-formed UTF-8.

A string literal is a &str that refers to preallocated text stored in read-only memory.

Any string type's length (returned by .len()) is measured in bytes, not characters.
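The difference between bytes and characters shows up as soon as the text contains non-ASCII characters. A minimal sketch, with lengths as a made-up helper:

```rust
// Hypothetical helper returning (byte length, char count) for a string slice.
fn lengths(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    assert_eq!(lengths("hello"), (5, 5));
    assert_eq!(lengths("©"), (2, 1));   // U+00A9 takes two bytes in UTF-8
    assert_eq!(lengths("ಠ_ಠ"), (7, 3)); // each ಠ takes three bytes
}
```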

It is impossible to modify a &str:


#![allow(unused)]
fn main() {
let mut s = "hello";
s[0] = 'c'; // compile error: a &str cannot be indexed by an integer or modified in place
}

String

Ways to create a String:

  • Given a &str, the .to_string() method will copy it into a String.
  • The format!() macro works just like println!(), except that it returns a new String instead of writing text to stdout, and it doesn't automatically add a newline at the end.
  • Arrays, slices, and vectors of strings have two methods that form a new String from many strings:
    1. .concat()
    2. .join(sep)

#![allow(unused)]
fn main() {
let elves = vec!["snap", "crackle", "pop"];
println!("{:?}", elves.concat());
println!("{:?}", elves.join(", "));
}

A &str can refer to either a string literal or a String, so it's the most appropriate type for function arguments when the caller should be allowed to pass either kind of string.
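A sketch of a function that takes &str and is called with both kinds of string. The function shout is hypothetical.

```rust
// Hypothetical function: taking &str lets callers pass literals or borrowed Strings.
fn shout(text: &str) -> String {
    format!("{}!", text.to_uppercase())
}

fn main() {
    let literal: &str = "hello";
    let owned: String = "world".to_string();
    assert_eq!(shout(literal), "HELLO!");
    assert_eq!(shout(&owned), "WORLD!"); // &String coerces to &str
}
```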

Unlike strings in many other languages, Rust strings are strictly Unicode only. This means that they're not always the appropriate choice for string-like data. Here is which type to use in each situation:

| When you have | Use |
|---|---|
| Unicode text | String or &str |
| Filename | std::path::PathBuf and &Path |
| Binary data | Vec<u8> and &[u8] |
| Environment variables | OsString and &OsStr |
| Strings from a FFI | std::ffi::CString and &CStr |