Rust & “Safe” Programming

Ownership

The fundamental problem with manual memory management (like C and C++): it's often not clear whose responsibility it is to free the heap variables. If they are referenced in a data structure, will it free them in its destructor? Do we have to keep a reference to free them?

In other words, who owns the memory? If we had a rule about who owns the variables, we would know who must delete them: the owner.

Ownership

In C++, the unique_ptr enforces ownership: whatever code holds the unique_ptr is responsible for that heap memory, and it's unique by design.

So, C++ allows us to enforce ownership of heap memory, but doesn't force us to.

Ownership

Garbage collectors (and reference counting, and shared_ptr) side-step the problem by detecting memory that is no longer being used at runtime. These all have a small but non-zero cost.

That's often okay: garbage collectors are fast, and most code isn't that time-sensitive.

Ownership

What if a language forced us to keep track of ownership, and kept us safe that way? There would be no runtime cost (from garbage collecting or reference counting) and no chance of losing all references to heap memory.

Rust

The Rust programming language was designed with ownership as a core concept.

We will be using it as an example of (1) another language where we can apply the concepts from the previous section of the course, and (2) how a language can be designed to keep us safe (in ways we will see as we explore).

Rust

Rust is much closer to languages you already know than Haskell. Rust is…

imperative;
statically typed, with types (often) explicitly declared;
typically compiled to machine code.

Additional resource: The Rust Book, a free online language intro.

Rust

A Rust hello world:

fn main() {
    println!("Hello, world!");
}

If saved as hello.rs, it can be compiled and run:

rustc hello.rs
./hello

Rust

Or we can use Rust's build system Cargo. In your project directory, create a config file Cargo.toml (can be created with cargo init), a directory src, and a file main.rs inside it:

Cargo.toml
src
└── main.rs

But then, simply:

cargo run

Rust

One first tool: the println! macro. It prints a line of text, using {} to mark places for values to be inserted, and {:?} for more debugging-style output.

println!("Hello, world!");
println!("The result: {}", 3 + 4);
println!("A {} B {} C", 18 / 2, "test");
println!("A {:?} B {:?} C", 18 / 2, "test");

Hello, world!
The result: 7
A 9 B test C
A 9 B "test" C

Also note: semicolons at the end of statements.

Variables & Types

Rust has variables, as you'd expect in an imperative language, with type inference when declaring and initializing.

let count = 12;
let length = 12.345;
let message = "Hello world";

This creates three variables: an integer, a floating point, and a string.

Variables & Types

We can also be explicit about the type when creating a variable:

let count_int: i64 = 12;
let count_byte: u8 = 12;
let length_dbl: f64 = 12.345;
let length_sgl: f32 = 12.345;
let message: &str = "Hello world";
let another_int: i64;
another_int = -1234;

It's idiomatic in Rust to not explicitly give the type unless it's necessary to disambiguate.

[Note: another_int is created here without being initialized.]

Variables & Types

Basic types in Rust are explicit about their size (in bits): i8, i16, i32, i64, i128 for signed integers; u8, u16, u32, u64, u128 for unsigned; f32, f64 for floating-point.

There are also integer types that are the size of a memory address on the system architecture: isize and usize. These types are used where you are referring to stuff in memory (e.g. array sizes or indexing).
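
To make the sizes concrete, here is a small sketch using std::mem::size_of and the MIN/MAX constants from the standard library. The helper name bits_in_usize is only for illustration, and its result depends on the machine (64 on a typical 64-bit system).

```rust
use std::mem::size_of;

// Hypothetical helper: how many bits wide is a memory address here?
fn bits_in_usize() -> usize {
    size_of::<usize>() * 8
}

fn main() {
    // The number in a type's name is its size in bits.
    println!("i16 is {} bytes", size_of::<i16>()); // 2
    println!("f64 is {} bytes", size_of::<f64>()); // 8
    // Signed and unsigned types of the same size cover different ranges.
    println!("i8: {} to {}", i8::MIN, i8::MAX); // -128 to 127
    println!("u8: {} to {}", u8::MIN, u8::MAX); // 0 to 255
    // usize/isize match the pointer size of the target architecture.
    println!("usize is {} bits here", bits_in_usize());
}
```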

Variables & Types

Rust characters are single-quoted and are Unicode scalar values (≈ Unicode code points ≈ Unicode characters).

let letter = 'A';
let emoji = '👍';
let another_letter: char = 'B';

Variables & Types

Also, booleans:

let a = true;
let b: bool = false;
let c = a && b; // logical and
let d = a || b; // logical or
let e = !a; // negation

Compound Types

Tuples in Rust are a way to wrap up multiple values in a single variable/argument/etc.

let vals1 = (80, true);
let vals2: (i64, bool) = (120, false);
let vals3 = (80, 1.234);
let vals4 = ("Hello world", 123, 4.5678);

Compound Types

let vals1 = (80, true);
let vals2: (i64, bool) = (120, false);
let vals3 = (80, 1.234);
let vals4 = ("Hello world", 123, 4.5678);

Both vals1 and vals2 have the same type here ((i64,bool)), but vals3 has a different type ((i64,f64)).

So, you can compare vals1 == vals2 but vals1 == vals3 won't compile.

Basically, the same rules as Haskell.

Compound Types

If we need to get at elements of a tuple, we index with a . like accessing a property in most languages:

let vals = (80, true);
println!("first {}", vals.0); // first 80
println!("second {}", vals.1); // second true

… but you should probably pattern-match your way in instead (more later).
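
As a preview of that pattern matching, a let can name every element of a tuple at once. This is a minimal sketch: the function div_rem is a made-up helper (function definitions are covered below) that returns two results as a tuple.

```rust
// Hypothetical helper: returns a (quotient, remainder) tuple.
fn div_rem(a: i64, b: i64) -> (i64, i64) {
    (a / b, a % b)
}

fn main() {
    let vals = (80, true);
    // Destructuring: the pattern on the left names each element.
    let (count, flag) = vals;
    println!("first {} second {}", count, flag); // first 80 second true

    let (q, r) = div_rem(17, 5);
    println!("17 = 5*{} + {}", q, r); // 17 = 5*3 + 2
}
```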

Compound Types

Arrays contain multiple values of the same type, but unlike most languages the size of the array is part of the type. That is, these have the same type: array of four f64 (written [f64; 4]):

let lengths = [1.2, 3.4, 5.6, 7.8];
let widths = [12.0, 34.0, 56.0, 78.0];

So, something like lengths == widths makes sense (and returns false because the elements are different).

Compound Types

But these have different types:

let shorter = [1, 2, 3];
let longer = [4, 5, 6, 7, 8, 9];

Or equivalently:

let shorter: [i64; 3] = [1, 2, 3];
let longer: [i64; 6] = [4, 5, 6, 7, 8, 9];

The number of elements of an array is fixed and must be known at compile-time. (But slices and vectors exist: more later.)

Compound Types

Indexing array elements is with [], as you might expect:

let lengths = [1.2, 3.4, 5.6, 7.8];
let n: usize = 2;
println!("{} {}", lengths[0], lengths[n]); // 1.2 5.6

Array indices must be of type usize: an unsigned integer big enough to refer to any memory location on this architecture.
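
In practice, an index sometimes arrives as another integer type. The sketch below shows one way to handle that with an explicit as cast. The helper names pick and last are only for illustration, and the cast assumes the value is known to be non-negative and in range.

```rust
// Hypothetical helper: index with a value that is not already a usize.
fn pick(values: [f64; 4], n: i32) -> f64 {
    // values[n] would not compile, since n is an i32.
    // The cast is only sensible when n is known to be non-negative and in range:
    // an out-of-range index makes the program panic at runtime.
    values[n as usize]
}

// .len() returns a usize, so it can be used in index arithmetic directly.
fn last(values: [f64; 4]) -> f64 {
    values[values.len() - 1]
}

fn main() {
    let lengths = [1.2, 3.4, 5.6, 7.8];
    println!("{} {}", pick(lengths, 2), last(lengths)); // 5.6 7.8
}
```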

Compound Types

Aside: the dbg! macro will produce easy debugging-friendly output of a value:

dbg!(lengths);

[src/demo.rs:24:5] lengths = [
    1.2,
    3.4,
    5.6,
    7.8,
]

Mutability

But variables aren't quite like variables in other languages. They can't vary (yet):

let value: i64;
value = 1;
value = 2;

The first two lines here are fine, but the last one causes a compilation error:

cannot assign twice to immutable variable `value`

Mutability

Variables in Rust are immutable by default: you are allowed to assign to them only once. (In the initializer or later, but only once.)

A variable must be declared mutable if it will be changed after initial assignment:

let mut value: i64;
value = 1;
value = 2;
println!("{}", value); // prints 2

Mutability

If we declare a variable mut but don't assign to it multiple times, the compiler warns us:

warning: variable does not need to be mutable

Maybe this is a first hint: the Rust compiler is very careful when handling our code, and very opinionated about what we should/shouldn't do.

Mutability

Note: the concept of mutable/immutable here is a little different than discussed earlier in the course. In most languages, the distinction applies to a type. e.g. in Python tuples ((1, 2, 3)) are immutable and lists ([1, 2, 3]) are mutable.

In Rust, the words are applied to individual variables regardless of their type.

Mutability

Rust also has the concept of constants:

const MESSAGE: &str = "Hello";
let input = get_user_input();

A constant must be known at compile time. An immutable variable can be determined as the program runs, but we promise it won't change after initial assignment.

[Note: these are probably different types. MESSAGE is a &str but input will have to be a String. More later.]

Control Flow

Basic control flow structures are similar to other imperative languages. The if statement is spelled like C/Java/C#, but without the parens around the condition:

let count = 3;
if count < 10 {
    println!("A small number");
} else if count < 100 {
    println!("A large number");
} else {
    println!("A very large number");
}

Control Flow

Similarly, the while loop is written like C/Java/C#, without the parens:

let mut i = 1;
while i < 100 {
    i *= 2;
}
println!("{}", i);

Control Flow

The for loop is primarily about iterating through a collection (technically, anything that implements the Iterator or IntoIterator trait: more later).

let values = [4, 5, 6, 7];
for v in values.iter() {
    println!("v is {}", v);
}

Output:

v is 4
v is 5
v is 6
v is 7

Control Flow

But if you want a counting for loop, there's a convenient way to create a range:

for i in 1..4 {
    println!("i is {}", i);
}

Output:

i is 1
i is 2
i is 3

Note: includes the range start, but not the end (i.e. not 4 here).

Control Flow

But there is an inclusive range, inserting an =:

for i in 1..=4 {
    println!("i is {}", i);
}

Output:

i is 1
i is 2
i is 3
i is 4

More Expressions

Rust makes a subtle distinction that isn't always obvious in other languages: statements and expressions are different. Statements end with a semicolon; expressions do not.

This usage of if contained statements (in the {…}), and therefore behaved like a statement.

if count < 10 {
    println!("A small number");
} else if count < 100 {
    println!("A large number");
} else {
    println!("A very large number");
}

More Expressions

But an if in Rust is an expression that can return a result: just make the {…} contents expressions instead of statements.

let result: i64;
result = if count < 10 { 100 } else { 10 * 100 };

More Expressions

Or the content of the {…} can be more than a simple expression, as long as it ends with an expression (i.e. no semicolon) that is the result.

result = if count < 10 {
    100
} else {
    let x = 10;
    x * 100
};

More Expressions

In Rust, anywhere you need an expression, you can use a code block that has an expression (no semicolon) as the last thing it does:

result = {
    let x = 10;
    x * 100
} + 10;

Functions

We have already seen the main function, but we're going to need more.

Functions are defined with the fn keyword. Argument and return types must be given explicitly. Local variables as-expected.

fn add_three(x: i64, y: i64, z: i64) -> i64 {
    let result = x + y + z;
    return result;
}

Functions

But we don't have to return explicitly. If the last thing in a function is an expression (i.e. a calculation with no trailing ;), that will be the return value. This is equivalent, and more idiomatic:

fn add_three_2(x: i64, y: i64, z: i64) -> i64 {
    x + y + z
}
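
The implicit return combines naturally with block and if expressions. A minimal sketch, with a made-up function scaled: the last expression in the body is an if, and whichever branch runs supplies the return value.

```rust
// Hypothetical example: small counts are returned unchanged, larger ones are scaled.
fn scaled(count: i64) -> i64 {
    // The value of the block (its final expression) is bound to factor.
    let factor = {
        let base = 10;
        base * base
    };
    // No semicolon after the if expression: its value is returned.
    if count < 10 {
        count
    } else {
        count * factor
    }
}

fn main() {
    println!("{} {}", scaled(3), scaled(12)); // 3 1200
}
```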

Functions

We can also write anonymous functions. They are called closures in the Rust docs, and they do behave as closures. The syntax is like:

|m| m%n==0
|a, b| a + b
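
A short sketch of closures in use. The names add, divisible and count_multiples are made up for illustration. Argument types are annotated here to keep the example unambiguous: they can be left out when the compiler can infer them.

```rust
// Hypothetical helper: count the values in 1..=limit that n divides evenly.
fn count_multiples(n: i64, limit: i64) -> i64 {
    // The closure uses n from the enclosing scope: that is what makes it a closure.
    let divisible = |m: i64| m % n == 0;
    let mut count = 0;
    for m in 1..=limit {
        if divisible(m) {
            count += 1;
        }
    }
    count
}

fn main() {
    // A closure can be stored in a variable and called like a function.
    let add = |a: i64, b: i64| a + b;
    println!("{}", add(3, 4)); // 7
    println!("{}", count_multiples(3, 10)); // 3 (for 3, 6, 9)
}
```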

Vectors

A quick aside, so we have another type to work with…

The Rust Vec type is similar to an array, but can be modified in-place: have elements added, removed.

The Vec type also provides a lot of convenient methods to manipulate it.

Vectors

An empty vector can be created with Vec::new, or we can convert an array (or other types) to a vector with Vec::from. These have the same result: both create a vector of u8 values 1, 2, 3.

let mut values = Vec::new();
values.push(1u8);
values.push(2u8);
values.push(3);

let mut values: Vec<u8> = Vec::from([1, 2]);
values.push(3);

Vectors

Some new language syntax we just saw: types can contain static methods accessed by ::, and instances can contain methods accessed by .:

let mut values = Vec::new();
values.push(1u8);
values.push(2u8);
values.push(3);

The Rust convention is to have a static method Type::new as a constructor (i.e. function that returns a new instance of the type).

Vectors

Some more useful things a Vec can do:

println!("{}", values[0]); // indexing
println!("{}", values == Vec::from([1, 2])); // comparison
for v in values.iter() {
    // iteration
    println!("{}", v);
}
values.insert(0, 10); // insert value
println!("{:?}", values); // debug-format printing

1
false
1
2
3
[10, 1, 2, 3]

References

As soon as we have functions, we have a question about large function arguments.

We need to be able to pass large data structures (e.g. array/vector of millions of elements) as an argument. We could give the function a copy of the argument (i.e. pass by value) but making the copy would be expensive so we'd like to avoid it if possible.

References

In many languages, you can also create a reference to the data structure and give that to the function (i.e. pass by reference). A reference is more-or-less just a pointer, so it's fairly cheap to create and pass around.

But in most languages, any code with a reference can follow it and modify the value. We don't want functions to do that without us knowing, so there's a little danger to handing out references.

References

Rust (like other languages) has the concept of references. The & is used to create a reference to a variable, and used to annotate the type to mean a reference to. Here, n2, n3 are type &i32 and are references to n1.

let n1 = 1234; // literal i32 value 1234
let n2: &i32 = &n1; // a &i32 referring to n1
let n3 = &n1; // a &i32 referring to n1
let n4 = n1; // another i32 variable containing a copy of 1234
println!("{} {} {} {}", n1, *n2, *n3, n4);

All are 1234 when printed. Note dereferencing with *.

References

Here is similar C++ code:

int n1 = 1234;
int* n2 = &n1;
int* n3 = &n1;
int n4 = n1;
cout << n1 << *n2 << *n3 << n4 << endl;

We can follow any reference to modify the value:

*n3 = 4567;
cout << n1 << *n2 << *n3 << n4 << endl;

Here, n4 is still 1234, but the rest are 4567.

References

We can try something similar in Rust:

let mut n1 = 1234_i64;
let n2 = &n1;
n1 = 4567;
println!("{}", n2); // the reference is still in use here

This fails to compile with this error on line 3:

cannot assign to `n1` because it is borrowed
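
One way to satisfy the compiler, sketched under the assumption that the reference is only needed briefly: finish using the reference before assigning to the original variable. A borrow lasts only until the last use of the reference. The function name update_after_borrow is only for illustration.

```rust
fn update_after_borrow() -> i64 {
    let mut n1 = 1234_i64;
    let n2 = &n1; // n1 is borrowed from here...
    println!("{}", n2); // ...until the last use of n2, here
    n1 = 4567; // no longer borrowed, so assigning is allowed
    n1
}

fn main() {
    // Prints 1234 (inside the function), then 4567.
    println!("{}", update_after_borrow());
}
```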

Borrowing

In Rust, every value is owned in a specific way. Whoever owns the value can read/write it (with some details to come later). The compiler enforces this to ensure our memory is accessed in a safe way.

In this case: creating a reference to a value temporarily gives responsibility for that value to the reference. The reference borrows the value or borrows ownership of the value.

Borrowing

You can do one of: use the original value; have several immutable references to it; have one mutable reference. This works:

let mut n1 = 1234;
let n2 = &mut n1; // note: a mutable reference
*n2 = 4567;
println!("{}", *n2);
//println!("{} {}", n1, n2); // "cannot borrow `n1` as immutable because it is also borrowed as mutable"

Borrowing

The word borrow here is appropriate: when a reference is finished with a value, it's given back. This also works:

let mut n1 = 1234;
{
    let n2 = &mut n1;
    *n2 = 4567;
    println!("{}", *n2); // prints 4567
} // n2 is out of scope here, so no longer has a reference
n1 = 7890;
println!("{}", n1); // prints 7890

At each point in the code, exactly one variable had mutable access to the value.

Borrowing

Multiple immutable references worked above, but multiple mutable references fail:

let mut n1 = 1234;
let n2 = &mut n1;
let n3 = &mut n1;
*n2 = 4567; // the first reference is still in use here

Compilation fails on line 3:

cannot borrow `n1` as mutable more than once at a time

Ownership

The idea of ownership applies to values passed to functions as well. When a value is passed to a function, ownership is given permanently to the function. We say that ownership moves to the function.

Ownership

Let's work with vectors, which have more interesting operations we can test. We can easily add up some values, using tricks from the functional world:

fn sum_vec(values: Vec<i64>) -> i64 {
    values.iter().fold(0, |a, b| a + b)
}

Note: |a, b| a + b is an anonymous two argument function that returns a+b. We can use this:

let numbers = Vec::from([1, 2, 3, 4]);
println!("{}", sum_vec(numbers)); // prints 10

Ownership

But one line later:

let numbers = Vec::from([1, 2, 3, 4]);
println!("{}", sum_vec(numbers)); // prints 10
println!("{:?}", numbers);

On the third line:

borrow of moved value: `numbers`

Our main code no longer owns the Vec: it was given to the function, and it's not ours (to lend to println!) anymore.

Ownership

But references will get us out of the problem. We can (1) lend the Vec to a reference and (2) move the reference to the function. When the function exits, (3) the reference is destroyed and (4) ownership returns to the main code.

Let's try again…

Ownership

Only changes: argument changed from Vec to &Vec, and argument became a reference: &numbers.

fn sum_vec(values: &Vec<i64>) -> i64 {
    values.iter().fold(0, |a, b| a + b)
}

let numbers = Vec::from([1, 2, 3, 4]);
println!("{}", sum_vec(&numbers)); // prints 10
println!("{:?}", numbers); // prints [1, 2, 3, 4]

Now ownership stays with the main code, except while sum_vec is running (and the main code is waiting). The compiler proves to itself that everything is okay, so we can do this.

Ownership

The same ownership rules apply to mutable reference arguments: only one mutable reference may exist at a time.

A function can ask for a mutable reference:

fn append_sum(values: &mut Vec<i64>) {
    let sum = values.iter().fold(0, |a, b| a + b);
    values.push(sum);
}

Ownership

Now we must give a mutable reference: if the variable isn't mut or the reference isn't &mut, this won't compile.

let mut numbers = Vec::from([1, 2, 3, 4]);
append_sum(&mut numbers);
println!("{:?}", numbers); // prints [1, 2, 3, 4, 10]

There is no question in Rust whether or not a function can modify its argument. If the value or an immutable reference is passed: no. If a mutable reference is passed: yes.
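
The three ways of passing an argument can be put side by side. This is a sketch with made-up function names (largest, double_all, total_owned): in each case the signature alone tells the caller what can happen to the Vec.

```rust
// Borrows immutably: can read, cannot modify.
fn largest(values: &Vec<i64>) -> i64 {
    let mut best = values[0]; // assumes a non-empty Vec
    for v in values.iter() {
        if *v > best {
            best = *v;
        }
    }
    best
}

// Borrows mutably: the &mut in the signature announces that it may modify.
fn double_all(values: &mut Vec<i64>) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

// Takes ownership: the caller can't use the Vec afterwards.
fn total_owned(values: Vec<i64>) -> i64 {
    values.iter().fold(0, |a, b| a + b)
}

fn main() {
    let mut numbers = Vec::from([1, 2, 3, 4]);
    println!("{}", largest(&numbers)); // 4
    double_all(&mut numbers);
    println!("{:?}", numbers); // [2, 4, 6, 8]
    println!("{}", total_owned(numbers)); // 20, and numbers has been moved
}
```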

Ownership Rules

Rust has these rules around ownership, from The Rust Book:

Each value in Rust has a variable that’s called its owner.
There can only be one owner at a time.
When the owner goes out of scope, the value will be dropped.

Ownership Rules

You can return a value from a function: you give ownership away. You can pass a value to a function: you give the function ownership.

There are no worries about freeing memory: when the owner is destroyed, Rust knows (at compile-time) to free the value.
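
We can watch this happen with the standard Drop trait, whose drop method Rust calls when an owner goes out of scope. The sketch below uses a struct and a trait implementation (both covered later). The Tracked type, the take function and the DROPPED counter are made up for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Tracked values have been dropped so far.
static DROPPED: AtomicUsize = AtomicUsize::new(0);

struct Tracked {
    id: u32,
}

impl Drop for Tracked {
    // Rust calls this automatically when the owner goes out of scope.
    fn drop(&mut self) {
        println!("dropping {}", self.id);
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Ownership of t moves into this function, so t is dropped when it returns.
fn take(t: Tracked) {
    println!("took {}", t.id);
}

fn dropped_so_far() -> usize {
    DROPPED.load(Ordering::SeqCst)
}

fn main() {
    let a = Tracked { id: 1 };
    {
        let _b = Tracked { id: 2 };
    } // _b goes out of scope: "dropping 2"
    take(a); // "took 1", then "dropping 1" as take returns
    println!("{} values dropped", dropped_so_far()); // 2 values dropped
}
```

Nothing in main frees anything explicitly: the compiler inserts the cleanup at the end of each owner's scope.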

Ownership Rules

And these rules for borrowing, also from The Rust Book:

At any given time, you can have either one mutable reference or any number of immutable references.
References must always be valid.

Ownership Rules

If the Rust compiler can't prove the reference will always be valid, the code won't compile.

Rust keeps us safe from multiple mutable aliases, but still allows temporary borrowing into a function. Copying references is cheap, but then borrowing rules come into play.

We can be sure of which code modifies values: only if it has a mutable reference.
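
Both borrowing rules can be seen in a few lines. A minimal sketch, with a made-up function name: several immutable references coexist, and a mutable reference becomes possible once they are no longer used.

```rust
fn borrow_in_turns() -> i64 {
    let mut n = 10;
    let r1 = &n;
    let r2 = &n; // any number of immutable references may coexist
    println!("{} {}", r1, r2); // last use of r1 and r2
    let m = &mut n; // now a single mutable reference is allowed
    *m += 1;
    n // the mutable borrow is over, so the owner can be read again
}

fn main() {
    // Prints "10 10" (inside the function), then 11.
    println!("{}", borrow_in_turns());
}
```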

Moving and Copying

In the discussion of references and ownership, there was an omission about what gets moved.

Suppose we have some simple functions that work on integers and vectors:

fn print_int(n: i64) {
    println!("{:?}", n);
}
fn print_vec(v: Vec<i64>) {
    println!("{:?}", v);
}

Note: arguments are values, not references.

Moving and Copying

And we use them:

let n: i64 = 7;
let v: Vec<i64> = Vec::from([1, 2, 3, 4]);
println!("n: {:?}", n);
println!("v: {:?}", v);
print_int(n);
print_vec(v);
println!("n: {:?}", n);
//println!("v: {:?}", v); // "borrow of moved value: `v`"

Why is ownership of a Vec moved, but an i64 not?

Moving and Copying

Rust makes a distinction between types that can be easily copied and ones that can't.

The idea: if copying them is inexpensive, we don't have to worry about ownership and just let the language make the copy automatically when needed.

Moving and Copying

Specifically, types that implement the Copy trait are implicitly copied when they are passed around like this. (Trait in Rust ≈ typeclass in Haskell or interfaces in other languages: more soon.)

Moving and Copying

Non-Copy types have their ownership given away when they are assigned to a different variable (or function argument).

The simple scalar types (like i64) are Copy. Vectors are not.

Moving and Copying

The same happens for variable assignment:

let n1: i64 = 7;
let v1: Vec<i64> = Vec::from([1, 2, 3, 4]);
let n2 = n1; // implicitly a copy, because i64 is Copy
let v2 = v1; // implicitly a move, because Vec is not Copy
println!("n1: {:?}", n1);
println!("n2: {:?}", n2);
//println!("v1: {:?}", v1); // "borrow of moved value: `v1`"
println!("v2: {:?}", v2);

Moving and Copying

Copy is implemented for values that can be duplicated by duplicating the bits they have on the stack.

The Vec keeps most of its data on the heap: copying its stack value would create multiple references to that heap memory.
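
Our own types can opt in to Copy when all of their fields are Copy. A sketch with two made-up structs (structs are covered below): Pixel holds only plain numbers, while Path owns heap data through its Vec and therefore cannot be Copy.

```rust
// Two plain numbers: duplicating the bits is a complete copy.
#[derive(Debug, Clone, Copy)]
struct Pixel {
    x: i32,
    y: i32,
}

// Owns heap data through the Vec, so it cannot be Copy.
#[derive(Debug)]
struct Path {
    points: Vec<i32>,
}

// Receives its own copy of the Pixel and changes only that copy.
fn shift(mut p: Pixel) -> Pixel {
    p.x += 1;
    p.y += 1;
    p
}

fn main() {
    let a = Pixel { x: 1, y: 2 };
    let b = shift(a); // a is copied, not moved
    println!("{:?} {:?}", a, b); // Pixel { x: 1, y: 2 } Pixel { x: 2, y: 3 }

    let p1 = Path { points: Vec::from([1, 2, 3]) };
    let p2 = p1; // a move: p1 can't be used after this line
    println!("{:?}", p2.points); // [1, 2, 3]
}
```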

Cloning

If you really want to duplicate a value, there's another trait for that: Clone. It guarantees a .clone() method that safely duplicates a value.

We have to call .clone() explicitly if we want it to happen: copying is implicit but cloning is not, because cloning might be expensive. The developer is forced to acknowledge that it's happening.

Cloning

Finally, we can use a value in multiple places if we really need to.

let v1: Vec<i64> = Vec::from([1, 2, 3, 4]);
let v2 = v1.clone(); // a full copy of v1
print_vec(v1.clone());
println!("v1: {:?}", v1);
println!("v2: {:?}", v2);

[For print_vec, borrowing through a non-mut reference would make much more sense.]
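
A clone is fully independent of the original. The sketch below (with_extra is a made-up helper) borrows a Vec, clones it, and modifies only the clone.

```rust
// Hypothetical helper: return a modified duplicate, leaving the original alone.
fn with_extra(values: &Vec<i64>, extra: i64) -> Vec<i64> {
    let mut copy = values.clone(); // explicit, possibly expensive, duplication
    copy.push(extra);
    copy
}

fn main() {
    let v1 = Vec::from([1, 2, 3, 4]);
    let v2 = with_extra(&v1, 5);
    println!("{:?}", v1); // [1, 2, 3, 4]
    println!("{:?}", v2); // [1, 2, 3, 4, 5]
}
```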

Structs

A struct in Rust can be used to group multiple values together (like a tuple). Each field in the struct gets named.

#[derive(Debug, Clone)]
struct GeoPoint {
    pub lat: f64,
    pub lon: f64,
    pub ele: i32,
}

This creates a struct GeoPoint that has three fields (lat, lon, ele), each with its own type.

Structs

#[derive(Debug, Clone)]
struct GeoPoint {
    pub lat: f64,
    pub lon: f64,
    pub ele: i32,
}

All of these fields are public (pub): they can be accessed from outside code.

The #[derive…] line gets us free implementations of Debug (printing with {:?}) and Clone (a .clone() method).

Structs

We can now create an instance of the struct and work with it.

let p = GeoPoint {
    lat: 49.267,
    lon: -122.967,
    ele: 68,
};
println!("{:?}", p);

GeoPoint { lat: 49.267, lon: -122.967, ele: 68 }

Structs

Like any other variable, structs can be mutable or not.

let mut p = GeoPoint {
    lat: 49.267,
    lon: -122.967,
    ele: 68,
};
p.ele = 168;
println!("{:?}", p);

GeoPoint { lat: 49.267, lon: -122.967, ele: 168 }

Without mut, setting p.ele would fail.

Structs

We can define methods in structs (or other types we define):

impl GeoPoint {
    fn antipode(&self) -> GeoPoint {
        GeoPoint {
            lat: -self.lat,
            lon: (self.lon + 180.0) % 360.0,
            ele: self.ele,
        }
    }
}

This creates a method .antipode that returns a new GeoPoint.

Structs

Then this works like we would expect a method to work:

let p = GeoPoint {
    lat: 49.267,
    lon: -122.967,
    ele: 68,
};
println!("{:?}", p.antipode());

GeoPoint { lat: -49.267, lon: 57.033, ele: 68 }

Structs

The self argument is assumed to be the type of the struct we're working with, and given as a short-form. We can be explicit if we want, and these two are equivalent:

fn antipode(&self) -> GeoPoint {…}

fn antipode(self: &GeoPoint) -> GeoPoint {…}

Structs

The argument name self is special: it's the receiver argument. When it's included, the function becomes a method.

The receiver argument can be the value itself (self) or a reference (&self) or a mutable reference (&mut self). *

* or a few other variants of references to self.

Structs

Methods can change the struct (if they have a mutable reference) and take additional arguments.

impl GeoPoint {
    fn go_up(&mut self, height: i32) {
        self.ele += height;
    }
}

With p defined as mut,

println!("{}", p.ele); // possible because .ele is pub
p.go_up(32);
println!("{}", p.ele);

68
100

Structs

If we implement a function on a type that does not have a receiver argument (self), it is an associated function (≈ static method). It's conventional (but not required) to have a new function as a constructor:

impl GeoPoint {
    fn new(lat: f64, lon: f64) -> GeoPoint {
        GeoPoint {
            lat: lat,
            lon: lon,
            ele: 0,
        }
    }
}

Structs

Associated functions are accessed from the type with ::

let p = GeoPoint::new(49.267, -122.967);
println!("{:?}", p);

GeoPoint { lat: 49.267, lon: -122.967, ele: 0 }

So TypeName::foo for an associated function; instance.bar for a method.

Structs

There is a shortcut when creating structs: if we have local variables with names that match the fields, we can use them directly. So our ::new could equivalently be:

fn new(lat: f64, lon: f64) -> GeoPoint {
    GeoPoint { lat, lon, ele: 0 }
}

Structs

Note that the self argument is like any other in the way it handles ownership. If you take self (not a reference: the actual value), ownership moves to the method.

fn consume(self) {
    println!("The GeoPoint is mine now!");
}

This fails since GeoPoint is not Copy:

let p = GeoPoint::new(49.267, -122.967);
p.consume();
println!("{:?}", p); // borrow of moved value: `p`

Structs

There are no classes in Rust: we get structs.

Also, there's no concept of inheritance on structs. We can create a new struct that contains another, but it can't inherit from it (as a class could in C++, Java, Python, etc).
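
A sketch of what containing another struct looks like in practice. NamedPlace is a made-up type that has a GeoPoint as a field. Nothing is inherited, so a method that should be available on the outer type is forwarded by hand.

```rust
#[derive(Debug, Clone)]
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

impl GeoPoint {
    fn go_up(&mut self, height: i32) {
        self.ele += height;
    }
}

// Hypothetical type: a place *has a* GeoPoint, rather than *being* one.
#[derive(Debug)]
struct NamedPlace {
    name: String,
    location: GeoPoint,
}

impl NamedPlace {
    // Forward to the contained value explicitly.
    fn go_up(&mut self, height: i32) {
        self.location.go_up(height);
    }
}

fn main() {
    let mut place = NamedPlace {
        name: String::from("Example Place"),
        location: GeoPoint { lat: 49.267, lon: -122.967, ele: 68 },
    };
    place.go_up(300);
    println!("{} is at {}m", place.name, place.location.ele); // Example Place is at 368m
    println!("{:?}", place.location);
}
```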

Traits

But we do get traits. A trait defines a set of things a type must have (methods, etc). It is very much like a typeclass in Haskell or interface in Java, C#.

Our structs (or other types we define) can implement traits that are defined by the Rust standard library, or we can create our own traits.

Rust uses traits heavily to describe various aspects of the language.

Traits

We have seen Copy and Clone, which are both traits. Copy is special because of the way it affects ownership moves vs duplication. But Clone is a more normal trait: it guarantees that the .clone() method exists.

Traits

We have also implicitly seen the Display and Debug traits. They have made it possible to print things:

"{}" uses the Display trait's .fmt method.
"{:?}" uses the Debug trait's .fmt method.

Most things implement Debug, which is supposed to be programmer-friendly output. If you want to be printable in a user-friendly way, that's Display.

Traits

For our GeoPoint, we got a Debug implementation from the line #[derive(Debug, Clone)]: it calls a macro that builds a reasonable Debug::fmt.

We can implement Display to make it print nicely:

use std::fmt;
impl fmt::Display for GeoPoint {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}°N {}°E @ {}m", self.lat, self.lon, self.ele)
    }
}

Traits

impl fmt::Display for GeoPoint {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}°N {}°E @ {}m", self.lat, self.lon, self.ele)
    }
}

impl fmt::Display for GeoPoint {…}: this block implements fmt::Display for GeoPoint.
The type signature of fmt must be this: that's part of the trait.
write! works like println!, except the output goes to the formatter f and the outcome (success or failure) is returned as a fmt::Result.

Traits

Or nicer if I can use >1 line:

fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
    write!(
        f,
        "{}°{} {}°{} @ {}m",
        self.lat.abs(),
        if self.lat < 0.0 { 'S' } else { 'N' },
        self.lon.abs(),
        if self.lon < 0.0 { 'W' } else { 'E' },
        self.ele
    )
}

Traits

Now we can do this:

let p = GeoPoint::new(49.267, -122.967);
println!("{:?}", p);
println!("{}", p);
let s = format!("build a string {} !", p);
println!("{}", s);

… and get this output:

GeoPoint { lat: 49.267, lon: -122.967, ele: 0 }
49.267°N 122.967°W @ 0m
build a string 49.267°N 122.967°W @ 0m !

Traits

The code for fmt did this:

write!(f, "{}°N {}°E @ {}m", self.lat, self.lon, self.ele)

Formatting with {} worked, so f64 and i32 must implement Display. How do we know? The docs.

Traits

Another simple trait from the Rust standard library: Default. Types that implement Default can construct themselves with a default value:

impl Default for GeoPoint {
    fn default() -> GeoPoint {
        GeoPoint {
            lat: 0.0,
            lon: 0.0,
            ele: 0,
        }
    }
}
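
When every field's own default is what we want, the implementation can be derived instead of written by hand. A sketch, redefining GeoPoint locally for the example. The at_elevation function is made up, and uses struct update syntax (..) to take the remaining fields from the default value.

```rust
// Deriving Default uses each field's own default: 0.0 for f64, 0 for i32.
#[derive(Debug, Default)]
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

// Hypothetical constructor: set one field, take the rest from the default.
fn at_elevation(ele: i32) -> GeoPoint {
    GeoPoint {
        ele,
        ..GeoPoint::default()
    }
}

fn main() {
    println!("{:?}", GeoPoint::default()); // GeoPoint { lat: 0.0, lon: 0.0, ele: 0 }
    println!("{:?}", at_elevation(100)); // GeoPoint { lat: 0.0, lon: 0.0, ele: 100 }
}
```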

Traits

Or we can define and implement our own traits.

pub trait Upable {
    fn move_up(&mut self, height: i32);
    fn copy_up(&self, height: i32) -> Self;
}
impl Upable for GeoPoint {
    fn move_up(&mut self, height: i32) {
        self.ele += height;
    }
    fn copy_up(&self, height: i32) -> GeoPoint {
        let mut dup = (*self).clone();
        dup.move_up(height);
        dup
    }
}

Traits

Using those:

let mut p = GeoPoint::default();
println!("{:?}", p);
p.move_up(3);
println!("{:?}", p);
println!("{:?}", p.copy_up(3));

GeoPoint { lat: 0.0, lon: 0.0, ele: 0 }
GeoPoint { lat: 0.0, lon: 0.0, ele: 3 }
GeoPoint { lat: 0.0, lon: 0.0, ele: 6 }

Generic Types

We have already seen generic types in Rust: a Vec can hold any type: we can have a Vec<i64> or Vec<String> or Vec<Vec<i64>> and they are all distinct types.

Vectors are a generic type, and take a type parameter that indicates the type they hold.

Generic Types

When instantiating a generic type, we need to specify the type parameter somehow: by giving it explicitly, by having it implied by the initializer, or by letting it be inferred from the usage.

All of these create a Vec<i64>:

let v1: Vec<i64> = Vec::new(); // explicit type
let v2 = Vec::<i64>::new(); // type parameter in initializer
let v3 = returns_a_vec_i64(); // implied by initializer
let mut v4 = Vec::new(); // type-inferred in next line...
v4.push(1234_i64);

Generic Types

We can create our own generic functions:

fn pair_with_one<T>(a: T) -> (T, i64) {
    (a, 1)
}

The <T> indicates that T is a type parameter in this definition. Like with Haskell, we expect the types to work themselves out as specified:

let pair1: (&str, i64) = pair_with_one("hello");
let pair2: (f64, i64) = pair_with_one(1.2345);

Generic Types

fn pair_with_one<T>(a: T) -> (T, i64) {
    (a, 1)
}

In this code, we know a is a function argument. The <T> works like an argument to the definition of the function: because the <T> is there, there's a type T that can be used for the rest of the code block.

Generic Types

Or a generic type with a type parameter:

#[derive(Debug)]
struct TwoVecs<T> {
    pub first: Vec<T>,
    pub second: Vec<T>,
}
impl<T> Default for TwoVecs<T> {
    fn default() -> TwoVecs<T> {
        TwoVecs {
            first: Vec::default(), // type-inferred as Vec<T>
            second: Vec::default(),
        }
    }
}

Generic Types

Our type is used similarly to Vec: we specify the type parameter somehow, and it can hold any type we want.

let mut tv = TwoVecs::<i32>::default(); // explicit type parameter
let mut tv = TwoVecs::default(); // or: type-inferred from the push below
tv.first.push(1234);
println!("{:?}", tv); // TwoVecs { first: [1234], second: [] }

Generic Types

We quickly hit a problem similar to what we saw in Haskell:

fn is_larger<T>(a: T, b: T) -> bool {
    a > b
}

This fails to compile: we have no guarantee that > is defined on this type.

binary operation `>` cannot be applied to type `T`

Generic Types

The solution is to specify a trait that we demand the type implement: there are several special traits that map to operators in the language. The > operator comes from PartialOrd, and Ord (a total ordering) includes it.

fn is_larger<T: Ord>(a: T, b: T) -> bool {
    a > b
}

This function now takes two arguments (1) of the same type, (2) if that type implements Ord.

println!("{}", is_larger(2, 3));
println!("{}", is_larger("two", "one"));

Generic Types

This would fail with the trait bound `GeoPoint: Ord` is not satisfied because GeoPoint does not implement Ord.

is_larger(GeoPoint::default(), GeoPoint::new(0.0, 0.0))
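
A type parameter can carry more than one bound, joined with +. In this sketch the function larger_as_text is made up: it requires PartialOrd for the comparison and Display for the formatting. Using PartialOrd rather than Ord also admits f64, which implements PartialOrd but not Ord.

```rust
use std::fmt::Display;

// T must support comparison (PartialOrd) and user-facing printing (Display).
fn larger_as_text<T: PartialOrd + Display>(a: T, b: T) -> String {
    let winner = if a > b { a } else { b };
    format!("larger: {}", winner)
}

fn main() {
    println!("{}", larger_as_text(2, 3)); // larger: 3
    println!("{}", larger_as_text(1.5, 0.5)); // larger: 1.5
    println!("{}", larger_as_text("two", "one")); // larger: two
}
```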

Core Traits

Many core ideas in Rust are implemented through traits that let a programmer access those features. If you want to know about a Rust type, one of the critical things to ask: what traits does it implement?

We have already seen Copy, which makes a huge difference in the way a type behaves with respect to moving/ownership.

Core Traits

For example, what can Vec do (as described by the docs)?

All of the methods described in the docs.
All of the methods implied by the trait implementations.
All of the Methods from Deref<Target = [T]> because we can do *vec to get the underlying array (slice).

Core Traits

Vec implements Ord and IntoIterator, so we expect this to work:

let vec = Vec::from([1, 2, 3]);
println!("{}", vec <= vec);
for e in vec.into_iter() {
    println!("{}", e);
}

The for loop actually works on anything that implements IntoIterator: we could have omitted the .into_iter() and it would be implicitly inserted.
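
A short sketch of how that interacts with ownership: a reference to a Vec also implements IntoIterator (yielding references), so looping over &vec only borrows, while looping over vec itself consumes it. The function sum_borrowed is a made-up name for this example.

```rust
// Borrowing iteration: &Vec<i32> implements IntoIterator and yields &i32.
fn sum_borrowed(values: &Vec<i32>) -> i32 {
    let mut total = 0;
    for e in values {
        total += *e;
    }
    total
}

fn main() {
    let vec = Vec::from([1, 2, 3]);
    println!("sum: {}", sum_borrowed(&vec));

    // vec was only borrowed above, so it is still usable here.
    for e in vec {
        // This loop takes ownership of vec: e is an i32.
        println!("{}", e);
    }
    // vec has been moved into the loop and cannot be used after this point.
}
```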

Core Traits

Because so much of the language is accessible through standard library traits, we can make our own types behave like the built-in types in many ways.

For example, the std::ops and std::cmp traits can let us overload basic operators.

Core Traits

We can make our GeoPoint work with + (i.e. operator overloading) by implementing Add:

use std::ops::Add;
impl Add for GeoPoint {
    type Output = GeoPoint;
    fn add(self, other: GeoPoint) -> GeoPoint {
        GeoPoint {
            lat: self.lat + other.lat,
            lon: self.lon + other.lon,
            ele: self.ele + other.ele,
        }
    }
}

Core Traits

Then we can add GeoPoints:

let p = GeoPoint::new(49.267, -122.967);
let offset = GeoPoint {
    lat: 1.0,
    lon: -1.0,
    ele: 10,
};
println!("{:?}", p + offset);

Compare AddAssign, which corresponds to +=.
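
A minimal sketch of what AddAssign could look like. The GeoPoint definition here is an assumption reconstructed from the fields used in these examples (lat, lon, ele); the real struct may differ.

```rust
use std::ops::AddAssign;

// Assumed shape of GeoPoint, based on the fields used in the examples.
#[derive(Debug)]
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

impl AddAssign for GeoPoint {
    // += changes the left-hand side in place, so it takes &mut self
    // and returns nothing (unlike Add, which builds a new value).
    fn add_assign(&mut self, other: GeoPoint) {
        self.lat += other.lat;
        self.lon += other.lon;
        self.ele += other.ele;
    }
}

fn main() {
    let mut p = GeoPoint { lat: 49.0, lon: -123.0, ele: 0 };
    p += GeoPoint { lat: 1.0, lon: -1.0, ele: 10 };
    println!("{:?}", p);
}
```

Note that implementing Add does not give us += automatically: the two traits are implemented separately.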

Core Traits

The == operator maps to the PartialEq trait. It has a type parameter: the type of the right-hand side of the comparison.

impl PartialEq<GeoPoint> for GeoPoint {
    fn eq(&self, other: &GeoPoint) -> bool {
        self.lat == other.lat && self.lon == other.lon
    }
}

Then:

let p = GeoPoint::new(49.267, -122.967);
println!("{}", p == p); // true
println!("{}", p == GeoPoint::default()); // false

Core Traits

The right-hand-side type could be different from the type we're working with, but it usually isn't. We could do it (even if it doesn't make much sense):

impl PartialEq<i64> for GeoPoint {
    fn eq(&self, _other: &i64) -> bool {
        false
    }
}

println!("{}", GeoPoint::default() == 6); // false

(I lied earlier when I said == must have the same type on the left and right: it's just usually not defined on different types.)

Core Traits

So, we can compare vectors with == because Vec<T> implements PartialEq<Vec<U>> if T implements PartialEq<U>. Or, as it's expressed in Rust code:

impl<T, U> PartialEq<Vec<U>> for Vec<T> where
    T: PartialEq<U>
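
To see the T and U in that signature actually being different, here is a small sketch that relies on String implementing PartialEq<&str> in the standard library. The function same_words is a made-up name.

```rust
// Vec<String> == Vec<&str> works because String implements PartialEq<&str>.
fn same_words(owned: &Vec<String>, borrowed: &Vec<&str>) -> bool {
    owned == borrowed
}

fn main() {
    let owned: Vec<String> = vec![String::from("one"), String::from("two")];
    let borrowed: Vec<&str> = vec!["one", "two"];
    println!("{}", same_words(&owned, &borrowed)); // true

    let other: Vec<&str> = vec!["one", "three"];
    println!("{}", same_words(&owned, &other)); // false
}
```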

Core Traits

But (from the docs, I learned) Vec<T> also implements PartialEq<[U; N]> where T: PartialEq<U>. So we should be able to compare a vector and array:

let v: Vec<i64> = Vec::from([1, 2, 3]);
let a: [i64; 3] = [1, 2, 3];
println!("{}", v == a); // true

Core Traits

Another common trait that we have been using: From. It is the conventional way to express conversions from one type to another, along with Into.

We expect to be able to convert and construct an instance of T from a U where it makes sense.

let u: U = …;
let t1 = T::from(u);
let t2: T = u.into();

Core Traits

We just used it in the last example:

let v: Vec<i64> = Vec::from([1, 2, 3]);

This worked because Vec<T> implements From<[T; N]>. Or:

let v: Vec<i64> = [1, 2, 3].into();

From comes up a lot: Rust doesn't usually do automatic type conversions (type coercion) for us (except when it does).
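
We can implement From for our own types too. This is only a sketch: the GeoPoint definition is assumed from the fields used earlier, and converting from a (lat, lon) tuple is a made-up use case.

```rust
// Assumed shape of GeoPoint, based on the fields used in the examples.
#[derive(Debug)]
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

// Hypothetical conversion: build a GeoPoint from a (lat, lon) pair.
impl From<(f64, f64)> for GeoPoint {
    fn from(pair: (f64, f64)) -> GeoPoint {
        GeoPoint {
            lat: pair.0,
            lon: pair.1,
            ele: 0,
        }
    }
}

fn main() {
    let p1 = GeoPoint::from((49.267, -122.967));
    // Implementing From gives us the matching .into() for free.
    let p2: GeoPoint = (49.267, -122.967).into();
    println!("{:?} {:?}", p1, p2);
}
```

The usual advice is to implement From and let the standard library provide Into, rather than implementing Into directly.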

Core Traits

One more: Drop. If it's implemented, Rust automatically calls .drop() before the memory is freed: it's a destructor.

If you need to do some custom cleanup before the value is destroyed (e.g. drop a network connection, close a file), you can implement Drop.
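
A minimal sketch of a Drop implementation. Connection is a hypothetical type standing in for something that owns a real resource, and the CLOSED counter exists only so the effect is easy to observe.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts cleaned-up connections (only here to make the effect visible).
static CLOSED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical type: imagine it owns a network connection.
struct Connection {
    name: String,
}

impl Drop for Connection {
    fn drop(&mut self) {
        // Custom cleanup would go here.
        println!("closing connection {}", self.name);
        CLOSED.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let _outer = Connection { name: String::from("outer") };
    {
        let _inner = Connection { name: String::from("inner") };
        println!("leaving the inner block");
    } // _inner goes out of scope: its drop() runs here
    println!("closed so far: {}", CLOSED.load(Ordering::SeqCst));
} // _outer is dropped here, at the end of main
```

We never call .drop() ourselves: it runs when the owner goes out of scope, which is another place ownership pays off.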

Errors

Rust has several ways it represents errors that might occur.

Let's explore…

Errors

The Option<T> type is a Rust enum (like a C union, but safe) that can be used to represent a value that might be missing.

It can be either Some(t) representing a value, or None representing a missing value: exactly like Haskell's Maybe.

Errors

An Option value can be used to represent a possibly-missing value (like the return value of find_elt in exercise 8).

We can use methods like .is_some() and .is_none() to check its state, or more beautifully, pattern-match it:

let v1: Vec<i32> = Vec::from([4, 5, 2, 8, 7, 3, 1]);
let pos: Option<usize> = find_elt(&v1, 8);
match pos {
    Some(p) => {
        // pos: Option<usize> and p: usize
        println!("found at position {}", p);
    }
    None => {
        println!("it's not there");
    }
}

Errors

There are also a lot of convenience functions on Option, e.g. .map, which lets us work with Option values like the Haskell Maybe monad: chain calculations, but stop if there's a failure.

let pos = find_elt(&v1, 8);
let next_pos = pos.map(|p| p + 1);
println!("{:?} {:?}", pos, next_pos);
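
A slightly longer sketch of chaining. To stay self-contained it uses the standard Iterator::position (which also returns Option<usize>) instead of find_elt, and half_if_even is a made-up helper.

```rust
// Hypothetical helper: halve a number, but only if it is even.
fn half_if_even(n: usize) -> Option<usize> {
    if n % 2 == 0 { Some(n / 2) } else { None }
}

fn main() {
    let v1: Vec<i32> = Vec::from([4, 5, 2, 8, 7, 3, 1]);
    let pos = v1.iter().position(|&x| x == 8); // Some(3)
    let missing = v1.iter().position(|&x| x == 99); // None

    // .map applies a function that cannot fail to the value inside.
    println!("{:?} {:?}", pos.map(|p| p + 1), missing.map(|p| p + 1));

    // .and_then chains a step that itself returns an Option.
    println!("{:?}", pos.map(|p| p + 1).and_then(half_if_even));
    println!("{:?}", pos.and_then(half_if_even));

    // .unwrap_or supplies a default for the None case.
    println!("{}", missing.unwrap_or(0));
}
```

In Haskell terms, .map is roughly fmap and .and_then is roughly >>=.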

Errors

But Option doesn't feel like an error value: just a value that deliberately might be missing.

The Result<T, E> type represents the result of an operation that we don't expect to fail, but might. If it succeeds, we get an Ok(T); if not, an Err(E).

Errors

It's probably best to think of Option as a way to represent a null/missing value, and Result as a way to do something like an exception. But they are structurally similar.
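
A small sketch of a Result that needs nothing beyond the standard library: parsing a string as an integer. The function parse_and_double is a made-up example.

```rust
use std::num::ParseIntError;

// Hypothetical function: parse a string as an integer and double it.
fn parse_and_double(s: &str) -> Result<i64, ParseIntError> {
    match s.parse::<i64>() {
        Ok(n) => Ok(n * 2),
        Err(e) => Err(e),
    }
}

fn main() {
    for input in ["21", "twenty-one"] {
        match parse_and_double(input) {
            Ok(n) => println!("{} doubled is {}", input, n),
            Err(e) => println!("could not parse {:?}: {}", input, e),
        }
    }
}
```

The pattern match has the same shape as the one for Option, except that the failure case carries a value describing what went wrong.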

Errors

As an example of somewhere that many errors are possible, let's make an HTTP request with the reqwest library. First, we need to depend on it in Cargo.toml:

[dependencies]
reqwest = { version = "0.12.20", features = ["blocking"] }

Or in the shell:

cargo add reqwest -F blocking

Errors

Then we see that the get function returns a Result<Response, reqwest::Error>: it might fail if the server is down, etc.

let res = reqwest::blocking::get("https://ggbaker.ca/");

Errors

Again, we can pattern-match our way out of it:

let response: reqwest::blocking::Response;
match res {
    Err(e) => {
        println!("Request failed: {}", e);
        return;
    }
    Ok(r) => {
        response = r;
    }
}
let status = response.status();
println!("Status code {}.", status);

Errors

We might still have received a response but got a 404 not found. If we want the page contents, there's another possible failure: Response.text() returns a Result<String, reqwest::Error>. We could pattern-match that, but I'm going to do this instead:

if status == reqwest::StatusCode::OK {
    // 200 OK, so assume the body is there
    println!("Length: {}.", response.text().unwrap().len());
}

Errors

The .unwrap() method will return the T from an Ok(T), but panic (i.e. the program crashes) if we have an Err.

It can be used in a case like this where we have checked the result in some other way.

Errors

Handling errors in Rust can feel a little tedious.

The Result values must be used: there's a compiler warning if we don't actually check their values. The if res.is_ok() or pattern match gets verbose quickly.

In some sense this isn't tedium: the language is pointing out what you should have been doing in all of your code. Rust just reminds us where failures might happen and forces us to deal with them.

Errors

But there's a common situation made easy: if we're in a function that returns a Result, we might want to just pass the Err up and let the caller handle it.

The ? operator can be used on any Result. The semantics: if Err, return the Err immediately; else .unwrap() and continue.
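
To make those semantics concrete, here is a sketch of the same function written twice: once with ? and once with the pattern match it roughly stands for. The function names are made up, and string parsing stands in for any fallible operation.

```rust
use std::num::ParseIntError;

// With the ? operator: any Err is returned to the caller immediately.
fn add_strs(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x = a.parse::<i64>()?;
    let y = b.parse::<i64>()?;
    Ok(x + y)
}

// Roughly what ? does, written out by hand.
// (The real operator also converts the error with From if the types differ.)
fn add_strs_long(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x = match a.parse::<i64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

fn main() {
    println!("{:?}", add_strs("40", "2"));
    println!("{}", add_strs("40", "two").is_err());
    println!("{:?}", add_strs_long("40", "2"));
}
```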

Errors

So fetching a URL body can become (with a few use statements to avoid fully-qualified :: paths everywhere):

use reqwest::blocking::get;
use reqwest::Error as ReqwestError;
fn url_fetch(url: String) -> Result<String, ReqwestError> {
    let response = get(url)?;
    let text = response.text()?;
    Ok(text)
}

… but now whoever calls url_fetch has to deal with possible errors.

Errors

Or even:

fn url_fetch_2(url: String) -> Result<String, ReqwestError> {
    Ok(get(url)?.text()?)
}

The ? is subtle and is easy to miss when reading code: keep your eyes open.

Errors

One final expression of error in Rust: panicking.

In truly-unexpected situations, Rust code can panic. If things are really unrecoverable, panicking causes the program to crash and exit immediately.

Errors

I said earlier that .unwrap() panics if the Result is not Ok. This is tempting, but is not the way to deal with Results:

let url = "https://ggbaker.ca/";
let response = get(url).unwrap();
let body = response.text().unwrap();

You don't know that those will both succeed.

Errors

But you may be in a situation where you know more than the compiler about your results. In particular, maybe you have already checked a condition that you know is synonymous with .is_ok().

This is perfectly reasonable:

// .text() consumes the response, so call it once and keep the Result
let text_result = response.text();
if text_result.is_ok() {
    println!("Length: {}.", text_result.unwrap().len());
}

Errors

Or maybe a conversion where you know it will succeed (a string .parse here):

let number: i64;
if user_input.len() < 7 && is_all_digits(&user_input) {
    number = user_input.parse().unwrap();
} else {
    println!("Bad user. You get nothing.");
    number = 0;
}

Basically, your code shouldn't panic in any reasonable circumstance.
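
For completeness, a sketch of how that check could be packaged up. The example above never shows is_all_digits, so the version here is an assumption about what it does, and parse_small_number is a made-up wrapper.

```rust
// Assumed behaviour of the helper used above: true only for a
// non-empty string made up entirely of ASCII digits.
fn is_all_digits(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

// Hypothetical wrapper around the same logic, returning 0 for bad input.
fn parse_small_number(user_input: &str) -> i64 {
    if user_input.len() < 7 && is_all_digits(user_input) {
        // One to six ASCII digits always fit in an i64, so this cannot fail.
        user_input.parse().unwrap()
    } else {
        0
    }
}

fn main() {
    println!("{}", parse_small_number("123456"));
    println!("{}", parse_small_number("1234567"));
    println!("{}", parse_small_number("12a"));
}
```

The empty-string check matters: an empty string has "all digits" in the vacuous sense, but .parse() would fail on it and the .unwrap() would panic.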

Errors

The ways we see errors in Rust:

Option<T>: not really an error, but a possibly-missing T. Some(T) or None.
Result<T, E>: usually Ok(T), but an exception-like failure is possible with an Err(E).
Panic. Don't panic unless it's unexpected and unrecoverable. Can't be caught. *

[* almost]
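
One possible reading of that "almost" (my assumption, not something spelled out above): the standard library has std::panic::catch_unwind, which can stop an unwinding panic at a boundary. A minimal sketch, with risky_index as a made-up function:

```rust
use std::panic;

// Hypothetical function that panics on an out-of-bounds index.
fn risky_index(i: usize) -> i32 {
    let v = vec![10, 20, 30];
    v[i]
}

fn main() {
    let good = panic::catch_unwind(|| risky_index(1));
    let bad = panic::catch_unwind(|| risky_index(99));
    println!("{:?}", good.ok()); // Some(20)
    println!("{}", bad.is_err()); // true (the panic message still goes to stderr)
}
```

This is not meant as a general try/catch: it only works when panics unwind (not when the program is built to abort on panic), and Result is still the way to express recoverable errors.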