The fundamental problem with manual memory management (like C and C++): it's often not clear whose responsibility it is to free the heap variables. If they are referenced in a data structure, will it free them in its destructor? Do we have to keep a reference to free them?
In other words, who owns the memory? If we had a rule about who owns the variables, we would know who must delete them.
In C++, the unique_ptr
enforces ownership: whatever code holds the unique_ptr
is responsible for that heap memory, and it's unique by design.
So, C++ allows us to enforce ownership of heap memory, but doesn't force us.
Garbage collectors (and reference counting, and shared_ptr
) side-step the problem by detecting memory that is no longer being used at runtime. These all have a small but non-zero cost.
That's often okay: garbage collectors are fast, and most code isn't that time-sensitive.
What if a language forced us to keep track of ownership, and kept us safe that way? There would be no runtime cost (from garbage collecting or reference counting) and no chance of losing all references to heap memory.
The Rust programming language was designed with ownership in mind.
We will be using it as an example of (1) another language where we can apply the concepts from the previous section of the course, and (2) how a language can be designed to keep us safe (in ways we will see through this section of the course).
Rust is much closer to languages you already know than Haskell. Rust is…
Additional resource: The Rust Book, a free online language intro.
A Rust hello world:
fn main() { println!("Hello, world!"); }
If saved as hello.rs
, it can be compiled and run:
rustc hello.rs
./hello
Or we can use Rust's build system Cargo. In your project directory, create a config file Cargo.toml
(can be created with cargo init
), a directory src
and main.rs
:
Cargo.toml
src
└── main.rs
But then, simply:
cargo run
One first tool: the println!
macro. It prints a line of text, using {}
to mark places for values to be inserted, and {:?}
for more debugging-style output.
println!("Hello, world!"); println!("The result: {}", 3 + 4); println!("A {} B {} C", 18 / 2, "test"); println!("A {:?} B {:?} C", 18 / 2, "test");
Hello, world! The result: 7 A 9 B test C A 9 B "test" C
Also note: semicolons at the end of statements.
Rust has variables, as you'd expect in an imperative language, with type inference when declaring and initializing.
let count = 12; let length = 12.345; let message = "Hello world";
This creates three variables: an integer, a floating point, and a string.
We can also be explicit about the type when creating a variable:
let count_int: i64 = 12; let count_byte: u8 = 12; let length_dbl: f64 = 12.345; let length_sgl: f32 = 12.345; let message: &str = "Hello world"; let another_int: i64; another_int = -1234;
It's idiomatic in Rust to not explicitly give the type unless it's necessary to disambiguate.
[Note: another_int
is created here without being initialized.]
Basic types in Rust are explicit about their size (in bits): i8
, i16
, i32
, i64
, i128
for signed integers; u8
, u16
, u32
, u64
, u128
for unsigned; f32
, f64
for floating-point.
There are also integer types that are the size of a memory address on the system architecture: isize
and usize
. These types are used where you are referring to stuff in memory (e.g. array sizes or indexing).
Rust characters are single-quoted and are Unicode scalar values (≈ Unicode code points ≈ Unicode characters).
let letter = 'A'; let emoji = '👍'; let another_letter: char = 'B';
Also, booleans:
let a = true; let b: bool = false; let c = a && b; // logical and let d = a || b; // logical or let e = !a; // negation
Tuples in Rust are a way to wrap up multiple values in a single variable/argument/etc.
let vals1 = (80, true); let vals2: (i64, bool) = (120, false); let vals3 = (80, 1.234); let vals4 = ("Hello world", 123, 4.5678);
Both vals1
and vals2
have the same type here ((i64,bool)
), but vals3
has a different type ((i64,f64)
).
So, you can compare vals1 == vals2
but vals1 == vals3
won't compile.
Basically, the same rules as Haskell.
If we need to get at elements of a tuple, we index with a dot (.), like accessing a property in most languages:
let vals = (80, true); println!("first {}", vals.0); // first 80 println!("second {}", vals.1); // second true
Arrays contain multiple values of the same type, but unlike most languages the size of the array is part of the type. That is, these have the same type: array of four f64
(written [f64; 4]
):
let lengths = [1.2, 3.4, 5.6, 7.8]; let widths = [12.0, 34.0, 56.0, 78.0];
So, something like lengths == widths
makes sense (and returns false
because the elements are different).
But these have different types:
let shorter = [1, 2, 3]; let longer = [4, 5, 6, 7, 8, 9];
Or equivalently:
let shorter: [i64; 3] = [1, 2, 3]; let longer: [i64; 6] = [4, 5, 6, 7, 8, 9];
The number of elements of an array is fixed and must be known at compile-time. (But slices and vectors exist: more later.)
Indexing array elements is with []
, as you might expect:
let lengths = [1.2, 3.4, 5.6, 7.8]; let n: usize = 2; println!("{} {}", lengths[0], lengths[n]); // 1.2 5.6
Array indices must be of type usize: an unsigned integer big enough to refer to a memory location on this architecture.
But variables aren't quite like variables in other languages. They can't vary (yet):
let value: i64; value = 1; value = 2;
The first two lines here are fine, but the last one causes a compilation error:
cannot assign twice to immutable variable `value`
Variables in Rust are immutable by default: you are allowed to assign to them only once. (In the initializer or later, but only once.)
A variable must be declared mutable if it will be changed after initial assignment:
let mut value: i64; value = 1; value = 2; println!("{}", value); // prints 2
If we declare a variable mut
but don't assign to it multiple times, the compiler warns us:
warning: variable does not need to be mutable
Maybe this is a first hint: the Rust compiler is very careful when handling our code.
Rust also has the concept of constants:
const MESSAGE: &str = "Hello"; let input = get_user_input();
A constant must be known at compile time. An immutable variable can be determined as the program runs, but we promise it won't change after initial assignment.
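A minimal sketch of that difference. The names MAX_ITEMS and clamp_to_max are made up for illustration, and get_user_input here is only a stand-in for any value that is not known until the program runs (it simply counts command-line arguments):

```rust
// A constant: its value must be computable at compile time.
const MAX_ITEMS: usize = 10;

// Hypothetical stand-in for a value that is only known at run time.
fn get_user_input() -> usize {
    std::env::args().count()
}

// Combines a run-time value with the compile-time constant.
fn clamp_to_max(requested: usize) -> usize {
    if requested > MAX_ITEMS { MAX_ITEMS } else { requested }
}

fn main() {
    // Immutable variable: determined as the program runs, never reassigned.
    let input = get_user_input();
    let shown = clamp_to_max(input);
    println!("showing {} of at most {} items", shown, MAX_ITEMS);
}
```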
Basic control flow structures are similar to other imperative languages. The if
statement is spelled like C/Java/C#, but without the parens around the condition:
let count = 3; if count < 10 { println!("A small number"); } else if count < 100 { println!("A large number"); } else { println!("A very large number"); }
Similarly, the while
loop is written like C/Java/C#, without the parens:
let mut i = 1; while i < 100 { i *= 2; } println!("{}", i);
The for
loop is primarily about iterating through a collection (technically, anything that implements the Iterator trait).
let values = [4, 5, 6, 7]; for v in values.iter() { println!("v is {}", v); }
Output:
v is 4 v is 5 v is 6 v is 7
But if you want a counting
for
loop, there's a convenient way to create a range:
for i in 1..4 { println!("i is {}", i); }
Output:
i is 1 i is 2 i is 3
Note: includes the range start, but not the end (i.e. not 4 here).
But there is an inclusive range, inserting an = :
for i in 1..=4 { println!("i is {}", i); }
Output:
i is 1 i is 2 i is 3 i is 4
Rust makes a subtle distinction that isn't always obvious in other languages: statements and expressions are different. Statements end with a semicolon; expressions do not.
This usage of if contained statements (in the {…}), and therefore behaved like a statement:
if count < 10 { println!("A small number"); } else if count < 100 { println!("A large number"); } else { println!("A very large number"); }
But an if
in Rust is an expression that can return a result: just make the {…}
contents expressions.
let result: i64; result = if count < 10 { 100 } else { 10 * 100 };
Or the content of the {…}
can be more than a simple expression, as long as it ends with an expression (i.e. no semicolon) that is the result
.
result = if count < 10 { 100 } else { let x = 10; x * 100 };
In Rust, anywhere you need an expression, you can use a code block that has an expression (no semicolon) as the last thing it does:
result = { let x = 10; x * 100 } + 10;
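Pulling these pieces together, here is a small sketch of if and a bare block both used as expressions. The function names describe and block_value are made up for this example:

```rust
// Each branch ends with an expression (no semicolon),
// so the whole `if` evaluates to that branch's value.
fn describe(count: i64) -> &'static str {
    if count < 10 {
        "small"
    } else if count < 100 {
        "large"
    } else {
        "very large"
    }
}

// A block used where an expression is needed: its value is its last expression.
fn block_value() -> i64 {
    let result = {
        let x = 10;
        x * 100
    } + 10;
    result
}

fn main() {
    println!("{} {} {}", describe(3), describe(50), describe(500));
    println!("{}", block_value());
}
```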
We have already seen the main
function, but we're going to need more.
Functions are defined with the fn
keyword. Argument and return types must be given explicitly. Local variables work as expected.
fn add_three(x: i64, y: i64, z: i64) -> i64 { let result = x + y + z; return result; }
But we don't have to return
explicitly. If the last thing in a function is an expression (i.e. calculation with no trailing ;
), that will be the return value. This is equivalent, and more idiomatic:
fn add_three_2(x: i64, y: i64, z: i64) -> i64 { x + y + z }
We can also write anonymous functions. They are called closures in the Rust docs, and do function as closures. The syntax is like:
|m| m%n==0
|a, b| a + b
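To see both closures in a context where they can run, here is a small sketch. The wrapper function closure_demo is made up for illustration, and the argument types are annotated so that nothing depends on type inference:

```rust
// Builds and calls two closures; `divisible` captures `n` from its environment.
fn closure_demo(n: i64) -> (bool, i64) {
    let divisible = |m: i64| m % n == 0; // uses n from the enclosing function
    let add = |a: i64, b: i64| a + b;    // captures nothing
    (divisible(12), add(3, 4))
}

fn main() {
    println!("{:?}", closure_demo(3));
    println!("{:?}", closure_demo(5));
}
```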
A quick aside, so we have another type to work with…
The Rust Vec
type is similar to an array, but can be modified in-place: have elements added, removed.
The Vec
type also provides a lot of convenient methods to manipulate it.
An empty vector can be created with Vec::new
, or we can convert an array (or other types) to a vector with Vec::from
. These have the same result: creating a vector of u8
values 1, 2, 3.
let mut values = Vec::new(); values.push(1u8); values.push(2u8); values.push(3);
let mut values: Vec<u8> = Vec::from([1, 2]); values.push(3);
Some new language syntax we just saw: types can contain static methods accessed by ::, and instances can contain methods accessed by a dot (.):
let mut values = Vec::new(); values.push(1u8); values.push(2u8); values.push(3);
The Rust convention is to have a static method Type::new
as a constructor (i.e. function that returns a new instance of the type).
Some more useful things a Vec
can do:
println!("{}", values[0]); // indexing println!("{}", values == Vec::from([1, 2])); // comparison for v in values.iter() { // iteration println!("{}", v); } values.insert(0, 10); // insert value println!("{:?}", values); // debug-format printing
1 false 1 2 3 [10, 1, 2, 3]
As soon as we have functions, we have a question about large
function arguments. If we have a massive data structure as an argument, we don't want to copy it because that's expensive.
How can we share values around our code without copying them all the time?
Rust (like other languages) has the concept of references. The &
is used to create a reference to a variable, and used to annotate the type to mean a reference to
. Here, n2
, n3
are type &i32
and are references to n1
.
let n1 = 1234; // literal i32 value 1234 let n2: &i32 = &n1; // a &i32 referring to n1 let n3 = &n1; // a &i32 referring to n1 let n4 = n1; // another i32 variable containing a copy of 1234 println!("{} {} {} {}", n1, *n2, *n3, n4);
All are 1234
when printed. Note dereferencing with *
.
Here is similar C++ code:
int n1 = 1234; int* n2 = &n1; int* n3 = &n1; int n4 = n1; cout << n1 << *n2 << *n3 << n4 << endl;
We can follow any reference to modify the value:
*n3 = 4567; cout << n1 << *n2 << *n3 << n4 << endl;
Here, n4
is still 1234, but the rest are 4567.
We can try something similar in Rust:
let mut n1 = 1234_i64; let n2 = &n1; n1 = 4567; println!("{}", n2); // n2 is still in use after the assignment
This fails to compile with this error on line 3:
cannot assign to `n1` because it is borrowed
In Rust, every value is owned in a specific way. Whoever owns the value can read/write it (with some details to come later). The compiler enforces this to ensure our memory is accessed in a safe
way.
In this case: creating a reference to a value temporarily gives responsibility for that value to the reference. The reference borrows the value or borrows ownership
of the value.
You can do one of: use the original value; have several immutable references to it; have one mutable reference. This works:
let mut n1 = 1234; let n2 = &mut n1; // note: a mutable reference *n2 = 4567; println!("{}", *n2); //println!("{} {}", n1, n2); // "cannot borrow `n1` as immutable because it is also borrowed as mutable"
The word borrow
here is appropriate: when a reference is finished with a value, it's given back. This also works:
let mut n1 = 1234; { let n2 = &mut n1; *n2 = 4567; println!("{}", *n2); // prints 4567 } // n2 is out of scope here, so no longer has a reference n1 = 7890; println!("{}", n1); // prints 7890
At each point in the code, exactly one variable had mutable ownership of the actual memory contents.
Multiple immutable references worked above, but multiple mutable references fail:
let mut n1 = 1234; let n2 = &mut n1; let n3 = &mut n1; *n2 += 1; // n2 is still in use after n3 is created
Compilation fails on line 3:
cannot borrow `n1` as mutable more than once at a time
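For contrast, a sketch of a sequence the compiler does accept: several immutable references at once, then a single mutable reference after the others are no longer used. The function name borrow_demo is made up for illustration, and the example relies on a borrow ending at the reference's last use:

```rust
fn borrow_demo() -> (i64, i64) {
    let mut n1 = 1234;

    // Any number of immutable references may exist at the same time.
    let r1 = &n1;
    let r2 = &n1;
    let sum = *r1 + *r2; // last use of r1 and r2

    // The immutable borrows are finished, so one mutable borrow is allowed.
    let m = &mut n1;
    *m += 1;

    // m is no longer used, so n1 can be read directly again.
    (sum, n1)
}

fn main() {
    println!("{:?}", borrow_demo());
}
```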
The idea of ownership applies to values passed to functions as well. When a value is passed to a function, ownership is given permanently to the function. Or, ownership moves to the function: is transferred permanently.
Let's work with vectors, which have more interesting operations we can test. We can easily add up some values, using tricks from the functional world:
fn sum_vec(values: Vec<i64>) -> i64 { values.iter().fold(0, |a, b| a + b) }
Note: |a, b| a + b
is an anonymous two argument function that returns a+b
. We can use this:
let numbers = Vec::from([1, 2, 3, 4]); println!("{}", sum_vec(numbers)); // prints 10
But one line later:
let numbers = Vec::from([1, 2, 3, 4]); println!("{}", sum_vec(numbers)); // prints 10 println!("{:?}", numbers);
On the third line:
borrow of moved value: `numbers`
Our main code no longer owns the Vec
: it was given to the function, and it's not ours (to lend to println!
) anymore.
But references will get us out of the problem. We can (1) lend the Vec
to a reference, (2) move the reference to the function so when the function exits, (3) the reference is destroyed, and (4) ownership returns to the main code.
Let's try again…
Only changes: argument changed from Vec
to &Vec
, and argument became a reference: &numbers
.
fn sum_vec(values: &Vec<i64>) -> i64 { values.iter().fold(0, |a, b| a + b) }
let numbers = Vec::from([1, 2, 3, 4]); println!("{}", sum_vec(&numbers)); // prints 10 println!("{:?}", numbers); // prints [1, 2, 3, 4]
Now ownership stays with the main code, except while sum_vec
is running (and the main code is waiting). The compiler proves to itself that everything is okay, so we can do this.
The same ownership rules apply to mutable reference arguments: only one mutable reference may exist at a time.
A function can ask for a mutable reference:
fn append_sum(values: &mut Vec<i64>) { let sum = values.iter().fold(0, |a, b| a + b); values.push(sum); }
Now we must give a mutable reference: if the variable isn't mut
or the reference isn't &mut
, this won't compile.
let mut numbers = Vec::from([1, 2, 3, 4]); append_sum(&mut numbers); println!("{:?}", numbers); // prints [1, 2, 3, 4, 10]
There is no question in Rust whether or not a function can modify its argument. If the value or an immutable reference is passed: no. If a mutable reference is passed: yes.
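Rust has these rules around ownership, from The Rust Book: each value in Rust has an owner; there can only be one owner at a time; when the owner goes out of scope, the value is dropped.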
You can return a value from a function: you give ownership away. You can pass a value to a function: you give the function ownership.
There are no worries about freeing memory: when the owner is destroyed, Rust knows (at compile-time) to free the value.
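And these rules for borrowing, from The Rust Book: at any given time, you can have either one mutable reference or any number of immutable references; and references must always be valid.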
If the Rust compiler can't prove the reference will always be valid, the code won't compile.
Rust keeps us safe from multiple mutable aliases, but still allows temporary borrowing into a function. Copying references is cheap, but then borrowing rules come into play.
We can be sure of which code modifies values: only if it has a mutable reference.
In the discussion of references and ownership, there was an omission about what gets moved.
Suppose we have some simple functions that work on integers and vectors:
fn print_int(n: i64) { println!("{:?}", n); } fn print_vec(v: Vec<i64>) { println!("{:?}", v); }
Note: arguments are values, not references.
And we use them:
let n: i64 = 7; let v: Vec<i64> = Vec::from([1, 2, 3, 4]); println!("n: {:?}", n); println!("v: {:?}", v); print_int(n); print_vec(v); println!("n: {:?}", n); //println!("v: {:?}", v); // "borrow of moved value: `v`"
Why is ownership of a Vec
moved, but an i64
not?
Rust makes a distinction between types that can be easily
copied and ones that can't.
The idea: if copying them is inexpensive, we don't have to worry about ownership and just let the language make the copy automatically when needed.
Specifically, types that implement the Copy
trait are implicitly copied when they are passed around like this. (Trait
in Rust ≈ typeclass in Haskell or interfaces in other languages: more soon.)
Non-Copy
types have their ownership given away when they are assigned to a different variable (or function argument).
The simple scalar types (like i64
) are Copy
. Vectors are not.
The same happens for variable assignment:
let n1: i64 = 7; let v1: Vec<i64> = Vec::from([1, 2, 3, 4]); let n2 = n1; // implicitly a copy, because i64 is Copy let v2 = v1; // implicitly a move, because Vec is not Copy println!("n1: {:?}", n1); println!("n2: {:?}", n2); //println!("v1: {:?}", v1); // "borrow of moved value: `v1`" println!("v2: {:?}", v2);
Copy
is implemented for values that can be duplicated by duplicating the bits they have on the stack.
The Vec
keeps most of its data on the heap: copying its stack value would create multiple references to that heap memory.
If you really want to duplicate a value, there's another trait for that: Clone
. It guarantees a .clone()
method that safely duplicates a value.
We have to call .clone()
explicitly if we want it to happen: copying is implicit but cloning is not, because cloning might be expensive.
Finally, we can use a value in multiple places if we really need to.
let v1: Vec<i64> = Vec::from([1, 2, 3, 4]); let v2 = v1.clone(); // a full copy of v1 print_vec(v1.clone()); println!("v1: {:?}", v1); println!("v2: {:?}", v2);
[For print_vec
, borrowing through a non-mut
reference would make much more sense.]
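A sketch of that borrowing alternative. The function format_vec is a made-up variant of print_vec that returns the text instead of printing it, only so the result is easy to check:

```rust
// Takes an immutable reference: the caller keeps ownership and nothing is cloned.
fn format_vec(v: &Vec<i64>) -> String {
    format!("{:?}", v)
}

fn main() {
    let v1: Vec<i64> = Vec::from([1, 2, 3, 4]);
    println!("{}", format_vec(&v1)); // lend v1 for the duration of the call
    println!("v1: {:?}", v1);        // v1 is still ours afterwards
}
```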
A struct in Rust can be used to group multiple values together (like a tuple). Each field in the struct gets named.
#[derive(Debug, Clone)] struct GeoPoint { pub lat: f64, pub lon: f64, pub ele: i32, }
This creates a struct GeoPoint
that has three fields (lat
, lon
, ele
), each with its own type.
All of these fields are public (pub
): they can be accessed from outside code.
The #[derive…]
line gets us free implementations of Debug
(printing with {:?}
) and Clone
(a .clone()
method).
We can now create an instance of the struct and work with it.
let p = GeoPoint { lat: 49.267, lon: -122.967, ele: 68, }; println!("{:?}", p);
GeoPoint { lat: 49.267, lon: -122.967, ele: 68 }
Like any other variable, structs can be mutable or not.
let mut p = GeoPoint { lat: 49.267, lon: -122.967, ele: 68, }; p.ele = 168; println!("{:?}", p);
GeoPoint { lat: 49.267, lon: -122.967, ele: 168 }
Without mut
, setting p.ele
would fail.
We can define methods in structs (or other types we define):
impl GeoPoint { fn antipode(&self) -> GeoPoint { GeoPoint { lat: -self.lat, lon: -self.lon, ele: self.ele, } } }
This creates a method .antipode
that returns a new GeoPoint
.
Then this works like we would expect a method to work:
let p = GeoPoint { lat: 49.267, lon: -122.967, ele: 68, }; println!("{:?}", p.antipode());
GeoPoint { lat: -49.267, lon: 122.967, ele: 68 }
The self
argument is assumed to be the type of the struct we're working with, and given as a short-form. We can be explicit if we want, and these two are equivalent:
fn antipode(&self) -> GeoPoint {…}
fn antipode(self: &GeoPoint) -> GeoPoint {…}
The argument name self
is special: it's the receiver argument. When it's included, the function becomes a method.
The receiver argument can be the value itself (self
) or a reference (&self
) or a mutable reference (&mut self
). *
Methods can change the struct (if they have a mutable reference) and take additional arguments.
impl GeoPoint { fn go_up(&mut self, height: i32) { self.ele += height; } }
With p
defined as mut
,
println!("{}", p.ele); // possible because .ele is pub p.go_up(32); println!("{}", p.ele);
68 100
If we implement a function on a type that does not have a receiver argument (self
), it is an associated function (≈ static method). It's conventional (but not required) to have a new
function as a constructor:
impl GeoPoint { fn new(lat: f64, lon: f64) -> GeoPoint { GeoPoint { lat: lat, lon: lon, ele: 0, } } }
Associated functions are accessed from the type with ::
let p = GeoPoint::new(49.267, -122.967); println!("{:?}", p);
GeoPoint { lat: 49.267, lon: -122.967, ele: 0 }
So TypeName::foo for an associated function; instance.bar for a method.
There is a shortcut when creating structs: if we have local variables with names that match the fields, we can use them directly. So our ::new
could equivalently be:
fn new(lat: f64, lon: f64) -> GeoPoint { GeoPoint { lat, lon, ele: 0 } }
Note that the self
argument is like any other in the way it handles ownership. If you take self
(not a reference: the actual value), ownership moves to the method.
fn consume(self) { println!("The GeoPoint is mine now!"); }
This fails since GeoPoint
is not Copy
:
let p = GeoPoint::new(49.267, -122.967); p.consume(); println!("{}", p); // borrow of moved value: `p`
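Taking self by value is not only a way to lose a value: it is also how a method can hand the contents on to something else. A small sketch, using a reduced copy of GeoPoint and two method names (into_parts, elevation) invented for this example:

```rust
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

impl GeoPoint {
    // Borrows self: the caller still owns the GeoPoint afterwards.
    fn elevation(&self) -> i32 {
        self.ele
    }

    // Takes self by value: ownership moves into the method,
    // which gives the fields back as a tuple.
    fn into_parts(self) -> (f64, f64, i32) {
        (self.lat, self.lon, self.ele)
    }
}

fn main() {
    let p = GeoPoint { lat: 49.267, lon: -122.967, ele: 68 };
    println!("{}", p.elevation()); // fine: only a borrow
    let parts = p.into_parts();    // p is moved here
    println!("{:?}", parts);
    // Using p after this point would be a "borrow of moved value" error.
}
```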
There are no classes in Rust: we get structs.
Also, there's no concept of inheritance
on structs. We can create a new struct that uses another, but not inherit from it (as we could in C++, Java, Python, etc).
But we do get traits. A trait defines a set of things a type must have (methods, etc). It is very much like a typeclass in Haskell or interface in Java, C#.
Our structs (or other types we define) can implement traits that are defined by the Rust standard library, or we can create our own traits.
Rust uses traits heavily to describe various aspects of the language.
We have seen Copy
and Clone
as traits. Copy
is special because of the way it affects ownership moves vs duplication. But Clone
is a more normal trait: it guarantees that the .clone()
method exists.
We have also implicitly seen the Display
and Debug
traits. They have made it possible to print things:
"{}" uses the Display trait's .fmt method.
"{:?}" uses the Debug trait's .fmt method.
Most things implement Debug, which is supposed to be programmer-friendly output. If you want to be printable in a user-friendly way, that's Display.
For our GeoPoint
, we got a Debug
implementation from the #[derive(Debug, Clone)] line: it calls a macro that builds a reasonable Debug::fmt.
We can implement Display
to make it print nicely:
use std::fmt; impl fmt::Display for GeoPoint { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{}°N {}°E @ {}m", self.lat, self.lon, self.ele) } }
impl fmt::Display for GeoPoint {…}: this block implements fmt::Display for GeoPoint.
The signature of fmt must be this: that's part of the trait.
write! works like println!, except that it writes to the formatter f and returns a fmt::Result.
Or nicer, if I can use more than one line:
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!( f, "{}°{} {}°{} @ {}m", self.lat.abs(), if self.lat < 0.0 { 'S' } else { 'N' }, self.lon.abs(), if self.lon < 0.0 { 'W' } else { 'E' }, self.ele ) }
Now we can do this:
let p = GeoPoint::new(49.267, -122.967); println!("{:?}", p); println!("{}", p); let s = format!("build a string {} !", p); println!("{}", s);
… and get this output:
GeoPoint { lat: 49.267, lon: -122.967, ele: 0 }
49.267°N 122.967°W @ 0m
build a string 49.267°N 122.967°W @ 0m !
The code for fmt
did this:
write!(f, "{}°N {}°E @ {}m", self.lat, self.lon, self.ele)
Formatting with {}
worked, so the f64
s and i32
must implement Display
. How do we know? The docs.
Another simple trait from the Rust standard library: Default
. Types that implement Default
can construct themselves with a default value:
impl Default for GeoPoint { fn default() -> GeoPoint { GeoPoint { lat: 0.0, lon: 0.0, ele: 0, } } }
Or we can define and implement our own traits.
pub trait Upable { fn move_up(&mut self, height: i32); fn copy_up(&self, height: i32) -> Self; } impl Upable for GeoPoint { fn move_up(&mut self, height: i32) { self.ele += height; } fn copy_up(&self, height: i32) -> GeoPoint { let mut dup = (*self).clone(); dup.move_up(height); dup } }
Using those:
let mut p = GeoPoint::default(); println!("{:?}", p); p.move_up(3); println!("{:?}", p); println!("{:?}", p.copy_up(3));
GeoPoint { lat: 0.0, lon: 0.0, ele: 0 } GeoPoint { lat: 0.0, lon: 0.0, ele: 3 } GeoPoint { lat: 0.0, lon: 0.0, ele: 6 }
We have already seen generic types in Rust: a Vec
can hold any type: we can have a Vec<i64>
or Vec<String>
or Vec<Vec<i64>>
and they are all distinct types.
Vectors are a generic type, and take a type parameter that indicates the type they hold.
When instantiating a generic type, we need to specify the type parameter somehow: by giving it explicitly, or it can be implied by the initializer, or type-inferred from its usage.
All of these create a Vec<i64>
:
let v1: Vec<i64> = Vec::new(); // explicit type let v2 = Vec::<i64>::new(); // type parameter in initializer let v3 = returns_a_vec_i64(); // implied by initializer let mut v4 = Vec::new(); // type-inferred in next line... v4.push(1234_i64);
We can create our own generic functions:
fn pair_with_one<T>(a: T) -> (T, i64) { (a, 1) }
The <T>
indicates that T
is a type parameter in this definition. Like with Haskell, we expect the types to work themselves out as specified:
let pair1: (&str, i64) = pair_with_one("hello"); let pair2: (f64, i64) = pair_with_one(1.2345);
fn pair_with_one<T>(a: T) -> (T, i64) { (a, 1) }
In this code, we know a
is a function argument. The <T>
works like an argument to the definition of the function: because the <T>
is there, there's a type T
that can be used for the rest of the code block.
Or a generic type with a type parameter:
#[derive(Debug)] struct TwoVecs<T> { pub first: Vec<T>, pub second: Vec<T>, } impl<T> Default for TwoVecs<T> { fn default() -> TwoVecs<T> { TwoVecs { first: Vec::default(), // type-inferred as Vec<T> second: Vec::default(), } } }
Our type is used similarly to Vec
: we specify the type parameter somehow, and it can hold any type we want.
let mut tv = TwoVecs::<i32>::default(); let mut tv = TwoVecs::default(); tv.first.push(1234); println!("{:?}", tv); // TwoVecs { first: [1234], second: [] }
We quickly hit a problem similar to what we saw in Haskell:
fn is_larger<T>(a: T, b: T) -> bool { a > b }
This fails to compile: we have no guarantee that >
is defined on this type.
binary operation `>` cannot be applied to type `T`
The solution is to specify a trait that we demand the type implement.
fn is_larger<T: Ord>(a: T, b: T) -> bool { a > b }
This function now takes two arguments (1) of the same type, (2) if that type implements Ord
.
println!("{}", is_larger(2, 3)); println!("{}", is_larger("two", "one"));
This would fail with "the trait bound `GeoPoint: Ord` is not satisfied" because GeoPoint does not implement Ord:
is_larger(GeoPoint::default(), GeoPoint::new(0.0, 0.0))
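Trait bounds are not limited to standard library traits. As a sketch, here is a generic function bounded by a trait of our own; HasElevation, Peak, Valley and is_higher are all made-up names for this example:

```rust
// A small user-defined trait.
trait HasElevation {
    fn elevation(&self) -> i32;
}

struct Peak {
    ele: i32,
}

struct Valley {
    depth: i32,
}

impl HasElevation for Peak {
    fn elevation(&self) -> i32 {
        self.ele
    }
}

impl HasElevation for Valley {
    fn elevation(&self) -> i32 {
        -self.depth
    }
}

// T and U can be any types, as long as each implements HasElevation.
fn is_higher<T: HasElevation, U: HasElevation>(a: &T, b: &U) -> bool {
    a.elevation() > b.elevation()
}

fn main() {
    let peak = Peak { ele: 100 };
    let valley = Valley { depth: 20 };
    println!("{}", is_higher(&peak, &valley));
    println!("{}", is_higher(&valley, &peak));
}
```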
Many core ideas in Rust are implemented through traits that let a programmer access those features. If you want to know about a Rust type, one of the critical things to ask: what traits does it implement?
We have already seen Copy
, which makes a huge difference in the way a type behaves with respect to moving/ownership.
For example, what can Vec do (as described by the docs)?
Its methods described in the docs.
Its trait implementations.
Methods from Deref<Target = [T]>, because we can do *vec to get the underlying array (slice).
Vec implements Ord and IntoIterator, so we expect this to work:
let vec = Vec::from([1, 2, 3]); println!("{}", vec <= vec); for e in vec.into_iter() { println!("{}", e); }
The for
loop actually works on anything that implements IntoIterator
: we could have omitted the .into_iter()
and it would be implicitly inserted.
Because so much of the language is accessible by the standard library traits, we can make our own types behave like the built-in types in many ways.
For example, the std::ops
and std::cmp
traits can let us overload basic operators.
We can make our GeoPoint
work with + (i.e. operator overloading) by implementing Add:
use std::ops::Add; impl Add for GeoPoint { type Output = GeoPoint; fn add(self, other: GeoPoint) -> GeoPoint { GeoPoint { lat: self.lat + other.lat, lon: self.lon + other.lon, ele: self.ele + other.ele, } } }
Then we can add GeoPoint
s:
let p = GeoPoint::new(49.267, -122.967); let offset = GeoPoint { lat: 1.0, lon: -1.0, ele: 10, }; println!("{:?}", p + offset);
Compare AddAssign
that corresponds to +=
.
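A sketch of what an AddAssign implementation could look like. It uses a reduced copy of GeoPoint, with PartialEq derived only so the result can be compared in a check:

```rust
use std::ops::AddAssign;

#[derive(Debug, Clone, PartialEq)]
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

// `a += b` calls a.add_assign(b): it changes a in place and takes ownership of b.
impl AddAssign for GeoPoint {
    fn add_assign(&mut self, other: GeoPoint) {
        self.lat += other.lat;
        self.lon += other.lon;
        self.ele += other.ele;
    }
}

fn main() {
    let mut p = GeoPoint { lat: 49.0, lon: -123.0, ele: 68 };
    p += GeoPoint { lat: 1.0, lon: -1.0, ele: 10 };
    println!("{:?}", p);
}
```

Unlike Add, there is no Output type here: the left-hand side is modified through &mut self, so it must be declared mut.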
The ==
operator maps to the PartialEq
trait. It has a type parameter: what type is the right-hand-side of the comparison.
impl PartialEq<GeoPoint> for GeoPoint { fn eq(&self, other: &GeoPoint) -> bool { self.lat == other.lat && self.lon == other.lon } }
Then:
let p = GeoPoint::new(49.267, -122.967); println!("{}", p == p); // true println!("{}", p == GeoPoint::default()); // false
The right-hand-side type could be different than the type we're working with, but probably not. We could (even if it doesn't make much sense):
impl PartialEq<i64> for GeoPoint { fn eq(&self, other: &i64) -> bool { false } }
println!("{}", GeoPoint::default() == 6); // false
So, we can compare vectors with ==
because Vec<T>
implements PartialEq<Vec<U>>
if T
implements PartialEq<U>
. Or, as it's expressed in Rust code:
impl<T, U> PartialEq<Vec<U>> for Vec<T> where T: PartialEq<U>
But (from the docs, I learned) Vec<T>
also implements PartialEq<[U; N]>
where T: PartialEq<U>
. So we should be able to compare a vector and array:
let v: Vec<i64> = Vec::from([1, 2, 3]); let a: [i64; 3] = [1, 2, 3]; println!("{}", v == a); // true
Another common trait that we have been using: From
. It is the conventional way to express conversions from one type to another, along with Into
.
We expect to be able to convert and construct an instance of T
from a U
where it makes sense.
let u: U = …; let t1 = T::from(u); let t2: T = u.into();
We just used it in the last example:
let v: Vec<i64> = Vec::from([1, 2, 3]);
This worked because Vec<T>
implements From<[T; N]>
. Or:
let v: Vec<i64> = [1, 2, 3].into();
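We can implement From for our own types too. As a sketch (the choice of converting a (lat, lon) tuple is an assumption made for this example, and GeoPoint is a reduced copy with PartialEq derived for checking):

```rust
#[derive(Debug, PartialEq)]
struct GeoPoint {
    lat: f64,
    lon: f64,
    ele: i32,
}

// Hypothetical conversion: a (lat, lon) pair becomes a GeoPoint at elevation 0.
impl From<(f64, f64)> for GeoPoint {
    fn from(pair: (f64, f64)) -> GeoPoint {
        GeoPoint { lat: pair.0, lon: pair.1, ele: 0 }
    }
}

fn main() {
    let p1 = GeoPoint::from((49.267, -122.967));
    // Implementing From provides the matching .into() automatically.
    let p2: GeoPoint = (49.267, -122.967).into();
    println!("{:?} {:?}", p1, p2);
}
```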
From
comes up a lot: Rust doesn't usually do automatic type conversions (type coercion) for us (except when it does).
One more: Drop
. If it's implemented, Rust automatically calls .drop()
before the memory is freed: it's a destructor.
If you need to do some custom cleanup before the value is destroyed (e.g. drop a network connection, close a file), you can implement Drop
.
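A sketch of Drop in action. The Connection type is hypothetical, and the global counter exists only so that we can observe when drop has run:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many connections have been cleaned up (for observation only).
static CLOSED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical resource that needs cleanup.
struct Connection {
    name: String,
}

impl Drop for Connection {
    fn drop(&mut self) {
        println!("closing {}", self.name);
        CLOSED.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many connections were dropped while this function ran.
fn open_and_close() -> usize {
    let before = CLOSED.load(Ordering::SeqCst);
    {
        let _c = Connection { name: String::from("example") };
        // _c owns the Connection until the closing brace below.
    } // Rust calls drop here, with no explicit call from us.
    CLOSED.load(Ordering::SeqCst) - before
}

fn main() {
    println!("dropped: {}", open_and_close());
}
```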
Rust has several ways it represents errors
that might occur.
Let's explore…
The Option<T>
type is a Rust enum
(like a C union
, but safe) that can be used to represent a value that might be missing.
It can be either Some(t)
representing a value, or None
representing a missing value: exactly like Haskell's Maybe
.
An Option
value can be used to represent a possibly-missing value (like the return value of find_elt
in exercise 8).
We can use methods like .is_some()
and .is_none()
to check its state, or more beautifully, pattern-match it:
let v1: Vec<i32> = Vec::from([4, 5, 2, 8, 7, 3, 1]); let pos: Option<usize> = find_elt(&v1, 8); match pos { Some(p) => { // pos: Option<usize> and p: usize println!("found at position {}", p); } None => { println!("it's not there"); } }
There are also a lot of convenience functions on Option
. e.g. .map
that lets us work with Option
values like the Haskell Maybe
monad: chain calculations, but stop if there's failure.
let pos = find_elt(&v1, 8);
let next_pos = pos.map(|p| p + 1);
println!("{:?} {:?}", pos, next_pos);
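To try these out without the exercise code, here is a self-contained sketch. The find_elt below is a hypothetical stand-in (the position of the first matching element), and describe is a helper invented for this example; the exercise's own version may differ.

```rust
// Hypothetical stand-in for find_elt: position of the first match, if any.
fn find_elt(values: &Vec<i32>, target: i32) -> Option<usize> {
    for (i, v) in values.iter().enumerate() {
        if *v == target {
            return Some(i);
        }
    }
    None
}

// Pattern-match on the Option to handle both cases.
fn describe(pos: Option<usize>) -> String {
    match pos {
        Some(p) => format!("found at position {}", p),
        None => String::from("it's not there"),
    }
}

fn main() {
    let v1 = Vec::from([4, 5, 2, 8, 7, 3, 1]);
    let found = find_elt(&v1, 8);
    let missing = find_elt(&v1, 99);
    println!("{}", describe(found));
    println!("{}", describe(missing));
    // .map only touches the Some case; a None stays None.
    println!("{:?} {:?}", found.map(|p| p + 1), missing.map(|p| p + 1));
}
```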
But Option
doesn't feel like an error
value: just a value that deliberately might be missing.
The Result<T, E>
type represents the result of an operation that we don't expect to fail, but might. If it succeeds, we get an Ok(T)
, and if not, an Err(E)
.
It's probably best to think of Option
like a way to represent a null value, and Result
as a way to do something like an exception. But they are structurally similar.
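A small sketch of a Result using only the standard library: parsing a string to a number usually works, but can fail. The function name parse_age is made up for this example.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parsing can fail, so the return type says so.
fn parse_age(text: &str) -> Result<u8, ParseIntError> {
    text.trim().parse::<u8>()
}

fn main() {
    for input in ["42", "forty-two", "300"] {
        // Same structure as matching an Option, but the failure carries a value.
        match parse_age(input) {
            Ok(age) => println!("{:?} parsed as {}", input, age),
            Err(e) => println!("{:?} failed: {}", input, e),
        }
    }
}
```

The structural similarity to Option shows up in the match: two variants, but here the failure case carries an error value describing what went wrong.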
As an example of somewhere where many errors are possible, let's make an HTTP request with the reqwest library. First, we need to depend on it in Cargo.toml
:
[dependencies]
reqwest = { version = "0.11.18", features = ["blocking"] }
Or in the shell:
cargo add reqwest -F blocking
Then we see that the get
function returns a Result<Response, reqwest::Error>
: it might fail if the server is down, etc.
let res = reqwest::blocking::get("https://ggbaker.ca/");
Again, we can pattern-match our way out of it:
let response: reqwest::blocking::Response;
match res {
    Err(e) => {
        println!("Request failed: {}", e);
        return;
    }
    Ok(r) => {
        response = r;
    }
}
let status = response.status();
println!("Status code {}.", status);
We still might have received a response but got a 404 not found
. If we want the page contents, there's another possible failure: Response.text()
returns a Result<String, reqwest::Error>
. We could pattern-match that, but I'm going to:
if status == reqwest::StatusCode::OK {
    // 200 OK, so assume the body is there
    println!("Length: {}.", response.text().unwrap().len());
}
The .unwrap()
method will return the T
from an Ok(T)
, but panic (i.e. the program crashes) if we have an Err
.
It can be used in a case like this where we have checked the result in some other way.
Handling errors in Rust can get tedious.
The Result
values must be used: there's a compiler warning if we don't actually check their values. The if res.is_ok()
or pattern match gets verbose quickly.
But there's a common situation made easy: if we're in a function that returns a Result
, we might want to just pass the Err
up and let the caller handle it.
The ?
operator can be used on any Result
. The semantics: if Err
, return the Err
immediately; else .unwrap()
and continue.
So fetching a URL body can become (with a few use
to avoid fully-qualified ::
everywhere):
use reqwest::blocking::get;
use reqwest::Error as ReqwestError;

fn url_fetch(url: String) -> Result<String, ReqwestError> {
    let response = get(url)?;
    let text = response.text()?;
    Ok(text)
}
… but now whoever calls url_fetch
has to deal with possible errors.
Or even:
fn url_fetch_2(url: String) -> Result<String, ReqwestError> {
    Ok(get(url)?.text()?)
}
The ?
is subtle and is easy to miss when reading code: keep your eyes open.
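To see what ? is saving us, here is a sketch with no network dependency: the same function written twice, once with explicit matches and once with ?. The function names are made up for this example, and the match version is only approximately what ? does (the real operator can also convert the error type with From).

```rust
use std::num::ParseIntError;

// Written out by hand: return the Err early, otherwise take the value.
fn add_strs_match(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x = match a.parse::<i64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i64>() {
        Ok(v) => v,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

// The same logic with the ? operator.
fn add_strs(a: &str, b: &str) -> Result<i64, ParseIntError> {
    let x: i64 = a.parse()?;
    let y: i64 = b.parse()?;
    Ok(x + y)
}

fn main() {
    println!("{:?}", add_strs("2", "3"));
    println!("{:?}", add_strs("2", "three"));
}
```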
One final expression of error
in Rust: panicking.
In truly-unexpected situations, Rust code can panic. If things are really unrecoverable, panicking causes the program to crash and exit immediately.
I said earlier that .unwrap()
panics if the Result
is not Ok
. This is tempting, but is not the way to deal with Result
s:
let url = "https://ggbaker.ca/";
let response = get(url).unwrap();
let body = response.text().unwrap();
You don't know that those will both succeed.
But you may be in a situation where you know more than the compiler about your results. In particular, maybe you have already checked a condition that you know is synonymous with .is_ok()
.
This is perfectly reasonable:
let text = response.text();
if text.is_ok() {
    println!("Length: {}.", text.unwrap().len());
}
Or maybe a conversion where you know it will succeed (a string .parse
here):
let number: i64;
if user_input.len() < 7 && is_all_digits(&user_input) {
    number = user_input.parse().unwrap();
} else {
    println!("Bad user. You get nothing.");
    number = 0;
}
Basically, your code shouldn't panic in any reasonable circumstance.
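A runnable sketch of the .parse case. Both is_all_digits and parse_small_number are hypothetical helpers written for this example. The point is that the check has to really guarantee success: an empty string passes a naive "every character is a digit" test but still fails to parse, so the helper rejects it explicitly.

```rust
// Hypothetical helper: true only for non-empty strings of ASCII digits.
fn is_all_digits(s: &str) -> bool {
    // Without the is_empty check, "" would pass (vacuously) and then
    // .parse().unwrap() below would panic.
    !s.is_empty() && s.chars().all(|c| c.is_ascii_digit())
}

// Returns the parsed number, or 0 for input we refuse to handle.
fn parse_small_number(user_input: &str) -> i64 {
    if user_input.len() < 7 && is_all_digits(user_input) {
        // At most 6 digits: this always fits in an i64, so unwrap is safe.
        user_input.parse().unwrap()
    } else {
        0
    }
}

fn main() {
    println!("{}", parse_small_number("12345"));
    println!("{}", parse_small_number("12a45"));
    println!("{}", parse_small_number(""));
}
```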
The ways we see errors
in Rust:
Option<T>
: not really an error, but a possibly-missing T
. Some(T)
or None
Result<T, E>
: usually Ok(T)
, but an exception-like failure possible with an Err(E)
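Panic: a truly unexpected, unrecoverable failure. The program crashes and exits, and the panic can't be caught. *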
[* almost]
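One plausible reading of the "almost" (an assumption on my part, not something stated above): the standard library has std::panic::catch_unwind, which can stop an unwinding panic at a boundary. It is not meant as a general try/catch, and it does nothing if the program is built to abort on panic. A minimal sketch, with hypothetical function names:

```rust
use std::panic;

// Indexing out of bounds panics.
fn risky_index(values: Vec<i32>, i: usize) -> i32 {
    values[i]
}

// catch_unwind turns a panic inside the closure into an Err;
// .ok() then throws away the panic payload and leaves an Option.
fn try_index(values: Vec<i32>, i: usize) -> Option<i32> {
    panic::catch_unwind(move || risky_index(values, i)).ok()
}

fn main() {
    println!("{:?}", try_index(vec![1, 2, 3], 1));
    // The panic message is still printed to stderr, but we keep running.
    println!("{:?}", try_index(vec![1, 2, 3], 10));
}
```

For ordinary, expected failures, Result is still the right tool.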