I need some WD-40

Around 2 weeks ago I started learning Rust. I attended an intensive 3-day John De Goes Rust workshop, and it left me itching to learn more. With most of my experience coming in the form of backend programming on the JVM, functional programming with Scala, and webdev in TypeScript, this was my first real experience with a systems-level programming language.

And, well, it's gone by quickly. I looked at my NextJS personal website and felt compelled to "Rewrite it in Rust", and the result is the site you're visiting now.

And so I got my hands dirty with the Leptos Web Framework, which we'll get into later.

There's a lot to cover, so let's get into it.

Rustacean evolution

What I Like About Rust (so far...)

Expression-based thinking


Understanding Expressions Vs. Statements

Diving into the mechanics of Rust, one of the first things you'll notice is its strong emphasis on expressions rather than statements, a clear distinction from many other languages. But what exactly does that mean, and why should you care?

Expressions are best likened to LEGO blocks. Each piece, or in this case, a chunk of code, has its own unique value. Just as LEGO blocks come together to create a spaceship, castle, or whatever your imagination desires, expressions combine and interact to form more complex values.

Conversely, statements could be seen as the action of placing a LEGO block in a specific spot. It's a crucial action to build your model, but it doesn't constitute a structure on its own. In code, statements perform an action like assigning a value or printing a message, but they don't yield a value in themselves.

Expression Orientation in Java Vs. Rust

To put this into perspective, let's take a look at Java. In Java, 'if' statements are, well, just statements. Here's an example:

public void statements() {
    int x;
    if (true) {
        x = 3;
    } else {
        x = 10;
    }
    System.out.println(x);
}

In this code snippet, the 'if' statement modifies the state of a variable based on a condition but does not produce any value in and of itself. The value is then printed to the console, which also doesn't produce any value.

Now, let's pivot to Rust, an expression-oriented language. Most constructs in Rust - excluding declarations - are expressions: blocks, ifs, matches, loops, and function bodies all evaluate to a value.

If expressions behave like ternary operators in other languages, meaning they evaluate to the value of whichever branch is executed.

let x: i32 = if true { 3 } else { 10 };

Block expressions establish new scopes and evaluate to the last expression in the block. This lets you create clear demarcations without the need for helper functions:

let x = 0;
let y: i32 = {
    // This variable 'x' shadows and takes precedence over the outer 'x'.
    let x = 3;
    x + 1
};
assert_eq!(y, 4);

If the last expression in a block ends with a semicolon, the block evaluates to the unit type, ():

let unit_block: () = {
    let number = 0;
    println!("The number is {}", number);
    number;
};

Loop expressions yield the value they 'break' with.

let z: i32 = {
    let mut i = 0;
    loop {
        i += 1;
        if i == 10 {
            break 42;
        }
    }
};

Functions also evaluate to a value. If the return keyword is omitted, the last expression in the function body is the return value.


// Implicit return.
fn add_one(x: i32) -> i32 {
    x + 1
}

Here's a function that combines many different types of expressions to calculate the value of a wallet:

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter,
}

enum Bill {
    Washington,
    Jefferson { year: u32 },
    Lincoln,
    Hamilton,
    Jackson,
    Benjy,
}

struct Wallet {
    coins: Vec<Coin>,
    bills: Vec<Bill>,
}

// Returns the value of the wallet in cents.
fn wallet_value(wallet: &Wallet) -> u32 {
    let cents = {
        let mut cents = 0;
        for coin in wallet.coins.iter() {
            cents += match coin {
                Coin::Penny => 1,
                Coin::Nickel => 5,
                Coin::Dime => 10,
                Coin::Quarter => 25,
            }
        }
        cents
    };

    let dollars: u32 = wallet
        .bills
        .iter()
        .map(|bill| match bill {
            Bill::Washington => 1,
            Bill::Jefferson { year } => {
                // This is a rare bill!
                if *year < 1900 {
                    100
                } else {
                    2
                }
            }
            Bill::Lincoln => 5,
            Bill::Hamilton => 10,
            Bill::Jackson => 20,
            Bill::Benjy => 100,
        })
        .sum();

    cents + dollars * 100
}

See how composable these expressions are? wallet_value combines blocks, matches, an if expression, an iterator sum, and a for loop to calculate the value. You can "follow" each branch neatly to see the sequential flow of the program.

Rust's emphasis on expressions over statements is akin to valuing the whole LEGO model over the single step of placing a block. It's this philosophy that contributes to Rust's expressiveness and ergonomics, making it a powerful tool for developers.

Zero-Cost Abstractions


In Rust, most abstractions come with no runtime cost in execution speed or memory usage.

One shining example of this is the Iterator trait. Rust allows you to chain together multiple iterator methods to perform intricate transformations on data. Even with such high-level abstraction, the resultant code often matches, or even outperforms, the efficiency of manually written, low-level code.

let squares_of_evens: Vec<i32> = {
    (1..)
        .map(|x| x * x)
        .filter(|&x| x % 2 == 0)
        .take(10)
        .collect()
};

Despite its high-level nature, this code matches the performance of the manually written loop below:

let mut squares_of_evens = Vec::new();
for i in 1.. {
    let square = i * i;
    if square % 2 == 0 {
        squares_of_evens.push(square);
        if squares_of_evens.len() == 10 {
            break;
        }
    }
}

"Virtual-Free" Rust

A typical object-oriented programming concept is the "virtual table" (or vtable), a mechanism employed to support dynamic dispatch. It's how languages like JavaScript, Python, Java, and Scala handle method calls, deciding at runtime which specific version of a method to execute based on the object's actual type.

In contrast, Rust embraces a more efficient approach, sidestepping the need for vtables. It does this through static dispatch, resolving method calls at compile time, instead of waiting until runtime. This leads to faster, more memory-efficient code, as we're not paying the runtime cost of dynamically dispatching method calls.

Take the concept of "virtual methods" in Java using interfaces, for example. These are methods declared in an interface and subsequently implemented by any class using this interface. For instance, if you have an interface, Animal, with a method sound(), and classes Dog and Cat implementing this interface, you are using virtual methods.

public interface Animal {
    void sound();
}

public class Dog implements Animal {
    @Override
    public void sound() {
        System.out.println("Woof!");
    }
}

public class Cat implements Animal {
    @Override
    public void sound() {
        System.out.println("Meow!");
    }
}

public class Main {
    public static void main(String[] args) {
        Animal myDog = new Dog();
        Animal myCat = new Cat();

        myDog.sound(); // Prints "Woof!"
        myCat.sound(); // Prints "Meow!"
    }
}

When you call sound() on an Animal instance, the JVM uses a vtable to determine which version of the method to execute, based on the actual type of the object.

A vtable is essentially an array created by the JVM in memory for each class implementing an interface. Each array entry is a pointer to a method callable by an object of the class. The JVM uses this array to look up method addresses at runtime.

The vtable for our Animal interface might look something like this:

Object | sound() pointer
-------|------------------------
Dog    | address of Dog.sound()
Cat    | address of Cat.sound()

So, when we create a Dog object and call myDog.sound(), the JVM does the following:

  1. It accesses the Dog object's vtable in memory.

  2. It locates the sound() entry in the vtable and retrieves the corresponding pointer.

  3. It uses this pointer to navigate to the memory address where the Dog.sound() method is stored.

  4. Finally, it executes the method.

This process involves dereferencing pointers and introduces a runtime cost due to the extra steps needed to decide which method to execute.

In contrast, Rust sidesteps this process. It accomplishes polymorphism through the use of traits and type parameters, thus avoiding the need for a vtable and the accompanying dynamic dispatch. This concept is a cornerstone of what's known as "zero-cost abstractions" in Rust - writing high-level, readable code without the performance penalties commonly associated with such abstractions in other languages.
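
Here's a rough sketch of how the same Animal example might look in Rust, using a trait bound and a generic function so the method calls are resolved statically. (Rust does also offer opt-in dynamic dispatch via dyn Trait, but you choose when to pay for it.)

trait Animal {
    fn sound(&self) -> String;
}

struct Dog;
struct Cat;

impl Animal for Dog {
    fn sound(&self) -> String {
        "Woof!".to_string()
    }
}

impl Animal for Cat {
    fn sound(&self) -> String {
        "Meow!".to_string()
    }
}

// The compiler generates a separate copy of print_sound for each concrete
// type it is called with (monomorphization), so the call to sound() is
// resolved at compile time - no vtable lookup at runtime.
fn print_sound<A: Animal>(animal: &A) {
    println!("{}", animal.sound());
}

fn main() {
    print_sound(&Dog); // Prints "Woof!"
    print_sound(&Cat); // Prints "Meow!"
}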

Mutation in the Type System


Rust's type system embraces mutation as a first-class citizen.

For one, you have to declare variables as mutable with the mut keyword. This code would yield a helpful error, showing an illegal attempt to mutate a Vec that has not been declared as mutable.

let list = vec![1, 2, 3];
list.push(4);

error[E0596]: cannot borrow `list` as mutable, as it is not declared as mutable
--> 
|
52 |     list.push(4);
|     ^^^^^^^^^^^^ cannot borrow as mutable
|
help: consider changing this to be mutable
|
51 |     let mut list = vec![1, 2, 3];
|         +++

Now, when it comes to variables, for some type T you have:

  • T: You are the owner of the data.

  • &mut T: You have EXCLUSIVE write access to the data.

  • &T: You have SHARED read access to the data.

Consider a simple function in Rust that increments a count:

 fn increment_count(count: &mut i32) {
     let value = *count;
     *count = value + 1;
 }

 #[test]
 fn test_inc_count() {
     let mut count = 0;
     let count_ref = &mut count;
     increment_count(count_ref);
     assert_eq!(count, 1)
 }

In this example, the increment_count function takes a mutable reference to an integer (&mut i32). The mutable reference denotes that count can be modified within the function.

Rust forces you to indicate that a reference is mutable when you declare it. This applies to structs, enums, and function parameters. This explicitness provides a lot of safety and acts as forced documentation.
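
For instance, here's a small sketch (the Counter type is just an illustration, not from the post) showing the same rule applied to a struct method: mutation has to be visible in the method's receiver.

struct Counter {
    count: i32,
}

impl Counter {
    // Reading only needs a shared reference.
    fn current(&self) -> i32 {
        self.count
    }

    // Mutating requires an exclusive `&mut self` receiver.
    fn increment(&mut self) {
        self.count += 1;
    }
}

#[test]
fn test_counter() {
    // `counter` must be declared `mut` to call `increment`.
    let mut counter = Counter { count: 0 };
    counter.increment();
    assert_eq!(counter.current(), 1);
}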

Fine-grained control over mutability is something that Rust shares with functional languages like Scala. We can draw parallels to a similar function in Scala using a ZIO Ref, a concurrent mutable reference:

 object TestSpec extends ZIOSpecDefault:
     def incrementCount(countRef: Ref[Int]): UIO[Unit] =
         for
             value <- countRef.get
             _     <- countRef.set(value + 1)
         yield ()

     def spec = suite("incrementCount") {
         test("incrementCount should increment the value of the ref by 1") {
             for
                 ref   <- Ref.make(0)
                 _     <- incrementCount(ref)
                 value <- ref.get
             yield assertTrue(value == 1)
         }
     }

In the Scala example, incrementCount is an atomic operation using a ZIO Ref. A Ref allows for safe mutation in a concurrent context. Similarly to the mut keyword in Rust, when you see the Ref type, it indicates that mutation is bound to take place in the given scope.

Though Rust and Scala have different idioms and philosophies, both provide robust ways to control mutation and ensure safety.

A Ref's concurrent counterpart in Rust would look something like:

type Ref<T> = Arc<Mutex<T>>;
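
As a rough sketch (assuming Mutex as the interior lock), the incrementCount example translates to something like this:

use std::sync::{Arc, Mutex};

fn increment_count(count_ref: &Arc<Mutex<i32>>) {
    // Lock the mutex, mutate the value, and release the lock when the guard drops.
    let mut guard = count_ref.lock().unwrap();
    *guard += 1;
}

#[test]
fn test_increment_count() {
    let count_ref = Arc::new(Mutex::new(0));
    increment_count(&count_ref);
    assert_eq!(*count_ref.lock().unwrap(), 1);
}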

Errors as Values


When it comes to handling errors, Rust employs an intriguing and unique approach: it treats errors as values, rather than as control flow primitives. This paradigm is inspired by functional programming languages and leverages Rust's robust algebraic data types (ADTs) to create a powerful error handling system.

In many other programming languages, such as Java, Python, and C++, errors are usually handled with exceptions. An exception is a special kind of object created and thrown when an error occurs, effectively interrupting the normal flow of a program. Control is then transferred to the nearest exception handler in the call stack, which is designed to address the specific error.

Rust, on the other hand, opts for a different approach, treating errors as ordinary data that can be returned by functions and passed around. This approach is centered around the use of ADTs, specifically enums, to model potential error states.

There are two primary types used for error handling in Rust:

  • Option

    • A value that can exist (Some) or is missing (None)

    • Akin to a type-safe null

enum Option<T> {
    Some(T),
    None,
}

// Find the index of the word in the Vec.
fn find_word(words: Vec<&str>, target: &str) -> Option<usize> {
    words.iter().enumerate().find_map(
        |(index, &word)| {
            if word == target {
                Some(index)
            } else {
                None
            }
        },
    )
}

#[test]
fn find_word_find_cherry() {
    let words = vec!["apple", "banana", "cherry", "date"];

    let result = find_word(words, "cherry");

    assert_eq!(result, Some(2));
}

  • Result:

    • The result of a computation that may fail.

    • It can either be Ok(T) if the computation was successful, or Err(E) if it failed. E will contain information about what went wrong.

enum Result<T, E> {
    Ok(T),
    Err(E),
}

fn divide(numerator: f64, denominator: f64) -> Result<f64, String> {
    if denominator == 0.0 {
        Err("Cannot divide by zero".to_string())
    } else {
        Ok(numerator / denominator)
    }
}

#[test]
fn cannot_divide_by_zero() {
    let result = divide(5.0, 0.0);

    assert_eq!(result, Err("Cannot divide by zero".to_string()));
}

A key point of this design choice is to make error handling more explicit and deliberate. Unlike exception-based systems, where it can be easy to overlook handling an error, the Rust compiler enforces handling of Result and Option values. Note how their implementation is completely transparent and requires no custom compiler machinery to implement, thanks to zero-cost abstractions. This lets programmers model other expressive error types, such as Inclusive-Or errors, with the same efficiency as the error types in the standard library.

By using ADTs and pattern matching, Rust effectively turns error handling from a control flow problem into a data modeling problem.
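
To make that concrete, here's a small sketch (the ConfigError type is hypothetical, not from any library) of modeling a domain's failure modes as just another enum and handling them with pattern matching:

// A hypothetical error type for a config parser - just a plain enum.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Missing(String),
    Invalid { key: String, value: String },
}

fn parse_port(raw: Option<&str>) -> Result<u16, ConfigError> {
    match raw {
        None => Err(ConfigError::Missing("port".to_string())),
        Some(value) => value.parse().map_err(|_| ConfigError::Invalid {
            key: "port".to_string(),
            value: value.to_string(),
        }),
    }
}

#[test]
fn reports_invalid_port() {
    assert_eq!(parse_port(Some("8080")), Ok(8080));
    assert_eq!(
        parse_port(Some("not-a-number")),
        Err(ConfigError::Invalid {
            key: "port".to_string(),
            value: "not-a-number".to_string(),
        })
    );
}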

Performance of Error Handling

When we consider the performance implications of this design, Rust's Errors as Values approach also provides a noticeable advantage. Exception handling in traditional languages incurs a non-trivial runtime cost. When an exception is thrown, the runtime environment needs to unwind the stack until it finds a suitable exception handler.

In contrast, Rust's approach to handling errors incurs minimal performance penalty. This is because there's no need for stack unwinding or searching for exception handlers. Errors are returned just like any other value, and error handling is done via pattern matching, a compile-time mechanism. The "happy path" and the "error path" are treated uniformly in terms of performance.

  • Happy path: optimal, error-free, successful execution of a program.

  • Error path: execution of a program that encounters an error/exception.

This means that when a Result is evaluated, the code performs an additional check on its discriminant to determine whether it is Ok or Err before continuing.

A caveat is that if exceptions are used properly - only for truly exceptional circumstances - then the happy path will not incur any performance penalty. However, this is rarely the case in practice.

So if you have a program that uses exceptions, and the exceptions rarely occur, then exceptions are being put to good use. However, if exceptions are being thrown frequently, then your program could suffer an enormous performance penalty, which Rust opts to avoid.

TLDR: exceptions may allow the happy path to be faster, but at the expense of performance on the error path. There's no free lunch.

Hidden control flow

An understated advantage of Errors as Values over exceptions is predictability and the absence of "hidden" control flow paths. Exceptions can be thrown at many points in a program (and in some languages any value can be thrown), and it can be non-obvious where they should be handled. On the other hand, functions in Rust that can fail have their error types explicitly defined in their signatures, making it clear what errors need to be handled.
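
Rust's ? operator gives you concise propagation without hiding the possibility of failure: an Err returns early, but that possibility is still declared in the signature. A quick sketch, reusing the divide function from above:

// `?` returns early with the Err if division fails; the signature still
// tells the caller exactly what can go wrong.
fn half_of_quotient(numerator: f64, denominator: f64) -> Result<f64, String> {
    let quotient = divide(numerator, denominator)?;
    Ok(quotient / 2.0)
}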

I wanna go fast


anyone else?

So we've all heard that Rust is fast; here are some of the reasons why.

Zero-Cost Abstractions

The cornerstone of Rust's performance lies in its mantra of "zero-cost abstractions." The concept is simple: abstractions, which allow us to write clear and concise code, should not come at the cost of runtime performance.

Ownership and Borrowing: The Ultimate Garbage Collector

Garbage collection (GC) is a double-edged sword. On the one hand, it frees developers from manual memory management, preventing a whole class of bugs. On the other hand, GC comes with an overhead, and can introduce unpredictable pauses in a running program.

Rust's unique system of ownership and borrowing eliminates the need for a garbage collector altogether, while still providing the safety guarantees that a GC would. With Rust, memory is managed through a system of ownership with a set of rules that the compiler checks at compile-time. No garbage collector is needed.

By default, objects are stack-allocated, which generally is more efficient than heap allocation. However, Rust also allows explicit heap allocation using constructs like Box. This fine-grained control over memory management allows Rust programs to be incredibly efficient, minimizing runtime overhead and maximizing speed.
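
A small sketch of the difference (the Point type here is just for illustration):

struct Point {
    x: f64,
    y: f64,
}

fn main() {
    // Stack-allocated by default.
    let on_stack = Point { x: 1.0, y: 2.0 };

    // Explicitly heap-allocated with Box; freed automatically when it
    // goes out of scope - no garbage collector involved.
    let on_heap: Box<Point> = Box::new(Point { x: 3.0, y: 4.0 });

    println!("({}, {}) and ({}, {})", on_stack.x, on_stack.y, on_heap.x, on_heap.y);
}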

Lightweight Concurrency with Async/Await and Tokio

Multithreading and concurrency are critical for modern applications, but managing threads safely is notoriously difficult. Rust's async/await syntax and the Tokio runtime bring the benefits of asynchronous programming to Rust without the usual pitfalls.

Async/await in Rust is a zero-cost abstraction. Unlike in other languages, where async/await can add significant overhead, in Rust the generated code is as efficient as hand-written state machines.

The Tokio runtime further enhances Rust's concurrency story. It's a non-blocking I/O platform for writing asynchronous applications, with a focus on simplicity, speed, and reliability. Tokio makes use of core threads and blocking threads.

As per the docs:

Tokio provides two kinds of threads: Core threads and blocking threads

The core threads are where all asynchronous code runs, and Tokio will by default spawn one for each CPU core

The blocking threads are spawned on demand, can be used to run blocking code that would otherwise block other tasks from running

Together, async/await and Tokio make it possible to write high-performance concurrent code that is still safe and easy to understand.
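
As a minimal sketch (assuming the tokio crate with its full feature set as a dependency), here's roughly what those pieces look like together:

use tokio::time::{sleep, Duration};

async fn fetch_value(id: u32) -> u32 {
    // Simulate a non-blocking wait (e.g. a network call).
    sleep(Duration::from_millis(10)).await;
    id * 2
}

#[tokio::main]
async fn main() {
    // Run two tasks concurrently on Tokio's core threads.
    let (a, b) = tokio::join!(fetch_value(1), fetch_value(2));

    // CPU-heavy or blocking work can be moved to a blocking thread.
    let sum = tokio::task::spawn_blocking(move || a + b).await.unwrap();

    println!("sum = {}", sum);
}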

Power of Optimized Builds

In Rust, you typically develop in "debug" mode, where the compiler prioritizes compilation speed and debug information. However, when you're ready to release your application, you switch to "release" mode, where the compiler takes more time to apply optimizations that make your code run faster.

The difference between the two can be astonishing. Optimized --release builds are often 10 to 100 times faster than debug builds.

Oxidation underway


Like a metal refined by the elements, my journey into Rust has ignited a transformative process. Exploring Rust's expression-based thinking, zero-cost abstractions, and the way mutation lives in its type system has deepened my understanding of how computers and programs truly operate. Rust has been designed in the face of past grievances (e.g. with C, C++, and Java) and with future challenges in mind.

I'm sure I'll write about Rust's downsides at some point, and I want to share my experience building this website with the Leptos Web Framework. See ya then.
