Dynamic Analysis

The Rust programming language does not prevent you from writing invalid code; it just makes it a lot harder. By default, code is subject to the borrow checker, which ensures that it is memory-safe. However, sometimes you want to open a little hatch and take on the burden of validating correctness yourself: unsafe code.

A typical Rust program does not use a lot of unsafe code. Crates that use it are the exception, and when they do, it tends to be confined to small, contained places. Rust does not eliminate the ability to shoot yourself in the foot; it just forces you to be intentional about it. In languages like C or C++, any piece of code is effectively an unsafe block. It’s like the wild west.

Sometimes you want to check whether the unsafe code you have written is in fact valid. This can be tricky, because the thing you are trying to catch is undefined behaviour. For example, reading one byte past the end of an array will not necessarily crash your program; instead, you might just read garbage.
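As a minimal sketch of what "valid" means here, consider a hypothetical helper, element_at (the name is an illustration, not taken from any library), that skips the slice's built-in bounds check and therefore has to justify the access itself:

```rust
/// Returns the byte at `index`, or `None` if `index` is out of bounds.
/// `element_at` is a hypothetical helper used only for illustration.
pub fn element_at(data: &[u8], index: usize) -> Option<u8> {
    if index < data.len() {
        // SAFETY: `index` was checked against `data.len()` just above,
        // so the unchecked access stays inside the slice.
        Some(unsafe { *data.get_unchecked(index) })
    } else {
        // Without the check above, `index == data.len()` would read one
        // byte past the end. That is undefined behaviour, and it may
        // silently return garbage instead of crashing.
        None
    }
}
```

The SAFETY comment records the argument for why the unsafe block is valid. Dynamic analysis tools check whether such an argument actually holds on the executions they observe.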

One solution here is to use dynamic analysis, where your program is run in a special way (instrumented or emulated) and a higher-level tool validates every action your program takes. If your program performs any of the following operations, you get an error and a description of what it did wrong:

  • Read uninitialized memory
  • Read past memory allocation/stack
  • Write past memory allocation/stack
  • Free memory that is already freed (double free)
  • Forget to free memory (memory leak; this is not undefined behaviour, but it is usually unintended)

The idea with these tools is that you can enable them when running unit tests, and they will monitor what your code does and give you a diagnostic error when it does anything invalid. Triggering undefined behaviour is quite dangerous: your program can break when you switch compilers, or it might only appear to work because your CPU happens to tolerate certain things. For example, x86 CPUs will let you perform unaligned reads, but other platforms might not, so code that performs them can break when it is ported.
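The unaligned-read case can be made concrete with a short sketch. The function name read_u32_at is hypothetical; std::ptr::read_unaligned is the standard library function for reading through a pointer that may not be suitably aligned:

```rust
use std::ptr;

/// Reads a native-endian `u32` starting at byte `offset` of `bytes`.
/// Returns `None` if fewer than four bytes are available there.
/// `read_u32_at` is a hypothetical helper used only for illustration.
pub fn read_u32_at(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    if end > bytes.len() {
        return None;
    }
    // SAFETY: `offset..offset + 4` is in bounds (checked above), and
    // `read_unaligned` places no alignment requirement on the pointer.
    // Using `ptr::read` here instead would be undefined behaviour
    // whenever the address is not a multiple of four, even on CPUs
    // that happen to tolerate the resulting load.
    Some(unsafe { ptr::read_unaligned(bytes.as_ptr().add(offset).cast::<u32>()) })
}
```

For this particular task, the safe u32::from_ne_bytes on a four-byte array does the same job without any unsafe code, which is usually the better choice.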

For Rust, due to the protections the language offers, we usually don’t have large amounts of undefined behaviour in the first place, so these tools are not usually needed.

There is one tool that is particularly suited to detecting invalid operations in Rust code, and that is Miri.

Miri

Miri is a tool that lets you find undefined behaviour in Rust programs. It works by acting as an interpreter for Rust’s mid-level intermediate representation (MIR), which is used by the compiler internally. In some ways, it is similar to Valgrind: both run your program in a controlled environment that checks every operation, but Valgrind works on compiled machine code while Miri interprets MIR. The advantage of using Miri over Valgrind is that MIR retains a lot of semantic information, which means you get much better diagnostic messages. It has the same downside as Valgrind, in that it makes your program’s execution very slow.
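A typical way to use Miri is to run an existing test suite under it. The sketch below assumes a nightly toolchain with the miri component installed; the function split_halves and its test are hypothetical examples, not part of any library:

```rust
// Setup and invocation, assuming a nightly toolchain is available:
//   rustup +nightly component add miri
//   cargo +nightly miri test

/// Splits a mutable slice into two non-overlapping halves using raw
/// pointers. Hypothetical example of unsafe code worth checking.
pub fn split_halves(data: &mut [i32]) -> (&mut [i32], &mut [i32]) {
    let len = data.len();
    let mid = len / 2;
    let ptr = data.as_mut_ptr();
    // SAFETY: `[0, mid)` and `[mid, len)` are both in bounds and do not
    // overlap, so the two mutable slices never alias each other.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn halves_can_be_written_independently() {
        let mut values = [1, 2, 3, 4, 5];
        let (left, right) = split_halves(&mut values);
        left[0] = 10;
        right[0] = 30;
        assert_eq!(values, [10, 2, 30, 4, 5]);
    }
}
```

If the second slice were built with an off-by-one length, the test could still pass in a normal run, whereas Miri is designed to report that kind of out-of-bounds or aliasing violation. In practice, the safe split_at_mut method on slices covers this use case.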

Cargo Careful

Valgrind

Valgrind lets you run your program in a kind of virtual machine, where all memory access is monitored. It is quite powerful: it even incorporates a model of how CPU caches work, so you can check how good the memory locality of your program is. It can also report how many instructions your program took to run, which is more useful for microbenchmarks than wall-clock time, because it is stable between machines (but not architectures). The price of the virtualisation is a significant runtime overhead.

LLVM Sanitizers

The LLVM sanitizers (AddressSanitizer, ThreadSanitizer, UndefinedBehaviorSanitizer, LeakSanitizer) need to be enabled at compile time. They instrument your binary with extra code on every memory access or operation, depending on the kind of sanitizer. The added code has a runtime overhead, which can be substantial for some sanitizers. They can detect some things that go beyond what Valgrind can detect; for example, AddressSanitizer can catch out-of-bounds accesses to stack and global variables.
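As a rough sketch of how this looks for Rust: at the time of writing, sanitizer support is a nightly-only compiler feature enabled through the -Zsanitizer flag, and the exact invocation in the comment below (including the target triple) is an assumption that may differ on your system. The function sum_via_raw is a hypothetical example of manual ownership handling that a sanitizer can check:

```rust
// Possible invocation on a nightly toolchain (assumption; adjust the
// target triple to your platform):
//   RUSTFLAGS="-Zsanitizer=address" cargo +nightly test \
//       -Zbuild-std --target x86_64-unknown-linux-gnu

/// Sums the values after moving them through a raw pointer.
/// Hypothetical example of manually managed ownership.
pub fn sum_via_raw(values: Vec<u32>) -> u64 {
    // Hand ownership of the heap allocation over to a raw pointer.
    let raw: *mut [u32] = Box::into_raw(values.into_boxed_slice());

    // SAFETY: `raw` came from `Box::into_raw` and has not been freed.
    let slice: &[u32] = unsafe { &*raw };
    let total = slice.iter().map(|&v| u64::from(v)).sum::<u64>();

    // SAFETY: ownership is reclaimed exactly once, after the last use
    // of `slice`. Omitting this line would leak the allocation, and
    // repeating it would be a double free.
    drop(unsafe { Box::from_raw(raw) });

    total
}
```

Both mistakes described in the final comment compile without complaint, which is why a run-time check is useful for this kind of code.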

Address Sanitizer

Memory Sanitizer

Undefined Behaviour Sanitizer

Reading

Data-driven performance optimization with Rust and Miri by Keaton Brandt

Keaton shows how you can use Miri to get detailed profiling information from Rust programs, visualize it in Chrome developer tools, and use it to optimize your program’s execution time.

Unsafe Rust and Miri by Ralf Jung

In this talk, Ralf explains key concepts around writing unsafe code, such as what “undefined behaviour” and “unsoundness” mean, and explains how to write unsafe code in a systematic way that reduces the chance of getting it wrong.

C++ Safety, in context by Herb Sutter

In this article, Herb Sutter discusses the safety issues C++ has. While this is not directly relevant to Rust, he makes a good point: there is good tooling to catch a lot of issues (sanitizers, for example), and it should be more widely used, even by projects written in languages that are safer by design, such as Rust. While some consider C++ to be defective, with the right tooling a majority of issues can be caught.

The Soundness Pledge by Raph Levien

Raph talks about the use of unsafe in Rust. Many developers consider using it to be bad style, but he argues that the problem is not unsafe code, it is unsound code. As a community, we should strive to eliminate unsound code. This includes using tools like Miri to ensure soundness.