
Memory Safety in Rust

Ben Brosgol at AdaCore

Informally, memory safety in a program means that each data access is well behaved, i.e., is consistent with the item’s data type, does not impinge on any storage locations beyond the data value’s boundaries, and, if the program is multithreaded, does not produce an inconsistent or corrupted value. Memory safety violations result in undefined behavior, a Bad Thing: instances like the notorious “buffer overrun” can turn an otherwise safe and secure program into a ticking virtual time bomb. Some of the most consequential malware attacks in recent years are due to memory safety violations, and the topic has moved from techno-geek subreddits into mainstream discourse. Anyone developing or acquiring software, especially for applications with high assurance requirements, must pay attention to memory safety.

When it comes to memory safety, the programming language can be part of the problem or part of the solution.

Memory safety violations can come from a variety of programming language constructs. In this article, we’ll show whether and how Rust meets the challenges (spoiler alert: it does, and the secret sauce is a compiler feature known as the borrow checker).

Storage overlaying

Allowing the same bit pattern to be interpreted as different types (e.g., an integer and a pointer) is a fundamental insecurity. High-level languages provide compile-time type checking as a defense but also recognize the practical need for a feature that offers storage overlaying functionality. C does this through unions, leaving it up to the programmer to avoid misuse. In other languages, a variant record mechanism supports storage overlaying but with a run-time check to ensure that field access is consistent with the variant established when the value was constructed.

Rust’s enum type mechanism takes the latter approach. An enum value is represented by a special field (the “tag”) reflecting which variant is present,  together with sufficient storage to hold the largest variant. A “match” or “if let” statement, in effect, performs the check for accessing the variant.

Here’s an example: an enum with two variants, a Ptr holding a pointer to an i32 value (Box<i32>) and an Int containing an integer (an i32):

    enum E {
        Ptr(Box<i32>),
        Int(i32),
    }

    let mut e: E;
    e = E::Int(100);

    match e {
        E::Ptr(p) => println!("{}", *p),
        E::Int(n) => println!("{n}"),
        … // other code
    }

Since a reference to data in one of the variants requires syntax identifying the variant (as shown in the match statement), there is no way to interpret the value of one variant as though it were of the type for data in another variant. If it were possible to view the contents of e as a Ptr even though it had been created as an Int, memory safety would be sacrificed (e.g., dereferencing the pointer might provide access to restricted data). It would be likewise problematic if a Ptr could be interpreted as an Int (e.g., modifying the pointer value through arithmetic operators and then reinterpreting the result as a pointer). Rust prevents such insecurities; enums are memory-safe.  (Mutating an enum within the match statement does not open a loophole: the value of a variant’s data (p or n in the example) is bound when the match statement is executed and is still valid, even if made obsolete by the mutation.)
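The “if let” form mentioned above performs the same tag check as match. The following minimal sketch (the as_int() helper is our own illustration, not part of the article’s example) shows that a Ptr value can never be read as an Int:

```rust
enum E {
    Ptr(Box<i32>),
    Int(i32),
}

// Hypothetical helper: returns the integer only if e currently
// holds the Int variant; a Ptr value can never be read as an Int.
fn as_int(e: &E) -> Option<i32> {
    if let E::Int(n) = e {
        Some(*n)
    } else {
        None
    }
}

fn main() {
    assert_eq!(as_int(&E::Int(100)), Some(100));
    assert_eq!(as_int(&E::Ptr(Box::new(7))), None);
}
```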

Data initialization

Reading the value of a variable before the variable has been initialized is a fundamental vulnerability, with potential effects ranging from erroneous computations to memory safety violations. Programming language solutions include ignoring the problem (“trust the programmer, or hope for an effective static analysis tool”), providing default initial values for certain types (e.g., all scalar data are initialized to zero), enforcing initialization with run-time checks, and enforcing initialization with compile-time checks.

With the joint goals of safety and efficiency, Rust’s choice will not be surprising: compile-time checks ensure that all values are initialized before being referenced. Rust does not incur the inefficiency of run-time checks, and it avoids the problem that reading a default-initialized variable may be an oversight (bug) rather than the programmer’s intent.

Rust’s rules are conservative; for example, a compile-time check will flag the selection of an array or vector element unless the entire vector or array is known to be initialized. On the other hand, the rules are also flexible enough to reduce false positives, using control-flow analysis to detect initializations performed on the arms of conditional statements.  Here’s an example:

fn foo(b: bool) {
    let x: String; // uninitialized
    if b {
        x = "True".to_string();
    } else {
        x = "False".to_string();
    }
    println!("{x}"); // Legal, x is known to be initialized
}

Referencing uninitialized values is a dangerous vulnerability, but Rust provides a safe, efficient, and practical solution.

Indexing

Indexing an array entails a run-time check, which ensures that the index is within the bounds of the array, and likewise for vectors. Violating the check results in what Rust calls a “panic”: termination of the thread in which the index check failed (with unwinding of the call stack). If the check failure occurred within the main() function’s thread of control, then the entire program is terminated (including all spawned threads).

Index checks apply to slices; a slice bound resulting in an index outside the containing array or vector will trigger a panic. Indexing a slice will check that the index selects an element within the slice (considering that slices are always indexed starting at 0, regardless of the slice bounds).

Thus, although an index check failure is likely a program bug, it does not compromise memory safety. Storage locations outside the array or vector are not accessed, which means the confidentiality and integrity of memory locations outside the array or vector are preserved. As will be explained below, the semantics of panic ensure that no data structures are left in an inconsistent state.
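A minimal sketch of these checks (the variable names are our own); note that the non-panicking get() method is an alternative when an out-of-range index is a normal possibility:

```rust
fn main() {
    let v = vec![10, 20, 30, 40];

    // In-bounds indexing passes the run-time check.
    assert_eq!(v[2], 30);

    // A slice is indexed from 0 regardless of its position in v.
    let s = &v[1..3]; // elements 20 and 30
    assert_eq!(s[0], 20);

    // get() is the non-panicking alternative: out-of-range
    // access yields None instead of a panic.
    assert_eq!(v.get(10), None);

    // v[10] here would panic (index out of bounds), terminating
    // the thread without touching memory outside the vector.
}
```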

Integer overflow and wraparound

Although not directly violating memory safety, integer overflow[1] is a well-known weakness (CWE-190 in The MITRE Corporation’s Common Weakness Enumeration). It can lead to vulnerabilities when the program logic expects a correct numeric result but gets a wraparound value instead. Rust addresses this issue in several ways:

  • In debug compilation mode, overflow for each integer type is detected at run time and produces a panic. In release mode, the effect is wraparound. The idea is that the developer will verify the code sufficiently in debug mode to provide confidence that no overflows occur so release mode can produce more efficient code.
  • Each integer type provides methods that allow the program to explicitly control overflow behavior. The following are defined for addition; analogous methods are available for the other integer operations:
    • x.wrapping_add(y) always wraps
    • x.saturating_add(y) produces the maximum (or minimum) value for the type
    • x.overflowing_add(y) returns the value (possibly a wraparound) and a bool indicating whether overflow occurred
    • x.checked_add(y) returns an Option<T> for the type, with None reflecting an overflow
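A small sketch of these operations, using u8 for compactness (250 + 10 exceeds the u8 maximum of 255):

```rust
fn main() {
    let x: u8 = 250;
    let y: u8 = 10;

    // wrapping_add always wraps modulo 2^8: 260 mod 256 = 4.
    assert_eq!(x.wrapping_add(y), 4);

    // saturating_add clamps at the type's maximum.
    assert_eq!(x.saturating_add(y), 255);

    // overflowing_add returns the wrapped value plus an overflow flag.
    assert_eq!(x.overflowing_add(y), (4, true));

    // checked_add signals overflow with None.
    assert_eq!(x.checked_add(y), None);
    assert_eq!(x.checked_add(5), Some(255));
}
```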

Pointers

Pointers are arguably the language feature most susceptible to memory safety violations. Potential issues include storage overlaying (e.g., casting a void* in C), dangling references (freeing storage while it is still referenceable), dereferencing a “null” or uninitialized pointer, and freeing the same storage more than once. Although not strictly a memory safety issue, storage leakage (heap exhaustion) / fragmentation is a potential pitfall when a program uses dynamic allocation.

Rust addresses these issues without the need for garbage collection. In brief, Rust supports two kinds of pointers:

  • “Safe” pointers (not an official Rust term), which provide compile-time guarantees of memory safety but come with usage restrictions and 
  • Raw pointers, which provide C-like flexibility but sacrifice memory safety guarantees.

In the remainder of this section, we’ll explain how Rust’s safe pointers meet their memory safety goal. Unless indicated otherwise, for ease of exposition, the term “pointer” will be used to mean “safe pointer.” Raw pointers are described in the unsafe code section.

  • Each pointer value has a well-defined type, and Rust’s type checks prevent an analog to C’s casting a void* value.
  • Rust prevents accessing a dropped value. Although the programmer can implement the drop() method for a type to perform specific processing (for example, releasing resources that are managed manually) and can call the method explicitly, the borrow checker performs a conservative (sound) analysis to ensure that there are no subsequent accesses to the dropped value.  This eliminates one potential source of dangling references and “use after free” errors, and it prevents “double frees.”
  • Dereferencing the null value is not possible since pointers do not have a null value. In cases where the program needs to process a value that is either a valid pointer or else the analog of null, the enum Option<T> can be used, where T is the pointer type. The value will either be Some<T> or None, and Rust’s type checking prevents None (an Option<T> value) from being used as a T.
  • The compiler performs a conservative analysis to check that a pointer is initialized before being dereferenced. It is not possible to dereference an uninitialized pointer.
  • Rust makes memory management safe through a combination of ownership rules, reference types, and “smart” pointer types (such as the result of Box::new()).
    • By default, an allocated value has a unique owner. Assignment (as well as parameter passing and field initialization) transfers ownership of the value to the target pointer and treats the source pointer as uninitialized.
    • A value can be borrowed through the use of a reference type. This is often done in parameter declarations: the formal parameter gains temporary ownership of the referenced value during the function’s execution, and ownership is returned to the variable passed as an argument when the function returns.
    • If shared ownership is needed, the programmer can use reference counts (the Rc type or Arc if the value is being passed between threads), but with the proviso the access is immutable. This restriction ensures that the same value is not simultaneously accessed mutably and immutably.
  • Rust supports shared non-owning pointers to both declared and allocated values. A compile-time analysis (the “borrow checker”) enforces two memory safety principles:
    • Data consistency / no problematic aliasing. A reference to a value v may either be shared (denoted by &v, which allows reading from but not writing to v) or mutable (denoted by &mut v, which allows both reading from and writing to v). However, Rust prevents writing through a mutable reference while any other references to the same value are active. This exclusivity restriction is critical for ensuring that values shared across threads do not get corrupted (the principle is known as “Concurrent Read, Exclusive Write”). Rust also enforces the exclusivity restriction in sequential programs, and in so doing it eliminates potential aliasing issues and facilitates code optimization.
    • No dangling references. A reference is not allowed to outlive its referent. Examples such as the following are illegal:

let refx: &i32;
{
    let x = 100;
    refx = &x; // OK
    println!("{}", *refx); // OK
}
println!("{}", *refx); // Dangling reference

The lifetime of a variable extends only to its last usage (direct or indirect), and not necessarily to the end of the scope in which the variable is declared. In the absence of the second println!() statement, the code would be legal: the lifetime of refx would not extend beyond the lifetime of the local variable x.
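The exclusivity rule (“Concurrent Read, Exclusive Write”) and the last-use lifetime rule can both be seen in a minimal sketch (our own illustration); the commented-out lines show what the borrow checker rejects:

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    let r1 = &v; // shared borrow
    let r2 = &v; // multiple shared borrows may coexist
    assert_eq!(r1.len() + r2.len(), 6);
    // r1 and r2 are last used above, so their lifetimes end here.

    let m = &mut v; // exclusive borrow: legal only because no
                    // shared borrow is still active
    m.push(4);
    assert_eq!(v.len(), 4);

    // By contrast, this would be rejected at compile time:
    //     let r = &v;
    //     v.push(5);         // error: cannot borrow v as mutable
    //     println!("{}", r[0]);
}
```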

  • Rust’s lifetime checks apply to function parameters and results. To prevent a dangling reference, the value returned by a function must be sufficiently long-lived; e.g., it can’t be a reference to one of the function’s local variables. Further, the legality of a function invocation needs to be determined only by the function’s header and the lifetime properties of the arguments (information available at the point of call) and not depend on the function’s implementation. These principles shaped the Rust solution:
  • If a function takes one or more reference parameters and returns a reference result, then the function’s public interface needs to specify how the result’s lifetime relates to the lifetimes of the parameters.

For example, given the absence of lifetime annotations on the formal parameters, the implementation of the following function would have to support invocations with arguments that are references to values of arbitrary (e.g., local) lifetimes:

fn foo(x: &i32, y: &i32) -> &i32 {
    if *x + *y > 0 {
        x
    } else {
        &100
    }
}

However, since foo() can return x, a dangling reference would be produced by an invocation that passes a reference to a local variable as the first argument but assigns the result to a more global variable. Since the borrow checker does not look into the implementation of foo(), it would be unable to detect the error. The above version is, therefore, illegal.

To address this issue, the Rust compiler needs to know the relationship between the lifetimes of formal reference parameters and the returned reference result. A program can meet this requirement by making the function generic with respect to lifetime and by adding relevant lifetime annotations to the formal parameters and result. This is unlike anything you’ll find in other languages, so the notation will likely seem a bit foreign:

fn foo<'a>(x: &'a i32, y: &i32) -> &'a i32 {
    if *x + *y > 0 {
        x
    } else {
        &100
    }
}

The tick mark (') is used for a lifetime parameter to a generic; by convention, these parameters have short names 'a, 'b, 'c, etc. The annotations in foo() have the following semantics:

  • The same lifetime specified for x and the result means that the returned reference borrows from x’s referent: the result may not be used beyond the point where that referent’s lifetime ends
  • The absence of a lifetime annotation on y means that y can be of arbitrary lifetime, unrelated to the lifetime of x.

The restrictions on the uses of the parameters are checked within the body of the function. For example, returning y would be illegal since its lifetime is unknown.

The generic function is implicitly instantiated at each invocation. Any use of the result that produces a dangling reference will be detected by the borrow checker, which enforces the constraint that the result’s lifetime may not exceed that of the first argument’s referent.
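As a sketch (the invocation pattern is our own), the annotated foo() accepts a short-lived second argument, while the commented-out call shows the dangling-result case that the borrow checker rejects:

```rust
fn foo<'a>(x: &'a i32, y: &i32) -> &'a i32 {
    if *x + *y > 0 { x } else { &100 }
}

fn main() {
    let a = 5;
    let result;
    {
        let b = 1; // shorter-lived than a
        // Legal: only x's lifetime flows into the result,
        // so y may refer to the shorter-lived b.
        result = foo(&a, &b);
    }
    assert_eq!(*result, 5);

    // Swapping the arguments would be rejected: the result would
    // be tied to b's lifetime, which ends before the later use:
    //     result = foo(&b, &a); // error: b does not live long enough
}
```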

Rust also defines a special lifetime, 'static, for references to values that live throughout the entire execution of the program (string literals, or values declared as static). For example, with the following header, fum() can only be invoked with a reference to an i32 value with static lifetime:

fn fum(x: &'static i32)

  • Rust’s lifetime checks also apply to references stored within data structures. For example, the following struct definition is rejected:

struct S_unsafe {
    r_unsafe: &i32,
}

Rust complains that it expects a named lifetime parameter for the reference field r_unsafe, since it needs to ensure that, for any struct value v, the value referenced by field v.r_unsafe lives at least as long as v.

The struct definition can be corrected by making it generic with respect to lifetime:

struct S_safe<'a> {
    r_safe: &'a i32,
}

This constrains each S_safe instance to contain only references to values that live at least as long as the instance itself. Alternatively, an explicit lifetime may be specified for the field:

struct S_also_safe {
    r_also_safe: &'static i32,
}

  • Rust’s ownership and lifetime rules enable a simple automatic reclamation policy. For the default case (i.e., if reference counts are not used), the strategy can be summarized as follows.
  • Pointer assignment ptr2=ptr1 brings a change of ownership (assuming that the two variables do not point to the same value). The value that ptr2 points to is deallocated (“dropped”, in Rust parlance), the value of ptr1 is copied into ptr2, and ptr1 is then treated as uninitialized.
  • When a stack frame is popped, all heap values owned by local pointer variables are automatically dropped. (Recall that we are excluding raw pointers from this description.) Dropping a heap value that itself owns a heap value involves dropping the latter value, so leaving the scope of a locally declared pointer entails reclaiming the entire tree of heap values for which it is the root. (That raises potential issues in a program with real-time requirements since it complicates the computation of worst-case execution time, but that’s a subject for another blog.)

The above approach is generalized in the traditional way when reference counts are used (e.g., dropping a value when its reference count goes to zero).
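The move-then-drop behavior can be observed with a small sketch (the Noisy type and its drop counter are our own illustration): ownership transfer does not copy the heap value or drop it twice; the value is dropped exactly once, when its final owner leaves scope:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a Noisy value has been dropped.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Noisy;
impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    {
        let a = Box::new(Noisy); // a owns the heap value
        let b = a;               // ownership moves to b; a is now unusable
        assert_eq!(DROPS.load(Ordering::SeqCst), 0); // nothing dropped yet
        let _ = &b;
    } // b goes out of scope: the Noisy value is dropped exactly once
    assert_eq!(DROPS.load(Ordering::SeqCst), 1);
}
```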

Rust’s approach to automatic reclamation is not sufficient to prevent storage leakage. For example, a reference-counted data structure with cycles will not be freed, and, more directly, an infinite loop in which an element is appended onto a vector will eventually exhaust the heap (even automatic garbage collection would not help in the latter case). Static analysis may be useful in detecting such situations.

Functions and closures

Functions and closures are data values, and that raises a potential issue. If a function or closure foo() can reference a stack value from an outer scope, then assigning foo to a variable declared at a more global level than the referenced value, and subsequently invoking foo(), would produce a dangling reference. Rust avoids this problem in two ways:

  • Functions are not permitted to access stack values from outer scopes.
  • Closures are permitted to make such accesses, but they do so either by borrowing a reference to the outer value (either mutable or immutable) or else by moving ownership of the value into the closure. In the first case, a compile-time check will prevent the closure from being invoked in a scope that is more global than the borrowed value. In the second case, the transfer of ownership to the closure prevents the value from being dropped while the closure is still live.
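A minimal sketch of the two capture modes (the closures are our own illustration):

```rust
fn main() {
    let s = String::from("hello");

    // Borrowing capture: the closure holds &s, so it cannot
    // be invoked in a scope that outlives s.
    let print_it = || println!("{s}");
    print_it();

    // Moving capture: ownership of s transfers into the closure,
    // so the closure remains valid even if it escapes this scope.
    let own_it = move || s.len();
    assert_eq!(own_it(), 5);
    // s is no longer usable here: it was moved into own_it.
}
```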

Concurrency

A concurrent program (with either actual or simulated parallelism) presents several potential opportunities for violating memory safety:

  • Dangling reference: a thread’s lifetime exceeds that of a data value that it is accessing.
  • Data race / unprotected access. One thread is writing to a shared data value, while another thread is either reading from or writing to that data, leading to an inconsistent view (partially updated) or a corrupted value (simultaneously updated).
  • Data corruption / aborted update. A thread terminates (e.g., because a run-time check fails) while it is updating a non-local data value, leaving the value in an inconsistent state.

Rust’s concurrency features prevent dangling references and either prevent data races and data corruption or else provide syntax that makes it evident when the language-provided checks might not be sufficient.

  • Restrictions on references from threads

Creating a thread entails invoking the thread::spawn() function with an argument that is either a function or, more commonly, a closure. The new thread executes the argument’s code and continues to run (unless joined) even after the spawning scope has completed. Since Rust prohibits a function from making “up-level” references to local (stack) variables in outer scopes, a thread executing a function will not encounter any dangling references.

A closure can capture the values of local variables in enclosing scopes, but if it attempts to capture such values by reference (borrowing), the compiler will complain. A 'static lifetime bound constrains the argument to spawn(), so the closure will need to capture its environment through a move. This transfers ownership (for non-Copy types) into the closure, with the original values considered uninitialized. Again, dangling references are prevented. Other rules ensure that the closure is only invoked once, avoiding the memory safety error of attempting to move an already moved (and thus uninitialized) value.
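A minimal sketch (our own) of spawning a thread with a move closure:

```rust
use std::thread;

fn main() {
    let data = vec![1, 2, 3];

    // `move` transfers ownership of `data` into the closure, so
    // the spawned thread cannot hold a borrow that might dangle.
    let handle = thread::spawn(move || data.iter().sum::<i32>());

    // `data` is no longer accessible here: it was moved.
    let total = handle.join().unwrap();
    assert_eq!(total, 6);
}
```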

If a borrowed reference to non-local data is needed, the thread::scope() function can be used, which creates a virtual scope in which so-called scoped threads can be spawned. A scoped thread can borrow references from outer scopes, but an implicit join at the end of the virtual scope ensures that the thread does not outlive any references that it borrows.
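A small sketch (our own) using thread::scope() to borrow a local vector from two threads; the implicit join guarantees the borrows end before v is dropped:

```rust
use std::thread;

fn main() {
    let v = vec![1, 2, 3, 4];
    let mut left = 0;
    let mut right = 0;

    // Scoped threads may borrow v (and mutably borrow disjoint
    // locals); the implicit join at the end of the scope ensures
    // that the threads do not outlive the borrowed values.
    thread::scope(|s| {
        s.spawn(|| left = v[0] + v[1]);
        s.spawn(|| right = v[2] + v[3]);
    });

    assert_eq!(left + right, 10);
}
```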

  • Restrictions on static variables

Immutable static variables are thread-safe, but mutable static variables present potential issues:

  • If a thread (more strictly, the function or closure executed by a thread) assigns to the static variable a reference to a value local to the function or closure, this will be a dangling reference after the thread terminates.
  • If one thread writes to the static variable while another thread is either reading from or writing to the variable, the effect is a data race.

Standard lifetime checking will prevent the first problem. Rust addresses the second problem by requiring each access to a mutable static variable to be within a block that is marked as unsafe. It is then the programmer’s responsibility to ensure that each write access to the variable is exclusive (no simultaneous reads or writes).

  • Mutexes

The classical mutex construct has a dodgy reputation, and for good reason. In its typical incarnation, a mutex is separate from the data value that it is protecting, opening the door to various programming errors:

  • Failing to lock the mutex or releasing a lock on a mutex that the thread is not holding can lead to a data race.
  • Failing to unlock a mutex will deadlock any thread that is either queued on the lock or that subsequently attempts to lock the mutex.

Further, mutexes have a nasty interaction with exception handling. If a thread has a run-time error while holding a lock, one of two grim consequences will result, depending on the language semantics:

  • If the lock is released as part of the stack unwinding, the data value being protected may be in an inconsistent (partially updated) state.
  • If the lock is not released, then other threads attempting to lock the mutex will deadlock.

That’s how traditional mutexes work, but Rust is not traditional, and indeed, it avoids all the problems listed:

  • In Rust, a mutex serves as a wrapper for the data value it is protecting, and any access to the value must go through the mutex. A thread can only access the value after it has successfully locked the mutex.
  • Unlocking is automatic, at the end of the scope where the mutex was locked. There is no way to forget to unlock a mutex.
  • If a thread panics while holding a mutex, the mutex is considered to be “poisoned.” Attempting to lock a poisoned mutex returns an error value. There is no way to access the possibly inconsistent data value, and the thread attempting to lock the mutex can implement the appropriate antidotal logic to counteract the failure.

Mutexes are generally not the best way to deal with shared data. They don’t scale up, they often need to be used in conjunction with condition variables (which have their own methodological issues), and they may cause a deadlock if two threads attempt to lock different mutexes in different orders. But if you need to use them, Rust mutexes preserve memory safety.
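A minimal sketch (our own) of the wrapper pattern, using Arc for shared ownership of the mutex across threads; the lock is released automatically when the guard goes out of scope:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // The Mutex wraps the counter: the i32 can only be reached
    // by locking.
    let counter = Arc::new(Mutex::new(0));

    let mut handles = Vec::new();
    for _ in 0..4 {
        let c = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            let mut guard = c.lock().unwrap(); // blocks until acquired
            *guard += 1;
        })); // guard dropped at closure end: automatic unlock
    }
    for h in handles {
        h.join().unwrap();
    }

    assert_eq!(*counter.lock().unwrap(), 4);
}
```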

  • Channels

Rust’s channel facility supplies a “multiple producer single consumer” (MPSC) model for inter-thread communication; multiple producing threads can send typed data through a channel, and a consumer thread can receive the data, with control over whether or not to block if no data is present. The semantics of channels (constraints on the traits implemented by the data type, and the change of ownership when the value is transmitted and received) prevent dangling references and data races.
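A small sketch (our own) with two producers and one consumer; each value's ownership transfers through the channel:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();

    // Two producers: each gets its own clone of the sender.
    for id in 0..2 {
        let tx = tx.clone();
        thread::spawn(move || tx.send(id * 10).unwrap());
    }
    drop(tx); // close the original sender so rx sees end-of-stream

    // The single consumer receives the values; ownership of each
    // value moves through the channel, so there are no data races.
    let mut total = 0;
    for v in rx {
        total += v;
    }
    assert_eq!(total, 10); // 0 + 10, received in either order
}
```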

  • Atomic types

Some data accesses are atomic (indivisible) at the hardware level: the machine instruction will run to completion once it starts, without interruption. Whether or not an access is atomic is intrinsically hardware specific: for example, the fetch of a 64-bit integer may be atomic on some platforms but comprise two separate 32-bit accesses on others.

Atomicity provides efficient and memory-safe mutual exclusion guarantees for the accessed value, and Rust offers a number of types and associated methods in support. For example, the type std::sync::atomic::AtomicBool defines load() and store() methods that allow simple synchronization between threads, through a shared value that is polled/set by the threads. Of course, owing to the clever designers of optimizing compilers, things are not as simple as one might hope, and the various methods on atomic types take a parameter that reflects the required memory ordering (the guarantees that may be assumed with respect to the ordering of operations). But that’s the subject for a future blog.  Or if you’re impatient, just check out the discussion in Chapter 3 of Mara Bos’s excellent book Rust Atomics and Locks – Low-Level Concurrency in Practice.
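A minimal sketch (our own) of flag-based synchronization with AtomicBool; the memory-ordering arguments are shown but, as noted above, their full semantics are a topic in themselves:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let worker = thread::spawn(|| {
        // ... perform some work, then signal completion ...
        READY.store(true, Ordering::Release);
    });

    // Poll the flag; the Release/Acquire pair orders the worker's
    // writes before the reader's subsequent reads.
    while !READY.load(Ordering::Acquire) {
        thread::yield_now();
    }

    worker.join().unwrap();
    assert!(READY.load(Ordering::Acquire));
}
```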

  • Asynchronous programming

Threads are all well and good for managing tens or even hundreds of concurrent activities, but if the number is in the thousands or tens of thousands or more, then the overhead (stack space, context switching) can be prohibitive. In addition, tying up resources for threads that spend most of their time suspended while waiting for system calls to complete is wasteful. Rust addresses these issues with two main features:

  • Asynchronous tasks, a time- and space-efficient alternative to threads. This construct supports handling extremely large numbers of independent jobs farmed out to a single thread or a pool of worker threads.
  • Asynchronous functions, a coroutine-like feature that allows a task to initiate an operation and then continue with other processing without waiting for the operation to complete. An asynchronous function returns a value that implements the Future trait; a task can poll the Future to check for completion.

Asynchrony raises the standard memory safety questions. Can a task and an asynchronous function interfere with each other when accessing shared data? Can the function outlive a data value that it is processing? Again, Rust’s ownership, borrowing, and lifetime rules prevent data races and dangling references.

Admittedly, asynchrony also introduces some new wrinkles. A future produced by an async function may contain references into its own storage; moving the future would leave those references dangling, and the borrow checker cannot see into the future’s generated state. Rust is not fooled, however: the Pin wrapper type prevents the pinned value from being moved and thus avoids the potential dangling reference.

The great escape: unsafe code

All of the above notwithstanding, sometimes rules have to be broken:

  • Rust is targeted to embedded applications, where developers need the freedom to get down and dirty with the hardware at the expense of assured memory safety.
  • Programming idioms like doubly linked lists are not expressible within the confines of the language’s ownership/borrowing scheme.
  • The rules are sometimes overly conservative, and safe code may be rejected.

The Rust solution? Offer the needed flexibility, but require constructs whose safety cannot be verified by language checks to be encapsulated within explicitly marked syntactic contexts: either unsafe blocks or unsafe functions. Inside such regions, and only there, you can use features such as raw (C-style) pointers, with no protection against dangling references, null dereferences, double frees, and the like, or access and modify mutable static values with no guarantees against data races. Other Rust features, however, are still fully checked within unsafe regions.

This flexibility allows defining immutable data structures with interior mutability (such as the standard Rc and Arc types) as well as classical data structures like doubly linked lists.  Rust’s presumption is that unsafe blocks and functions will be thoroughly vetted as memory-safe via standard software verification practices (testing, static analysis) and can thus be safely included in or invoked from regular Rust code.
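A minimal sketch (our own): creating a raw pointer is allowed in safe code; only the dereference must be marked unsafe, putting the burden of proof on the programmer:

```rust
fn main() {
    let x: i32 = 42;

    // Creating a raw pointer is allowed in safe code...
    let p: *const i32 = &x;

    // ...but dereferencing it requires an unsafe block: the
    // programmer, not the compiler, vouches for its validity.
    let v = unsafe { *p };
    assert_eq!(v, 42);
}
```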

The verdict

Life in general, and programming languages in particular, do not offer free lunches. For Rust, memory safety costs come in several areas. One is language complexity. Rust is a large language (generics, crates, traits, concurrency, iterators, …); the lifetime concept, which stays submerged in the depths of the semantics for most programming languages, has bubbled up to the surface syntax of Rust and cannot be ignored; and the whole pointer/reference/borrow-checker machinery can be headache-inducing. Code that looks correct may fail to compile (sometimes with a puzzling diagnostic), and in the other direction, code that at first glance seems to violate a borrowing rule may be safe and perfectly legal.

A related issue is programmer ramp-up effort. The Rust approach to pointers is novel, and mastering the ownership/borrowing rules (including when to use String versus &str) will likely take a concerted effort. Common data structures may require a rethink, and programmers will need to understand concepts like interior mutability that simply do not arise in other languages.

Although Rust does not provide a free lunch, its memory safety benefits – and concomitant reduced verification costs – justify the investment.  Data races, dangling references, and related unpleasantries will not get past the watchful eye of the borrow checker. Moreover, the rules do not compromise the performance of the compiled code or require the complex support of a garbage collector. With both memory safety and run-time efficiency, Rust is a sensible choice for high-assurance embedded systems, including applications that need to be certified under formal safety standards.

