How writing idiomatic Rust magically fixed my bugs

With compiled languages, code cannot run until it passes the compilation step, so the compiler sometimes gets in your way.
Sometimes, the compiler refuses the quick-and-dirty change you made to test an idea.
Sometimes, it refuses legitimate changes until you work out what is wrong and fix it.
Depending on the compiler and its error messages, this step is more or less straightforward. C++ developers working with templates know this can get very tricky. Any Rust developer remembers struggling against the borrow checker (whose error messages are, luckily, much more helpful than those of g++).

This can get irritating, but on the other hand, compilers barring incorrect code are pretty valuable.
As a simple example, Python’s “local variable referenced before assignment” exception can be very frustrating, especially when you’re trying something complex and the exception is raised many seconds after your test has begun. This is a typical error that a compiler would detect before you even try to run your code.
More importantly, compilers can detect subtler errors that are not visible at first sight, and they can make your code fool-proof and future-proof by preventing future misuse (by you, your colleagues, or your users) of the functions and APIs you’re writing.
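As a small illustration of the first point, here is a minimal sketch of how Rust handles a variable that might be read before it is assigned. The function name `describe` and the 10-second threshold are made up for this example.

```rust
/// Returns a label for a test duration (hypothetical helper for this sketch).
fn describe(duration_secs: u64) -> &'static str {
    // Declared without a value: the compiler tracks whether every path assigns it.
    let label;
    if duration_secs > 10 {
        label = "slow";
    } else {
        // Removing this branch makes the build fail with error E0381
        // (use of a possibly-uninitialized binding) instead of failing at run time.
        label = "fast";
    }
    label
}

fn main() {
    println!("42 seconds is {}", describe(42));
    println!("3 seconds is {}", describe(3));
}
```

The mistake that Python only reports when the faulty line is reached is reported here at build time, whatever path the program would have taken.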

To this extent, I like to see the compiler as a mentor and a friend, rather than just as an-annoying-software-which-never-understands-what-I-want. But for the compiler to help you as much as it can, it sometimes needs hints from the developer. In Rust, writing in the most idiomatic way can help you reach this goal.
In this article, I will discuss two examples, both taken from the code of the ferrisetw crate, where writing idiomatic Rust let the compiler help me fix problems: it detected tricky programming errors and now prevents API misuse.

Rust: introducing ETW and the ferrisetw crate

I recently had to work with ETW (Event Tracing for Windows), which is a Microsoft framework for providing and subscribing to tracing events. These events are provider-defined and can be anything, but the common use case is to feed this framework with performance-monitoring data.

Microsoft provides a C API to interact with ETW, as well as higher-level APIs such as the Microsoft.Diagnostics.Tracing.TraceEvent library in .NET, the krabsetw library in C++, and more recently the ferrisetw crate in Rust.

I was very fortunate to be able to use the ferrisetw crate. It describes itself as “basically a rip off KrabsETW written in Rust”. As such, most of its structs and modules are designed much like their C++ counterparts. It is a simple crate that works very well, and thanks to it I could quickly get started with ETW and include it in my Rust projects. I am very grateful to n4r1b for open-sourcing his work.
However, after a few weeks spent investigating ETW and ferrisetw, I noticed this crate had some issues. One of them is a potential misuse of the API, and another one is a thread-safety issue. Both of them stem, to some extent, from the fact that it follows the C++ architecture. Once this architecture was changed to follow Rust idioms and guidelines, fixing them was rather easy. These fixes are discussed in the following two chapters.

How to make the API hard(er) to misuse? An example of the Builder pattern

Microsoft’s API offers many (if not too many) ways to interact with traces. After a few weeks of working with ETW, I still couldn’t clearly explain how to call all its components in the right order… and it seems I’m not the only one.
Microsoft offers several capabilities around a trace: starting it, opening it (i.e., “subscribing” with a callback to a started trace, although Microsoft hasn’t clearly explained how this differs from starting it), processing it (what everyone else would have called “starting” it), adding or removing providers (Microsoft uses the term “enabling”), and changing its parameters, such as filters, names, buffer sizes, etc. In theory, all of this is possible with ETW, but some options are not well documented, or not even very useful (is it possible to change the name of a trace after stopping it? What happens if the same trace is processed twice? etc.).

Ferrisetw is a Rust crate (library) that wraps this C API into a higher-level API (“object-oriented”, although that is not what we call it in Rust). At its core is a struct UserTrace, which abstracts an ETW trace session.

In version 0.1, this is how ferrisetw defines struct UserTrace:

/// Actual documentation of this struct at <https://web.archive.org/web/20221230164234/https://n4r1b.com/doc/ferrisetw/trace/struct.usertrace>
struct UserTrace {
    name: String,
    ...
}

// The actual UserTrace is more complex, and some of these methods are part of traits, but you get the idea
impl UserTrace {
    pub fn set_trace_name(&mut self, name: &str) { ... }
    /// Subscribe to a Provider. Microsoft calls this "enable"
    pub fn enable(self, provider: Provider) -> Self { ... }
    pub fn start(self) -> Result<Self, TraceError> { ... }
    pub fn process(self) -> Result<Self, TraceError> { ... }
    ...
}

Usually, one would want to create a UserTrace instance, set its name, subscribe to Providers, then start and process the trace. Ferrisetw’s documentation has code examples that call these functions in this “sensible” order.
However, nothing in the code prevents the user from starting a trace, then changing its name, then subscribing to a Provider.
The Microsoft API may(?) make this possible, but it turns out ferrisetw was only designed to work the “sensible” way: calling set_trace_name or enable after the trace is started would actually do nothing.
Since nothing in ferrisetw’s documentation explicitly says the “sensible” order is mandatory, this could puzzle users wondering why their calls to set_trace_name or enable have no effect.

I’ve improved this in commits that I published in version 1.0 of ferrisetw. I could have added documentation, but instead, why not ask the compiler to guarantee that we’re calling functions in the right order?
To do this, I’ve introduced a more idiomatic Rust Builder pattern.

Ferrisetw 1.0 now features an additional Builder struct:

/// The actual Builder is generic over the trace kind, but for the sake of simplicity, we'll only keep the relevant parts for this article
#[derive(Default)]
struct UserTraceBuilder {
    name: String,
    providers_to_use: Vec<Provider>,
    ...
}

impl UserTraceBuilder {
    pub fn named(mut self, name: String) -> Self {
        self.name = name;
        self
    }

    pub fn enable(mut self, provider: Provider) -> Self {
        self.providers_to_use.push(provider);
        self
    }

    /// This calls both "start" and "process"
    pub fn start(self) -> Result<UserTrace, TraceError> {
        ...
    }
}

struct UserTrace {
    name: String,
    ...
}

impl UserTrace {
    /// This generates a Builder
    pub fn new() -> UserTraceBuilder {
        UserTraceBuilder::default()
    }

    // This struct now has only read-only getters
    pub fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    // This Builder can be used this way
    let trace = UserTrace::new()
        .named(String::from("my awesome trace"))
        .enable(provider1)
        .enable(provider2)
        .start()
        .unwrap();

    println!("Started {}", trace.name());
}

This pattern ensures there is no way to even try to change the name of the trace (or the providers it is subscribed to) after it has been started.
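For readers who want to experiment with the pattern outside of ferrisetw, here is a minimal, self-contained sketch that compiles on any platform. `Provider` and `TraceError` are hypothetical stand-ins (the real types wrap the Windows API), and the rule that rejects an empty name is an arbitrary choice made for this sketch, not ferrisetw behavior.

```rust
// Hypothetical stand-ins: the real ferrisetw types wrap the Windows ETW API.
#[derive(Debug, Clone, PartialEq)]
struct Provider(String);

#[derive(Debug)]
struct TraceError;

#[derive(Default)]
struct UserTraceBuilder {
    name: String,
    providers: Vec<Provider>,
}

impl UserTraceBuilder {
    fn named(mut self, name: String) -> Self {
        self.name = name;
        self
    }

    fn enable(mut self, provider: Provider) -> Self {
        self.providers.push(provider);
        self
    }

    /// Consumes the builder: once the trace exists, the builder is gone.
    fn start(self) -> Result<UserTrace, TraceError> {
        // Arbitrary rule for this sketch only: refuse to start an unnamed trace.
        if self.name.is_empty() {
            return Err(TraceError);
        }
        Ok(UserTrace { name: self.name, providers: self.providers })
    }
}

struct UserTrace {
    name: String,
    providers: Vec<Provider>,
}

impl UserTrace {
    fn new() -> UserTraceBuilder {
        UserTraceBuilder::default()
    }

    // Read-only getters: nothing here can alter a started trace.
    fn name(&self) -> &str {
        &self.name
    }

    fn providers(&self) -> &[Provider] {
        &self.providers
    }
}

fn start_demo_trace() -> Result<UserTrace, TraceError> {
    UserTrace::new()
        .named(String::from("demo"))
        .enable(Provider(String::from("provider1")))
        .start()
}

fn main() {
    let trace = start_demo_trace().unwrap();
    println!("Started {} with {:?}", trace.name(), trace.providers());
    // trace.enable(...) or trace.named(...) would not compile:
    // those methods only exist on the builder, which `start` consumed.
}
```

Because `start` takes `self` by value, the builder cannot be reused after the trace is started, and the resulting `UserTrace` simply has no method that could change its configuration.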

This is a good example of how to make the compiler enforce proper API usage.
This Builder pattern is not specific to Rust, although the lack of overloaded constructors in this language makes it very relevant. In C++, the same result could be achieved with constructors that initialize const data members. This is more or less what krabsetw does here, at least for the member name.

Ferrisetw: more examples that make an API harder to misuse

Ferrisetw 1.0 has more changes that make its public API harder to misuse.

For example, I upstreamed commits that introduce this Builder pattern for struct Provider as well. This is what ferrisetw 0.1 looked like:

pub struct Provider {
    pub guid: Option<GUID>,
    ...
}

// Somewhere else, in a function that handles ETW events
pub fn on_event(...) {
    ...
    let provider = ...;
    let guid = provider.guid.unwrap(); // Must have been set already
    ...
}

The new Builder pattern now enforces that the GUID is correctly set before starting a trace. As a consequence, guid has gone from an Option<GUID> to a bare GUID, and this removes the need for this scary .unwrap()!
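The idea can be sketched as follows. This is not the actual ferrisetw code: `Guid` is a hypothetical stand-in for the Windows GUID type, and the names `by_guid`, `level` and `build` are chosen for this example. The point is only that a builder can demand the mandatory value up front, so the finished struct never needs an `Option`.

```rust
/// Hypothetical stand-in for the Windows GUID type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Guid(u128);

struct ProviderBuilder {
    guid: Guid,
    level: u8,
}

impl ProviderBuilder {
    /// Optional setting, with a default chosen by the builder.
    fn level(mut self, level: u8) -> Self {
        self.level = level;
        self
    }

    fn build(self) -> Provider {
        Provider { guid: self.guid, level: self.level }
    }
}

struct Provider {
    guid: Guid, // no longer an Option: a Provider cannot exist without it
    level: u8,
}

impl Provider {
    /// The only way to get a builder is to supply the GUID.
    fn by_guid(guid: Guid) -> ProviderBuilder {
        ProviderBuilder { guid, level: 0 }
    }

    fn guid(&self) -> Guid {
        self.guid
    }

    fn level(&self) -> u8 {
        self.level
    }
}

/// Event-handling code can now read the GUID without any unwrap().
fn on_event(provider: &Provider) -> Guid {
    provider.guid()
}

fn main() {
    let provider = Provider::by_guid(Guid(0x1234)).level(4).build();
    println!("Event from {:?} at level {}", on_event(&provider), provider.level());
}
```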

Making sure the visibility of functions is correctly set to pub or not pub is another way to have the compiler ensure that users are not misusing your API. This is also something I tried to be cautious with when refactoring ferrisetw.
This is not specific to Rust either: C and C++ also make this distinction. Python does too, to some extent, for function names that start with one or two underscores.
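A short sketch of what visibility buys you, using a made-up `trace` module (the names are illustrative and not taken from ferrisetw):

```rust
mod trace {
    pub struct UserTrace {
        // Private fields: code outside this module cannot read or modify them directly.
        name: String,
        started: bool,
    }

    impl UserTrace {
        /// Public entry point: the only way to obtain a `UserTrace`.
        pub fn start(name: &str) -> Self {
            let mut trace = UserTrace { name: name.to_owned(), started: false };
            trace.mark_started();
            trace
        }

        pub fn name(&self) -> &str {
            &self.name
        }

        pub fn is_started(&self) -> bool {
            self.started
        }

        // Not `pub`: an internal step that callers must not invoke on their own.
        fn mark_started(&mut self) {
            self.started = true;
        }
    }
}

fn main() {
    let t = trace::UserTrace::start("demo");
    println!("{} started: {}", t.name(), t.is_started());
    // t.mark_started();   // would not compile: private method (error E0624)
    // t.started = false;  // would not compile: private field (error E0616)
}
```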

How to make the crate (more) thread-safe?

Another issue I faced when experimenting with ferrisetw was related to thread-safety. We’re discussing a Rust program here, so you may wonder why thread-safety is even a concern, since the Rust compiler offers guarantees of fearless concurrency.
However, ferrisetw interfaces with a C API. Because C, unlike Rust, does not keep track of lifetimes and ownership, the Rust compiler cannot track them for data passed to and from C.
So, as with every FFI (Foreign Function Interface), we have to use the scary Rust unsafe keyword. It creates scopes in the code where we can call unsafe functions. These are parts where the compiler lacks the knowledge to enforce its usual memory-safety guarantees, and the developer has to check memory safety themselves (…or fail to). In ferrisetw 0.1, there was a slight mistake in this respect, which I will describe in this chapter.

To receive ETW events, the Microsoft API provides a function (OpenTraceW) to register a callback. This callback is invoked by Windows each time a new event is available and takes a single argument: an EVENT_RECORD* etw_event, which contains a pointer void* UserContext to arbitrary data that is passed when registering the callback.
It is possible to register a Rust function for this callback:

// In this example code, safety comments are ignored for clarity. Actual code is more complex
use windows::Win32::System::Diagnostics::Etw::{
    self, OpenTraceW, EVENT_TRACE_LOGFILEW, EVENT_TRACE_LOGFILEW_1,
};

fn register_callback(my_context: *mut std::ffi::c_void) {
    // `mut` is required because OpenTraceW takes a mutable pointer to this struct
    let mut log_file = EVENT_TRACE_LOGFILEW {
        LoggerName: ...,
        Anonymous2: EVENT_TRACE_LOGFILEW_1 {
            EventRecordCallback: Some(trace_callback_thunk)
        },
        Context: my_context,
        ..Default::default()
    };

    unsafe { OpenTraceW(&mut log_file as *mut _); }
}

// the `extern "system"` keywords make it possible for this function to be called from C
unsafe extern "system" fn trace_callback_thunk(p_record: *mut Etw::EVENT_RECORD) {
    println!("Got event {:?}", *p_record);
    println!("It points to a user context at address {:x?}", (*p_record).UserContext);
}

Of course, having a bare void* (or, as Rust calls it, a *mut std::ffi::c_void) as a user context is not very useful.
We would like to make it point to an actual Rust struct, such as:

struct CallbackData {
    /// How many events have been processed so far
    events_handled: usize,
    /// Other things, such as a list of Rust closures we want to execute on each incoming event
    ...
}

In trace_callback_thunk, the original ferrisetw 0.1 recovers it as follows:

// Actually, CallbackData was called TraceData in ferrisetw 0.1. I changed it in this snippet for more consistency with the rest of the article
unsafe extern "system" fn trace_callback_thunk(event_record: *mut Etw::EVENT_RECORD) {
    let ctx: &mut CallbackData = unsafe_get_callback_ctx((*event_record).UserContext);
    println!("Got event {:?}", *event_record);
    println!("It points to this user context: {:?}", ctx);
    ctx.events_handled += 1;
}

pub unsafe fn unsafe_get_callback_ctx<'a>(ctx: *mut std::ffi::c_void) -> &'a mut CallbackData {
    &mut *(ctx as *mut CallbackData)
}

This has important problems, though.
First of all, we do nothing to check that the CallbackData has not been dropped by the time the callback executes, so event_record.UserContext could be dangling. I have fixed this in ferrisetw 1.0.
A more interesting problem is that we cast a C void* (which carries no lifetime or ownership information) into a &mut reference, without giving the compiler any information about its actual lifetime or about possible aliases.
This means that several threads receiving different ETW events linked to the same void* UserContext will end up holding a mutable reference to the same data at the same time! 😱 A disaster in the making, as both threads could end up racing to update the events_handled member.
Holding several mutable references to the same data is exactly one of the things the Rust compiler forbids in order to guarantee memory safety. (Safe Rust code would make this impossible; this problem is only made possible by our [incorrect] use of unsafe).

I improved this in version 1.0.
I couldn’t get rid of the unsafe blocks, because we’re dereferencing a raw pointer, which can’t be done in safe Rust. However, I was able to get rid of this &mut. Instead of unsafe_get_callback_ctx returning a mutable reference, we return a read-only reference. This tells the compiler that this reference may be shared with other users.

Now that the compiler is aware of this, let it guide us! Let’s consider it our friend, and have it point at what is wrong in our code.
If you build this fixed version, the compiler will complain, because we’re trying to modify ctx.events_handled although ctx is not mutable.
This totally makes sense, but we actually do want to mutate the content of ctx. This is the typical use case for interior mutability, which can be achieved by several means. Turning the CallbackData into a Mutex<CallbackData> would be one of them, but Mutexes are relatively costly. Replacing events_handled: usize with events_handled: AtomicUsize is another solution: it is less costly, and it makes the compiler happy.
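To make this concrete, here is a minimal sketch that runs on any platform. It does not call the Windows API: `fake_callback` and `count_events` are hypothetical helpers that mimic several threads invoking the callback with the same context pointer. Only the shape of `CallbackData` and the shared-reference cast follow the approach described above.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

struct CallbackData {
    /// How many events have been processed so far
    events_handled: AtomicUsize,
}

/// Recovers a shared (read-only) reference from the opaque context pointer.
///
/// # Safety
/// `ctx` must point to a `CallbackData` that stays alive for the whole of `'a`.
unsafe fn get_callback_ctx<'a>(ctx: *const c_void) -> &'a CallbackData {
    unsafe { &*(ctx as *const CallbackData) }
}

/// Stand-in for the callback that Windows would invoke (hypothetical for this sketch).
fn fake_callback(user_context: *const c_void) {
    // SAFETY: in this sketch, the pointer always comes from a live `CallbackData`.
    let ctx = unsafe { get_callback_ctx(user_context) };
    // Mutation through a shared reference: allowed because the field is atomic.
    ctx.events_handled.fetch_add(1, Ordering::Relaxed);
}

/// Simulates `threads` threads each delivering `events_per_thread` events.
fn count_events(threads: usize, events_per_thread: usize) -> usize {
    let data = CallbackData { events_handled: AtomicUsize::new(0) };
    let data_ref = &data;
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(move || {
                for _ in 0..events_per_thread {
                    fake_callback(data_ref as *const CallbackData as *const c_void);
                }
            });
        }
    });
    // All threads have been joined when `scope` returns.
    data.events_handled.load(Ordering::Relaxed)
}

fn main() {
    println!("Handled {} events", count_events(4, 1000));
}
```

With a plain `usize` field, the line calling `fetch_add` would have to become `ctx.events_handled += 1`, which the compiler rejects behind a shared reference. That rejection is exactly the hint that led to the atomic counter.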

In the actual ferrisetw crate, there are more members in the CallbackData struct, and the compiler complained about some of them. Step by step, it warned us about unexpected mutations, and we could fix each of them, by adding interior mutability at specific locations only, so as not to hinder the overall performance of event processing.

Data races, performance … What’s next

The actual ferrisetw crate is more complex. We won’t discuss it in detail in this article, but for instance, it now uses an Arc<CallbackData> instead of a plain CallbackData to avoid possible race conditions described in https://github.com/n4r1b/ferrisetw/issues/45.
This race was not found by the compiler because of the use of unsafe, but such data races would not have been possible in safe Rust code.
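The article does not detail how ferrisetw wires the Arc into the callback, so the following is only a generic sketch of one common way to lend an `Arc` to C code through an opaque pointer, using the standard `Arc::into_raw` and `Arc::from_raw`. The helper names (`into_context`, `on_event`, `release_context`, `simulate`) are hypothetical and not taken from the crate.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

struct CallbackData {
    events_handled: AtomicUsize,
}

/// Hands one strong reference over to the C side as an opaque pointer.
fn into_context(data: &Arc<CallbackData>) -> *const c_void {
    Arc::into_raw(Arc::clone(data)) as *const c_void
}

/// What a callback would do: borrow the data without taking ownership.
///
/// # Safety
/// `ctx` must come from `into_context` and must not have been released yet.
unsafe fn on_event(ctx: *const c_void) {
    let data = unsafe { &*(ctx as *const CallbackData) };
    data.events_handled.fetch_add(1, Ordering::Relaxed);
}

/// Gives back the strong reference held on behalf of the C side.
///
/// # Safety
/// `ctx` must come from `into_context`, and no callback may use it afterwards.
unsafe fn release_context(ctx: *const c_void) {
    drop(unsafe { Arc::from_raw(ctx as *const CallbackData) });
}

/// Returns (events handled, strong count while the context was registered).
fn simulate(events: usize) -> (usize, usize) {
    let data = Arc::new(CallbackData { events_handled: AtomicUsize::new(0) });
    let ctx = into_context(&data);
    let count_while_registered = Arc::strong_count(&data);
    for _ in 0..events {
        // SAFETY: `ctx` has not been released yet.
        unsafe { on_event(ctx) };
    }
    // SAFETY: no more callbacks are issued after this point.
    unsafe { release_context(ctx) };
    (data.events_handled.load(Ordering::Relaxed), count_while_registered)
}

fn main() {
    let (handled, strong) = simulate(3);
    println!("{handled} events handled, strong count was {strong}");
}
```

The extra strong reference keeps the data alive for as long as the C side may still call back, even if the Rust owner drops its own handle first.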

Also, much work has been done to improve performance (by making ferrisetw zero-copy and adding caches). This is listed here and could be the topic of a future blog post.

Conclusion: shedding light on the strengths of Rust and its compiler

Writing simple, straightforward, and hard-to-misuse/future-proof APIs is something I like with compiled languages, and Rust really shines at it.
The examples discussed in this article are definitely not ground-breaking. But they do illustrate the strengths of Rust and its compiler. Solutions discussed here, such as using the Builder pattern, or refraining from using &mut references without a solid reason to, are elegant ways to offload to the compiler the tedious work of enforcing guarantees in your code, without having to check them manually.

In other languages that do not offer as many guarantees, this would not be possible, and sometimes only careful scrutiny of the code could achieve the same results, and only for one specific version of the code (without any guarantee that future commits won’t break them without anyone noticing).