How and why we adopted Rust to develop our EDR


Rust: definition

Rust is a programming language created in 2006 by Graydon Hoare, and supported by Mozilla since 2009.

It has been adopted by a large community of developers, and by many companies such as Amazon, Google, Dropbox, Meta, Discord or Microsoft… and HarfangLab, of course.

It’s a compiled, low-level language close to C and C++, with a low CPU footprint. Its creator describes Rust as an answer to the frustrations developers had with C++; the language focuses on performance, security and the ability to run different tasks in parallel.

In practice, how did we get from an agent developed in Python to Rust? Before delving into the details of our adventures, and what we and our users have gained in the process, let’s take a quick look at how our solution works.

HarfangLab EDR: agent operation and technical prerequisites

The HarfangLab cybersecurity platform is based on two components:

Agents: software deployed on a computer system;
Backend: a stack of software layers including a data lake to store information, components to drive the agents, detection assets, and an API to visualize the data.

The role of the agent is to collect a vast amount of information. It sees everything that happens on an endpoint: process creation, network connections, file access… and it must detect these events to determine whether they are legitimate or malicious, and respond if necessary.

To do this, based on the information it gathers, the agent decides whether to let a program run, block it or terminate it, quarantine the executable if necessary… and shares what it has detected with the backend.
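The decision loop described above can be sketched as a function from an event to a verdict. This is a minimal illustration, not HarfangLab’s implementation: the `Event` and `Verdict` types, their fields and the three placeholder rules are all hypothetical, chosen only to mirror the examples in the text.

```rust
/// Hypothetical event types, loosely following the examples in the text.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    ProcessCreated { image_path: String },
    NetworkConnection { remote_addr: String, port: u16 },
    FileAccessed { path: String },
}

/// Possible responses of the agent, as listed in the text.
#[derive(Debug, Clone, PartialEq)]
pub enum Verdict {
    Allow,
    Block,
    Terminate,
    Quarantine,
}

/// Toy decision function. The rules are placeholders (assumptions),
/// not real detection logic.
pub fn decide(event: &Event) -> Verdict {
    match event {
        // Placeholder rule: a known-bad executable name is quarantined.
        Event::ProcessCreated { image_path } if image_path.ends_with("malware.exe") => {
            Verdict::Quarantine
        }
        Event::ProcessCreated { .. } => Verdict::Allow,
        // Placeholder rule: connections to an arbitrary port are blocked.
        Event::NetworkConnection { port: 4444, .. } => Verdict::Block,
        Event::NetworkConnection { .. } => Verdict::Allow,
        // Placeholder rule: touching a sensitive file terminates the program.
        Event::FileAccessed { path } if path.starts_with("/etc/shadow") => Verdict::Terminate,
        Event::FileAccessed { .. } => Verdict::Allow,
    }
}
```

Because the `match` must cover every variant, adding a new event type later makes the compiler point at every decision site that needs updating.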

[Figure: EDR (Endpoint Detection and Response), engines and rules]

This process occurs for every event, which inevitably raises the question of endpoint performance. Indeed, if the agent takes too long to respond, the program is blocked for the end-user, and the system slows down considerably.

So, as a cybersecurity solution provider, our objective is threefold:

Be lightweight, responding in less than 50 ms, which implies running all the detection logic on the machine so that no round trip to the backend adds latency
Be invisible, i.e. not interfere with machine operation, to preserve the user experience
Be able to absorb peak loads, sometimes up to 500k events per minute
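To get a feel for these figures, the arithmetic can be written down directly. The sketch below assumes, for illustration only, that events are processed one after another on a single core; the function names are hypothetical. At 500k events per minute this leaves an average of 120 µs per event, far below the 50 ms ceiling for any single response.

```rust
use std::time::Duration;

/// Average event rate per second for a given per-minute rate.
pub fn events_per_second(events_per_minute: u64) -> f64 {
    events_per_minute as f64 / 60.0
}

/// Average time available per event if events are handled sequentially
/// on one core (an assumption made for this illustration).
/// Returns `None` when there are no events, to avoid dividing by zero.
pub fn per_event_budget(events_per_minute: u64) -> Option<Duration> {
    if events_per_minute == 0 {
        return None;
    }
    // One minute expressed in nanoseconds, shared between all events.
    Some(Duration::from_nanos(60_000_000_000 / events_per_minute))
}

/// Checks a measured response time against the 50 ms target from the text.
pub fn within_latency_target(elapsed: Duration) -> bool {
    elapsed < Duration::from_millis(50)
}
```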

Performance and security at the top of the agenda

When HarfangLab started in 2018, we had to launch a viable product quickly and with limited staff: nothing very original for a start-up.

We therefore chose to start in Python, for both agent and backend, knowing that we would have to shift our work to another technology in the medium term.

During the first two years of our EDR’s existence, the technical team shipped several versions and made changes in response to customer requests. This stacking of features in the agent eventually hit a performance limit, especially on less powerful or older machines.

In addition to Python-related performance issues, we faced other limitations: deploying new features was complex because of the weaknesses of Python’s refactoring and code-checking tools. Maintenance was also painful, not least because of the limited support for obsolete OSes.

Finally, it’s important to remember that the EDR must run as a service (on Windows, for example, an .exe file must “talk” to the Windows service manager). What’s more, with Python, the lack of control over the final binary makes it impossible to harden the agent against certain attacks, a subject we’ll go into in a little more detail later.

Having reached this stage in the life of our solution, we compared the various options available to us, based on our prerequisites:

Performance
Security
Compatibility with all OSes, even obsolete ones
A well-established language, to facilitate support

Performance being the primary criterion for our endpoint protection solution, only a few compiled languages seemed to meet our needs: C/C++, Go and Rust.

How to choose? Let’s go further into the prerequisites: we needed to write high-level code and interact with low-level system primitives via system APIs, often developed in C, which reduced the number of relevant languages.

Although much software and many operating systems are developed in C/C++, the very large number of memory corruption vulnerabilities (buffer overflow, use after free, …) that have been exploited for years made us rule it out.

In 2019, for example, Microsoft was already reporting that 70% of the vulnerabilities for which it assigned a CVE were due to memory corruption, and was recommending Rust for writing safe code.

In the end, the choice came down to Go and Rust, and we chose Rust for the following reasons:

Better performance (no garbage collector, for example)
Better security (the compiler guarantees memory safety outside of unsafe blocks, which eliminates certain classes of vulnerabilities linked to memory corruption, etc.)
Stronger security ecosystem
Better community dynamic for language development and evolutions
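As a small illustration of the security point, here is how two of the vulnerability classes mentioned earlier (buffer overflow, use after free) look from safe Rust. The function names are hypothetical and the example is deliberately tiny.

```rust
/// Reads one byte from a buffer. An out-of-range index yields `None`
/// instead of reading whatever memory sits next to the buffer.
pub fn read_byte(buffer: &[u8], index: usize) -> Option<u8> {
    buffer.get(index).copied()
}

/// Takes ownership of the buffer and returns its length. The buffer is
/// freed when this function returns, and the compiler rejects any later
/// use of it by the caller, so a use after free cannot be written here.
pub fn consume(buffer: Vec<u8>) -> usize {
    buffer.len()
}
```

Direct indexing (`buffer[index]`) is also bounds-checked: it panics on an out-of-range index rather than silently corrupting memory.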

To sum up, here are the criteria we evaluated:

C/C++: Performance +++ / Security – / Compatibility ++ / Adoption +++
Go: Performance ++ / Security ++ / Compatibility + / Adoption ++
Rust: Performance +++ / Security +++ / Compatibility ++ / Adoption ++

Although development in Rust is described as more difficult than in C or Go (the learning curve is steeper), what the language brought us later on confirmed our choice.

Python vs. Rust: the match

We needed to go beyond the limits we’d seen, but also to have a language that was both memory safe and thread safe, with low-level access to operating system functionality.

If you’re familiar with Python, you might be thinking that this language meets all these criteria…

But we had a few more to fill!

In particular, Python doesn’t allow true multithreading (because of the GIL, the Global Interpreter Lock), nor does it allow control over memory allocations (among other things because of the garbage collector), which leads to greater RAM usage.
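By contrast, Rust threads are plain OS threads that run in parallel on several cores. The following sketch, with a hypothetical function name and workload, splits a slice of event sizes across worker threads using the standard library’s scoped threads; the borrow checker verifies at compile time that the shared data outlives the workers.

```rust
use std::thread;

/// Counts the values above `threshold`, splitting the work across
/// `workers` OS threads that run in parallel.
pub fn count_large_events(sizes: &[u64], threshold: u64, workers: usize) -> usize {
    let workers = workers.max(1);
    // Ceiling division, and at least 1 because `chunks(0)` would panic.
    let chunk_len = ((sizes.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Spawn every worker first so they all run concurrently.
        let handles: Vec<_> = sizes
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().filter(|&&size| size > threshold).count())
            })
            .collect();

        // Then gather the partial counts.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<usize>()
    })
}
```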

Rust is much more flexible: it lets us choose appropriate memory allocations and data structures, and also features a powerful type system to facilitate maintenance and check that refactoring is correct.
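Both points can be shown in a few lines. In this sketch (all names are hypothetical), a dedicated `Pid` type prevents a process id from being confused with any other integer, and a batch buffer is allocated once, up front, and never grows afterwards.

```rust
/// A process id with its own type: a plain `u32` port or thread id
/// cannot be passed by mistake where a `Pid` is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pid(pub u32);

/// A batch of process ids whose memory is allocated once, up front.
pub struct PidBatch {
    pids: Vec<Pid>,
    capacity: usize,
}

impl PidBatch {
    /// Reserves room for `capacity` ids in a single allocation.
    pub fn with_capacity(capacity: usize) -> Self {
        Self { pids: Vec::with_capacity(capacity), capacity }
    }

    /// Adds an id, or returns `false` when the batch is full
    /// instead of growing the buffer.
    pub fn push(&mut self, pid: Pid) -> bool {
        if self.pids.len() == self.capacity {
            return false;
        }
        self.pids.push(pid);
        true
    }

    /// Number of ids currently stored.
    pub fn len(&self) -> usize {
        self.pids.len()
    }

    /// True when no id has been stored yet.
    pub fn is_empty(&self) -> bool {
        self.pids.is_empty()
    }
}
```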

In terms of security, hardening is also a priority: we need to protect the agent against uncommon threats such as DLL hijacking, and an EDR needs to guard against an attacker who has managed to escalate privileges and gain administrator rights on a system.

To meet this need, Rust offers greater security thanks to its support for static linking (whereas Python imposes external dependencies), while leaving the door open to dynamic linking and to deferring the loading of indirect dependencies (via Windows’ delay loading mechanism, for example).
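One possible way to request delay loading from a Rust project is through a Cargo build script. The helper below is a sketch under assumptions: the function name and the DLL name in the usage note are hypothetical, and nothing here describes how HarfangLab’s agent is actually built. It relies on the MSVC linker’s /DELAYLOAD option, which also requires linking the delayimp helper library.

```rust
/// Builds the Cargo build-script directives asking the MSVC linker to
/// delay-load the given DLLs. Returns an empty list for other toolchains
/// or when there is nothing to delay-load.
pub fn delay_load_directives(target_env: &str, dlls: &[&str]) -> Vec<String> {
    if target_env != "msvc" || dlls.is_empty() {
        return Vec::new();
    }
    let mut directives: Vec<String> = dlls
        .iter()
        // One /DELAYLOAD linker argument per DLL.
        .map(|dll| format!("cargo:rustc-link-arg=/DELAYLOAD:{}", dll))
        .collect();
    // The delay-load helper routines live in delayimp.lib.
    directives.push("cargo:rustc-link-lib=delayimp".to_string());
    directives
}
```

In a build.rs, the main function would read the CARGO_CFG_TARGET_ENV environment variable, call this helper with a list such as ["example.dll"], and print each returned line to standard output so that Cargo picks it up.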

Finally, beyond optimizing development, we were of course thinking of our users. We carried out these projects with the aim of offering them the best possible experience:

Improved performance, regardless of machine power,
Maintained backward compatibility, especially for our customers who depend on the API data format,
Enhanced security (hardening),
Support for new OSes (macOS, Linux),
Revised platform design.

In light of our context and objectives, Rust was an obvious choice.

Python vs. Rust: results and figures

Without a doubt, there’s a before and an after to Rust: we’ve been able to enrich our features, fine-tune the management of different OSes, and produce much more comprehensive documentation.

We started with the agent and backend in Python, which allowed a lot of code sharing.

Now, the backend is still in Python, and the agent in Rust. How do we ensure communication between the two?

We use gRPC to share the communication interface and abstract protocol problems, and Python bindings to use Rust code and improve performance on the backend too.

When adding Sigma detection rules, for example, they are now validated directly in the backend via the same Sigma engine as the agent. To achieve this, PyO3 enables the same validation code to be shared between the backend and the agent, further boosting performance compared with a validator written in Python.
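The idea of a single validator used on both sides can be sketched as an ordinary Rust function. This is a deliberately simplified stand-in, not the actual Sigma engine: it only checks that the top-level keys title, logsource and detection are present by scanning unindented lines, and it is not a YAML parser. The function names are hypothetical, and the PyO3 wrapper (for instance a function marked with PyO3’s `#[pyfunction]` attribute that calls `validate_rule`) is left out so that the block has no external dependency.

```rust
/// Top-level keys checked by this sketch.
const REQUIRED_KEYS: [&str; 3] = ["title", "logsource", "detection"];

/// Returns the required top-level keys that are missing from a rule.
/// Only unindented `key:` lines are considered; this is not a YAML parser.
pub fn missing_keys(rule_yaml: &str) -> Vec<&'static str> {
    REQUIRED_KEYS
        .iter()
        .copied()
        .filter(|key| {
            !rule_yaml.lines().any(|line| {
                // The line must start with the key, immediately followed by ':'.
                line.strip_prefix(*key)
                    .map_or(false, |rest| rest.starts_with(':'))
            })
        })
        .collect()
}

/// Validation entry point that the agent and, through bindings,
/// the backend could both call.
pub fn validate_rule(rule_yaml: &str) -> Result<(), String> {
    let missing = missing_keys(rule_yaml);
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!("missing required keys: {}", missing.join(", ")))
    }
}
```

Keeping the logic in one Rust function means a rule accepted by the backend is checked by the same code path the agent would use.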

What results have we observed since the agent was developed in Rust?

Firstly, our agent is much more stable, and we can integrate new features much more easily thanks to Rust’s refactoring tooling and type system, which allow us to catch errors before deploying the code.

In concrete terms, the first agent version for a new operating system took us around 3 months to develop, helped by a common feature base that we were able to share across all OSes!

Finally, resource consumption is greatly reduced, with 3 times less RAM and 3 times less CPU used when switching from Python to Rust, for equivalent functionality.

Thanks to these gains, we were able to confidently add numerous features to our agent (AI engines, EPP, correlation engines, etc.) without overloading the machine.

Customer testimonial

“We were the first to express our interest in Rust, the development language adopted by HarfangLab’s EDR, and to measure its significant gains in the performance of our endpoints.”

Cybersecurity Manager – Industrial Group

