
How and why we adopted Rust to develop our EDR

How did we switch from Python to Rust, and why? Beyond optimizing RAM and CPU consumption, how does this development language reduce the risk of errors and improve security?

Rust: definition 

Rust is a programming language created in 2006 by Graydon Hoare, and supported by Mozilla since 2009.  

It has been adopted by a large community of developers, and by many companies such as Amazon, Google, Dropbox, Meta, Discord or Microsoft… and HarfangLab, of course.  

It’s a compiled, low-level language close to C and C++, designed to keep the load on the processor low. Its creator describes Rust as an answer to the frustrations developers had with C++; the language focuses on performance, security and the ability to run different tasks in parallel.

So, in practice, how did we move our agent from Python to Rust? Before delving into the details of our adventures, and what we and our users have gained in the process, let’s take a quick look at how our solution works.

HarfangLab EDR: agent operation and technical prerequisites 

HarfangLab EDR is built on two components:

  • Agents: software deployed on a computer system;
  • Backend: a stack of software layers including a data lake to store information, components to drive the agents, detection assets, and an API to visualize the data.

The role of the agent is to collect a vast amount of information. It sees everything that happens on an endpoint (process creation, network connections, file access…) and it must analyze these events to determine whether they are legitimate or malicious, and respond if necessary.

To do this, based on the information it gathers, the agent decides whether to let a program run, block it or terminate it, quarantine the executable if necessary… and reports what it has detected to the backend.
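The decision step described above can be pictured with a small Rust sketch. The event fields, the `Verdict` variants and the rules inside `decide` are hypothetical and chosen only for illustration; they are not our actual detection logic.

```rust
/// Hypothetical summary of a process event seen by the agent.
struct ProcessEvent {
    image_path: String,
    matches_detection_rule: bool,
    known_malicious_file: bool,
    already_running: bool,
}

/// Possible responses, mirroring the actions listed in the text.
#[derive(Debug, PartialEq)]
enum Verdict {
    Allow,
    Block,
    Terminate,
    Quarantine,
}

/// Toy decision function: the rules are illustrative assumptions only.
fn decide(event: &ProcessEvent) -> Verdict {
    if !event.matches_detection_rule {
        return Verdict::Allow;
    }
    if event.known_malicious_file {
        // Isolate the executable so it cannot be launched again.
        return Verdict::Quarantine;
    }
    if event.already_running {
        Verdict::Terminate
    } else {
        Verdict::Block
    }
}

fn main() {
    let event = ProcessEvent {
        image_path: String::from("C:\\Users\\demo\\dropper.exe"),
        matches_detection_rule: true,
        known_malicious_file: false,
        already_running: false,
    };
    println!("{} -> {:?}", event.image_path, decide(&event));
}
```

Modelling the outcome as an enum rather than a string or an integer means every caller has to deal with a closed, named set of responses.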

[Figure: EDR (Endpoint Detection and Response) engines and rules]


This process occurs for every event, which inevitably raises the question of endpoint performance. Indeed, if the agent takes too long to respond, the program is blocked for the end-user, and the system slows down considerably. 

So, as a cybersecurity solution provider, our objective is threefold:  

  • Be lightweight, offering a response in less than 50 ms, which implies having all the detection logic on the machine so that there is no latency;
  • Be invisible, i.e. not interfere with machine operation, to preserve the user experience;
  • Be able to absorb peak loads, sometimes up to 500k events per minute.
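To put these figures in perspective, here is a quick back-of-the-envelope calculation in Rust. It assumes, purely for illustration, that events are handled one at a time on a single core; the function names are ours and hypothetical.

```rust
/// Average time available per event at a given load, assuming (hypothetically)
/// that events are processed sequentially on a single core.
fn average_budget_micros(events_per_minute: u64) -> u64 {
    const MICROS_PER_MINUTE: u64 = 60 * 1_000_000;
    MICROS_PER_MINUTE / events_per_minute
}

/// Event rate per second, rounded down.
fn events_per_second(events_per_minute: u64) -> u64 {
    events_per_minute / 60
}

fn main() {
    let peak = 500_000; // events per minute, the peak figure quoted above
    println!("{} events/s", events_per_second(peak)); // prints "8333 events/s"
    println!("{} µs per event", average_budget_micros(peak)); // prints "120 µs per event"
}
```

Under that assumption, the 50 ms ceiling applies to each individual response, while sustaining a peak requires the average processing time per event to stay around 120 µs.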
     

Performance and security at the top of the agenda 

When HarfangLab started in 2018, we were under time pressure to launch a viable product with a small team: nothing very original for a start-up.

We therefore chose to start in Python, for both agent and backend, knowing that we would have to shift our work to another technology in the medium term.  

During the first two years of our EDR’s existence, the technical team deployed several versions and shipped changes in response to customer requests… and this accumulation of features in the agent eventually reached a limit in terms of performance, especially on less powerful or older machines.

In addition to Python-related performance issues, we faced other constraints: deploying new features was complex because Python’s refactoring and code-checking tools are limited. Maintenance was also painful, not least because of the limited support for obsolete OSes.

Finally, it’s important to remember that the EDR must run as a service (on Windows, for example, an .exe file must “talk” to the Service Control Manager). What’s more, with Python, the lack of control over the final binary makes it impossible to harden the agent against certain attacks, a subject we’ll go into in a little more detail later.

Having reached this stage in the life of our solution, we compared the various options available to us, based on our prerequisites:   

  • Performance,
  • Security,
  • Compatibility with all OSes, even obsolete ones,
  • A well-established language, to facilitate support.

Performance being the primary criterion, only a few compiled languages seemed to meet our needs: C/C++, Go and Rust.  

How to choose? Let’s go further into the prerequisites: we needed to write high-level code while interacting with low-level system primitives through system APIs, which are often developed in C. This narrowed down the number of relevant languages.

Although a great deal of software and many operating systems are developed in C/C++, the very large number of memory corruption vulnerabilities (buffer overflow, use after free, …) that have been exploited for years led us to rule it out.

In 2019, for example, Microsoft was already reporting that around 70% of the vulnerabilities to which it assigned a CVE were due to memory corruption, and was recommending Rust for writing safe code.

In the end, the balance swung between Go and Rust, and we chose Rust for the following reasons:  

  • Better performance (no garbage collector, for example),
  • Better security (the compiler guarantees memory safety for code that stays outside unsafe blocks, which eliminates certain classes of vulnerabilities linked to memory corruption, etc.),
  • Stronger security ecosystem,
  • A more dynamic community around the language’s development and evolution.
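As an illustration of the memory-safety point, here is a minimal Rust sketch (the function name and the sample bytes are ours, for illustration). An out-of-range read yields `None` instead of touching adjacent memory, and a use after free does not even compile.

```rust
/// Reads the byte at `offset`. `get` returns `None` when the offset is out
/// of range instead of reading whatever sits next to the buffer in memory.
fn read_byte(buffer: &[u8], offset: usize) -> Option<u8> {
    buffer.get(offset).copied()
}

fn main() {
    // First bytes of a typical PE file header ("MZ" signature).
    let header: [u8; 4] = [0x4d, 0x5a, 0x90, 0x00];
    println!("{:?}", read_byte(&header, 1)); // prints "Some(90)"
    println!("{:?}", read_byte(&header, 64)); // prints "None"

    // Use after free is rejected at compile time. Uncommenting the three
    // lines below fails with error E0382 (borrow of moved value):
    // let name = String::from("agent");
    // drop(name);
    // println!("{}", name);
}
```

Plain indexing (`buffer[offset]`) is also bounds-checked: it panics on an invalid offset rather than silently corrupting memory.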

To sum up, here are the criteria we evaluated: 

  • C/C++: Performance +++ / Security - / Compatibility ++ / Adoption +++
  • Go: Performance ++ / Security ++ / Compatibility + / Adoption ++
  • Rust: Performance +++ / Security +++ / Compatibility ++ / Adoption ++

Although development in Rust is described as more difficult than in C or Go (the learning curve is steeper at first), what the language brought us later on confirmed our choice.

Python vs. Rust: the match 

We needed to go beyond the limits we’d seen, but also to have a language that was both memory-safe and thread-safe, with low-level access to operating system functionality.

If you’re familiar with Python, you might be thinking that this language meets all these criteria…  

[Figure: Python vs. Rust comparison]


But we had a few more criteria to meet!
 

In particular, Python doesn’t allow true multithreading (because of the Global Interpreter Lock, or GIL), nor does it give control over memory allocations (partly because of its garbage collector), which leads to greater RAM usage.
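By contrast, Rust threads are plain OS threads that can run on several cores at once. The sketch below uses only the standard library; the `score` function is a hypothetical stand-in for per-event analysis, not our real workload.

```rust
use std::thread;

/// Stand-in for per-event analysis work (hypothetical): sums the bytes.
fn score(event: &[u8]) -> u64 {
    event.iter().map(|&b| u64::from(b)).sum()
}

/// Splits the events into chunks and scores each chunk on its own OS thread.
fn score_in_parallel(events: &[Vec<u8>], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division, kept at 1 or more because `chunks` rejects a size of 0.
    let chunk_size = ((events.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        // Spawn every worker first, then join them, so they run concurrently.
        let handles: Vec<_> = events
            .chunks(chunk_size)
            .map(|chunk| scope.spawn(move || chunk.iter().map(|e| score(e)).sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}

fn main() {
    let events: Vec<Vec<u8>> = (0..8u8).map(|i| vec![i; 4]).collect();
    println!("total score: {}", score_in_parallel(&events, 4)); // prints "total score: 112"
}
```

The scoped threads borrow `events` directly, with no copy and no lock, and the compiler checks that the borrowed data outlives every worker.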

Rust is much more flexible: it lets us choose appropriate memory allocations and data structures, and also features a powerful type system that facilitates maintenance and helps check that a refactoring is correct.
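As a minimal sketch of what controlling allocations can look like, here is a fixed-capacity queue whose memory is reserved once. The type and its overflow policy (dropping the oldest entry) are hypothetical, chosen for illustration.

```rust
use std::collections::VecDeque;

/// Fixed-capacity queue: storage is reserved up front, and the oldest entry
/// is evicted when the queue is full (a hypothetical overflow policy).
struct BoundedQueue<T> {
    items: VecDeque<T>,
    capacity: usize,
}

impl<T> BoundedQueue<T> {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self {
            items: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Adds an item and returns the evicted one if the queue was full.
    fn push(&mut self, item: T) -> Option<T> {
        let evicted = if self.items.len() == self.capacity {
            self.items.pop_front()
        } else {
            None
        };
        self.items.push_back(item);
        evicted
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}

fn main() {
    let mut queue = BoundedQueue::new(2);
    queue.push("event-1");
    queue.push("event-2");
    let evicted = queue.push("event-3");
    println!("evicted: {:?}, len: {}", evicted, queue.len()); // prints evicted: Some("event-1"), len: 2
}
```

Because the number of stored items never exceeds the reserved capacity, memory usage stays bounded even during a burst of events.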

In terms of security, hardening is also a priority: we need to protect the agent against less common threats such as DLL hijacking, and an EDR needs to withstand an attacker who has managed to escalate privileges and gain administrator rights on a system.

To meet this need, Rust offers greater security thanks to static linking (whereas Python forces external dependencies), while leaving the door open to dynamic linking, as well as the possibility of deferring the loading of indirect dependencies (via Windows’ delay-loading mechanism, for example).
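As a rough sketch of how delay loading can be requested from a Rust project, the Cargo build script below emits `/DELAYLOAD` options for the MSVC linker. The DLL name is a placeholder and the helper function is ours; this illustrates the mechanism and is not our actual build configuration. On MSVC targets, the C runtime itself can be linked statically with the rustc flag `-C target-feature=+crt-static`.

```rust
// build.rs (sketch): emits linker directives for Cargo.
// Assumptions: the target uses the MSVC linker; the DLL list is hypothetical.

/// Builds the Cargo directives asking the MSVC linker to delay-load `dlls`.
fn delay_load_directives(dlls: &[&str]) -> Vec<String> {
    let mut directives: Vec<String> = dlls
        .iter()
        .map(|dll| format!("cargo:rustc-link-arg=/DELAYLOAD:{dll}"))
        .collect();
    if !directives.is_empty() {
        // The helper that resolves delayed imports on first use lives in delayimp.lib.
        directives.push(String::from("cargo:rustc-link-lib=delayimp"));
    }
    directives
}

fn main() {
    // Cargo sets CARGO_CFG_TARGET_ENV for build scripts ("msvc" on MSVC targets).
    let is_msvc = std::env::var("CARGO_CFG_TARGET_ENV")
        .map(|env| env == "msvc")
        .unwrap_or(false);
    if is_msvc {
        for directive in delay_load_directives(&["example.dll"]) {
            println!("{directive}");
        }
    }
}
```

A delay-loaded DLL is only resolved the first time one of its functions is called, which gives the program a chance to restrict where DLLs may be loaded from before that happens.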

[Figure: Python and Rust pros and cons]


Finally, beyond optimizing development, we were obviously thinking of our users. Our ambition was to offer them the best possible experience:
 

  • Improved performance, regardless of machine power,
  • Backward compatibility, especially for our customers who depend on the API data format,
  • Enhanced security (hardening),
  • Support for new OSes (macOS, Linux),
  • Revised platform design.

In light of our context and objectives, Rust was an obvious choice. 

Python vs. Rust: results and figures 

Without a doubt, there’s a before and an after to Rust: we’ve been able to add features, fine-tune how we handle the different OSes, and produce much more comprehensive documentation.

We started with the agent and backend in Python, which allowed a lot of code sharing.  

Now, the backend is still in Python, and the agent in Rust. How do we ensure communication between the two?  

We use gRPC to share the communication interface and abstract away protocol details, and Python bindings to call Rust code and improve performance on the backend too.

When Sigma detection rules are added, for example, they are now validated directly in the backend by the same Sigma engine as the agent. To achieve this, PyO3 (Rust bindings for Python) lets the backend and the agent share the same validation code, which also performs better than a validator written in Python.

What results have we observed since the agent was developed in Rust?  

First, our agent is much more stable, and we can integrate new features much more easily thanks to Rust’s refactoring tools and type system, which let us catch errors at compile time, before the code is deployed.
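Here is a small sketch of what catching errors before deployment means in practice. The event kinds come from the examples in this article; the engine names returned by `engine_for` are hypothetical.

```rust
/// Event kinds taken from the examples given earlier in the article.
#[derive(Debug, Clone, Copy)]
enum EventKind {
    ProcessCreation,
    NetworkConnection,
    FileAccess,
}

/// No wildcard arm on purpose: adding a variant to `EventKind` makes this
/// function fail to compile (error E0004, non-exhaustive patterns) until
/// the new case is handled.
fn engine_for(kind: EventKind) -> &'static str {
    match kind {
        EventKind::ProcessCreation => "process",
        EventKind::NetworkConnection => "network",
        EventKind::FileAccess => "file",
    }
}

fn main() {
    let kinds = [
        EventKind::ProcessCreation,
        EventKind::NetworkConnection,
        EventKind::FileAccess,
    ];
    for kind in kinds {
        println!("{:?} -> {}", kind, engine_for(kind));
    }
}
```

When a refactoring adds or renames a variant, the compiler lists every `match` that needs updating, so a forgotten case surfaces at build time rather than in production.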

In concrete terms, the first agent version for a new operating system took us around 3 months to develop, thanks to a feature base that we were able to share across all OSes!
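Sharing a feature base across OSes typically relies on conditional compilation: common logic is written once, and only a thin platform layer differs. The sketch below shows the mechanism with hypothetical function names and a made-up rule.

```rust
/// Shared logic, compiled identically for every operating system.
/// The rule itself is a made-up example.
fn is_suspicious_path(path: &str) -> bool {
    path.to_ascii_lowercase().contains("temp")
}

// Only the thin OS-specific layer differs; the compiler keeps exactly one
// of these definitions depending on the target.
#[cfg(target_os = "windows")]
fn platform_name() -> &'static str {
    "windows"
}

#[cfg(target_os = "macos")]
fn platform_name() -> &'static str {
    "macos"
}

#[cfg(target_os = "linux")]
fn platform_name() -> &'static str {
    "linux"
}

#[cfg(not(any(target_os = "windows", target_os = "macos", target_os = "linux")))]
fn platform_name() -> &'static str {
    "other"
}

fn main() {
    println!("running on {}", platform_name());
    println!("{}", is_suspicious_path("C:\\Windows\\Temp\\a.exe")); // prints "true"
}
```

Code excluded by a `cfg` attribute is not compiled at all, so each platform’s binary only contains what it needs.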

Finally, resource consumption is greatly reduced: for equivalent functionality, the Rust agent uses one third of the RAM and one third of the CPU that the Python agent did.

Thanks to this evolution, we were able to confidently add many features to our agent (AI engines, EPP, correlation engines, etc.) without overloading the machine.

What our customers say

“We were the first to express our interest in Rust, the development language adopted by HarfangLab’s EDR, and to measure its significant gains in the performance of our endpoints.”
Cybersecurity Manager – Industrial Group
