“We don’t see you in MITRE.”
“Your solution hasn’t even been benchmarked, so how can anyone know what you’re REALLY worth?”
“Anyway, it’s impossible for a French player to be as good as the Americans…”
… okay, okay, that’s enough …
On August 1st, 2022, a high-stakes meeting was convened at HarfangLab to settle two crucial questions: should we yield to the allure of fame and take part in the 2023 MITRE Engenuity evaluation? And could this journey lead to our downfall?
The decision-making process
To make the right decision, we had to pinpoint:
– Our clients: for those who place unwavering faith in MITRE or the Magic Quadrant, and for enterprises, regardless of their size, seeking to evaluate our detection capabilities with irrefutable evidence, the MITRE Evaluation stands as the ultimate confidence indicator;
– Our partners: they yearn to persuade their clients of our trustworthiness, substantiated by robust results;
– Our international ambitions: they hinge on our ability to establish ourselves as a European leader, a goal that requires taking part in such benchmarking exercises;
– The cost: it is substantial, particularly for a company that has consistently prioritized R&D investments since its founding;
– The time: the preparation and execution of tests require a substantial commitment, with resources already stretched thin by development, maintenance, and customer support obligations;
– The risk: if we obtain unfavorable results or, for any reason, fail to showcase our best performance, we cannot prevent the publication of these results.
The marketing team offered its perspective, the CTO spearheaded discussions, the CTI team meticulously weighed the pros and cons, and the CFO did the same. Heated debates unfolded, but ultimately, a unanimous decision was reached: in 2023, we would officially embark on our MITRE journey!
Assembling the task force
A task force was swiftly assembled, comprising the following key members:
– a project manager;
– the CTI team;
– DevOps, back-end, and front-end support.
In no time, the identity of the simulated attacker group was revealed: TURLA. TURLA, a long-established Russian-speaking cyber espionage group, has long been suspected of having ties to the Russian Federation’s intelligence services, as detailed by Sekoia.
Managing information overload
One might wonder if having prior knowledge of the planned scenarios could skew the analysis of the results: it’s important to note that MITRE’s primary objective is to assess detection capabilities, and these conditions have always been a part of the evaluation process. Paradoxically, having foreknowledge of the attacker’s identity presents its own set of challenges.
Indeed, the CTI team found themselves overwhelmed with a wealth of information regarding this attacker group, encompassing their tactics, tools, and practices. The task of identifying relevant sources, extracting critical insights, and translating them into actionable detection rules proved to be time-consuming.
To effectively manage this data overload, the team made a strategic decision to develop an application that would facilitate the tracking of tools used by TURLA, the techniques they employed, the elements to be tested, and the origins of the created rules.
With the help of this tool, we were able to quickly compile a comprehensive repository of documentation. The tactics employed by TURLA were translated into detection rules, enriching our existing detection patterns. Notably, these rules would serve not only the MITRE evaluation but also our clients, fortifying their cybersecurity defenses.
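As a rough illustration of what such a tracking application has to record, here is a minimal sketch in Python. It is not HarfangLab’s actual tool: the `TrackedEntry` fields, the helper functions, and the sample records (tool names, report names, rule names) are all hypothetical, and only the ATT&CK technique IDs are real identifiers.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrackedEntry:
    """One (tool, technique) pair to cover, with the origin of its rule."""
    tool: str                   # attacker tool (hypothetical names below)
    technique_id: str           # ATT&CK technique or sub-technique ID
    source: str                 # report the information came from
    rule: Optional[str] = None  # detection rule written for it, if any
    tested: bool = False        # whether the rule has been validated


def coverage_by_tool(entries):
    """Return {tool: (entries with a rule, total entries)}."""
    totals = defaultdict(lambda: [0, 0])
    for entry in entries:
        totals[entry.tool][1] += 1
        if entry.rule is not None:
            totals[entry.tool][0] += 1
    return {tool: (covered, total) for tool, (covered, total) in totals.items()}


def pending(entries):
    """Entries that still need a rule or still need to be tested."""
    return [e for e in entries if e.rule is None or not e.tested]


# Hypothetical sample records, for illustration only.
ENTRIES = [
    TrackedEntry("example-implant", "T1055", "report-a",
                 rule="process_injection_rule", tested=True),
    TrackedEntry("example-implant", "T1059.001", "report-b"),
    TrackedEntry("example-loader", "T1547.001", "report-a",
                 rule="run_key_rule"),
]

if __name__ == "__main__":
    print(coverage_by_tool(ENTRIES))
    print([e.technique_id for e in pending(ENTRIES)])
```

Keeping the source and the rule on the same record is what makes it possible to answer, at any time, where a rule came from and which techniques are still waiting for a rule or a test.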
Preparing for the unknown
By February, everything was ready: an instance had been provisioned and agents had been deployed within the test environment provided by MITRE. However, a significant challenge loomed for this inaugural attempt: we had NO idea how the evaluation process worked. We were the first of the 31 vendors to go through it, venturing into uncharted territory.
Come March 1st, 2023, the entire CTI team was assembled, cramming into a room far too small for comfort. We took any available seat, squeezed in wherever we could, fully aware that a unique and challenging experience was ahead. The Teams meeting started with the MITRE evaluation team providing an introductory briefing.
Finally, the shroud of mystery was lifted: the Red Team would run tests within their lab environment for approximately fifteen minutes, after which we would be asked to show which attacks we had detected and which we had missed.
During those intense 15 minutes, alerts flooded in, the console buzzed with activity, and the contours of the attack started to emerge. It was a strange feeling—never before had we been so relieved to see a successful attack. Subsequently, a barrage of questions rained down upon us:
“Could you demonstrate the evidence of this detection on your platform, please?” – This question would persist over the course of three days, and it was posed for nearly 150 distinct sub-techniques that needed to be identified.
“All hands on deck”: the tension in the room escalated. A first analyst swiftly identified an intriguing security event, while a second analyst pinpointed a more context-rich alert. The third analyst shared a link via Slack. The fourth analyst grabbed the microphone to confirm the veracity of our alert, all the while in the background, the rest of the team began tackling the remaining sub-techniques to present. Excitement was at its zenith, and in the midst of the action, we momentarily lost our bearings, scattering in every direction. It became clear that we needed to regroup.
With a collective breath, we halted everything and embarked on a process of restructuring. A decision was made: only one analyst’s screen would be displayed, and a designated spokesperson would represent us. The CTI team split into two groups: the first focusing on validating the current evaluation sub-technique and providing concrete detection evidence, while the second group preemptively prepared for the upcoming sub-techniques. This time, the discussions flowed seamlessly, and we embarked on a marathon of justification lasting a grueling nine hours.
As the first day concluded, so did the first scenario. Pizzas were hastily ordered, and we indulged in a well-deserved break. The initial assessment of the results from the first scenario was promising, fueling our anticipation for what challenges the second scenario might bring.
Seasoned veterans for the second scenario
As the second day dawned, we felt like seasoned veterans. The Red Team presented its scenario, and we meticulously dissected each element, scrutinizing every sub-step. We identified and classified what had been detected, distinguishing between telemetry, tactics, and techniques. After another grueling nine-hour marathon, the second scenario came to an end. Once again, the results appeared highly promising.
A third and final day was earmarked for the “re-run” of the tests. This time, MITRE’s Red Team performed both scenarios concurrently, giving us the opportunity to refine our detections and gain greater insight into the scenarios played. The primary objective was to assign tactic and technique tags, as these tags carry substantial weight in the evaluation. Although most of our rules already had these labels, the re-run allowed us to be more precise and explicit, a boon for all our clients.
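To illustrate why these tags matter, the sketch below ranks a detection by the most specific ATT&CK tag attached to it. It is a simplified assumption for illustration, not MITRE’s official scoring logic: the `Category` enum and the `classify` function are hypothetical names, and real evaluations rely on human review of the evidence rather than on tags alone.

```python
import re
from enum import IntEnum


class Category(IntEnum):
    """Simplified detection categories, from least to most precise."""
    NONE = 0
    TELEMETRY = 1
    TACTIC = 2
    TECHNIQUE = 3


# ATT&CK ID formats: techniques look like T1059 or T1059.001, tactics like TA0002.
TECHNIQUE_RE = re.compile(r"^T\d{4}(\.\d{3})?$")
TACTIC_RE = re.compile(r"^TA\d{4}$")


def classify(observed, tags):
    """Return the most precise category supported by the tags.

    observed: whether any event or alert was recorded for the sub-step.
    tags: ATT&CK IDs attached to the alert (may be empty).
    """
    if not observed:
        return Category.NONE
    if any(TECHNIQUE_RE.match(tag) for tag in tags):
        return Category.TECHNIQUE
    if any(TACTIC_RE.match(tag) for tag in tags):
        return Category.TACTIC
    # Something was recorded, but nothing says what it means.
    return Category.TELEMETRY


if __name__ == "__main__":
    print(classify(True, ["TA0002", "T1059.001"]).name)
    print(classify(True, ["TA0002"]).name)
    print(classify(True, []).name)
```

Under this simplified model, the same underlying event moves from Telemetry to Technique as soon as the rule that raised it carries an explicit technique tag, which is the kind of refinement the re-run made possible.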
After three days filled with screenshots, justifications, and evidence presentations, it seemed that we had reached the finish line in the MITRE tests. However, it wasn’t quite over. We still needed to meticulously review the 225 screenshots captured and ensure that each sub-technique had been thoroughly addressed by our counterparts, an arduous yet indispensable task.
The wait for results
We waited impatiently for months as all competitors underwent their testing, and MITRE diligently gathered evidence from each vendor. By the end of August, HarfangLab’s results were consolidated and known only to us, under embargo. Excellent news: they show that the CTI team successfully identified the aspects of TURLA needed to detect most of the tools, tactics, and techniques used by this group!
Furthermore, most of our results are labeled as Technique, the most precise detection category. This means that, beyond detecting the threat, the console gives the best possible visibility into it, a crucial reassurance for our users.
After another week of waiting, we also discovered our competitors’ results. Unsurprisingly, some American vendors showed more advanced detection: this is quite normal, considering that they were on their fourth participation, were experienced in the exercise, and already knew the rules of the game. Nevertheless, we managed to position ourselves among the best, which was a daunting prospect at the outset, given that MITRE is an exercise with its own rules, codes, and framework.
And what about next year? This time, there will be no debate among the CTO, the CTI team, marketing, or the CFO: we invite you to follow us in the next edition of MITRE!