For the purpose of testing an unsupervised anomaly detection algorithm, we need a dataset with both benign and malicious authentication activities. We already have access to benign data, but we lack malicious attack events.
The question we will try to address in this article is: Can we simulate malicious behavior by injecting data we know to be malicious into a benign dataset?
In this article, we focus on trial-and-error password attacks, although the proposed method also works for other types of attack.
There are two main types of trial-and-error attacks on passwords:
– Brute-force attacks: attempts to log on to a given account using several passwords entered one after the other. Passwords can be random or taken from a dictionary of commonly used passwords.
– Password spraying attacks: attempts to connect to several accounts using the same password. Once a password has been tested against all the accounts, the process starts again with the next password, and so on. This method avoids the account lockout that usually occurs when a single account is subjected to a brute-force attack.
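The difference between the two attack types comes down to the order in which (account, password) pairs are tried. The sketch below illustrates that ordering only; the function names and the sample accounts and passwords are hypothetical and do not come from any attack tool.

```python
def brute_force_attempts(account, passwords):
    """Yield (account, password) pairs: one account, many passwords in a row."""
    for password in passwords:
        yield (account, password)


def password_spraying_attempts(accounts, passwords):
    """Yield (account, password) pairs: each password is tried on every
    account before moving on to the next password."""
    for password in passwords:
        for account in accounts:
            yield (account, password)


# Hypothetical sample data, for illustration only.
accounts = ["alice", "bob"]
passwords = ["123456", "password"]

brute = list(brute_force_attempts("alice", passwords))
spray = list(password_spraying_attempts(accounts, passwords))
```

With brute force, consecutive failures pile up on a single account, which is what triggers lockout policies. With spraying, consecutive failures are spread across accounts.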
For the sake of simplicity, we will consider the modeling of a brute-force attack. However, the following method can also be applied to password spraying attacks, with minimal modifications.
Crafting malicious data and inserting them into ground truth benign data
In a previous article, we set up a lab of several virtual machines, all part of the same Active Directory, which allowed us to generate attacks in a controlled environment and collect the associated authentication logs.
Here, the problem is the opposite: we have plenty of ground truth benign data, but we cannot obtain true malicious data from the same endpoints, because that would mean attacking real machines used daily by real people (and finding people who both agree to be attacked while they work and can legally be attacked is quite hard).
Using what we know about the events generated by attacks carried out with tools like Hydra, Kerbrute, or CrackMapExec, we could craft these events ourselves and insert them into the benign events we have.
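As a rough illustration of the idea, the sketch below crafts a burst of failed-logon events and merges it into a benign stream by timestamp. It is a minimal sketch under stated assumptions: the event dictionaries use simplified, hypothetical field names (`timestamp`, `event_id`, `account`, `source_ip`, `label`) rather than the real Windows event schema, and the fixed interval between attempts is an arbitrary choice.

```python
import heapq
from datetime import datetime, timedelta


def craft_failed_logons(account, source_ip, start, count, interval_s=1.0):
    """Build `count` synthetic failed-logon (4625) events at fixed intervals.
    Field names are simplified placeholders, not the real event schema."""
    return [
        {
            "timestamp": start + timedelta(seconds=i * interval_s),
            "event_id": 4625,
            "account": account,
            "source_ip": source_ip,
            "label": "malicious",  # kept only to evaluate the detector later
        }
        for i in range(count)
    ]


def inject(benign_events, crafted_events):
    """Merge two timestamp-sorted event lists into one sorted stream."""
    return list(
        heapq.merge(benign_events, crafted_events, key=lambda e: e["timestamp"])
    )


# Hypothetical sample data, for illustration only.
t0 = datetime(2024, 1, 1, 9, 0, 0)
benign = [
    {"timestamp": t0, "event_id": 4624, "account": "alice",
     "source_ip": "10.0.0.5", "label": "benign"},
    {"timestamp": t0 + timedelta(seconds=10), "event_id": 4634, "account": "alice",
     "source_ip": "10.0.0.5", "label": "benign"},
]
attack = craft_failed_logons("alice", "10.0.0.66", t0 + timedelta(seconds=2), count=3)
mixed = inject(benign, attack)
```

Keeping a `label` field alongside each event means the injected events can be separated out again when scoring the anomaly detector. The label should not be fed to the unsupervised algorithm itself.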
What do brute-force authentication events look like?
Windows comes bundled with a very useful source of data called the Windows Event Log. A wide range of operations performed on the system are recorded in this log, where we can filter events on multiple attributes (names, dates, category, severity level, and so on). Each generated event has an ID specifying what the event is about, along with several ID-specific fields that give more information about its context.
Some of the most common authentication event IDs are 4624, 4625 and 4634:
– 4624: An account was successfully logged on.
– 4625: An account failed to log on.
– 4634: An account was logged off. This event shows that the logon session was terminated and no longer exists.
The Windows Event Log is very convenient for highlighting unusual activity, and it will be the source of data we work with.
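To make the filtering idea concrete, here is a minimal sketch that counts failed logons (event ID 4625) per account. It assumes events have already been exported and parsed into dictionaries with hypothetical `event_id` and `account` keys; it does not read the Windows Event Log directly.

```python
from collections import Counter

# Descriptions of the three event IDs listed above.
EVENT_DESCRIPTIONS = {
    4624: "An account was successfully logged on",
    4625: "An account failed to log on",
    4634: "An account was logged off",
}


def failed_logons_per_account(events):
    """Count 4625 (failed logon) events for each account."""
    return Counter(e["account"] for e in events if e["event_id"] == 4625)


# Hypothetical parsed events, for illustration only.
events = [
    {"event_id": 4624, "account": "alice"},
    {"event_id": 4625, "account": "alice"},
    {"event_id": 4625, "account": "alice"},
    {"event_id": 4625, "account": "bob"},
    {"event_id": 4625, "account": "alice"},
    {"event_id": 4634, "account": "alice"},
]
failures = failed_logons_per_account(events)
```

A high failure count concentrated on one account is the kind of signal a brute-force attack tends to leave, whereas spraying spreads a few failures over many accounts.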
As long as the activity we’re trying to detect is somehow linked to authentication processes, we will have events about it. Here is a simple example of a password spraying attack with Hydra, a login cracker that supports numerous protocols. The -u option makes Hydra loop over users rather than passwords, so each password is tried against every account before moving on to the next one:
hydra -L users.txt -P passwords.txt 192.168.0.1 ssh -u -V