The Cymulate Research Team is tasked with providing attack simulations on demand for more than 600 customers. In doing so, we noticed a pattern: Most requests revolve around articles customers find about APTs, malware, or ransomware behaviors, and they come to Cymulate for help simulating these attacks.
Our researchers spent 20% of their time creating new content based on these customer requests. However, a significant 80% of their time was consumed by analyzing the articles, extracting crucial information, and building advanced scenario templates that align with the requests.
Given the rapid advancements in Large Language Model (LLM) technology, we saw a valuable opportunity to leverage this technology for our use case. The journey was complex, and the process itself was challenging. However, with strong motivation and dedication, we believe anything is possible.
From Internal Efficiency to Industry-first Dynamic Attack Planner
The dynamic attack planner in the Cymulate AI Copilot began as an internal initiative aimed at saving our research team time. Due to its remarkable results, we decided to make it public. The attack planner simplifies life for blue teams and security professionals by providing unprecedented ease in simulating attacks. It offers a significant advantage to our customers, enhancing their use of our product and Breach and Attack Simulation (BAS) capabilities in their daily routines. Whether it’s training blue teams for potential attack scenarios, optimizing SOC detection abilities, translating lengthy articles into actionable attack sequences, or testing the efficacy of their security controls, this tool makes it all possible with a high degree of efficiency.
Background on Cymulate Breach and Attack Simulation Advanced Scenarios
First, we need to introduce Cymulate and its capabilities. Cymulate Breach and Attack Simulation offers Advanced Scenarios. These automated assessment templates are designed to simulate complex, sophisticated cyberattacks, such as advanced persistent threats (APTs), with real-life attack chains that mirror the exact tactics and techniques used by attackers. These scenarios allow for highly customizable assessments, enabling users to tailor simulations to their specific needs to test how their security controls will respond.
Cymulate Advanced Scenarios has two main capabilities:
- Simulating atomic breach and attack simulations (let’s call them executions), and
- Chaining one attack simulation to another using a concept where the output of one execution becomes the input for the next (let’s call them Chained Executions); a minimal sketch of this idea follows the list.
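To make the chaining idea concrete, here is a hypothetical sketch of what a chained execution might look like as data. The field names and execution names are illustrative only, not Cymulate’s actual schema.

```python
# Hypothetical illustration of a chained execution: the output argument of one
# execution feeds the input argument of the next. Field names are illustrative.
chained_execution = [
    {"name": "Dump credentials from LSASS", "inputs": [], "outputs": ["credentials"]},
    {"name": "Lateral movement via WMI", "inputs": ["credentials", "target_host"], "outputs": ["remote_session"]},
]
```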
The content (executions and templates) is provided by our dedicated Cymulate Research Team, which constantly updates our attack simulations with the latest tactics, techniques and methods used by threat actors.
How It Works
Let’s dive into how the AI Attack Planner works – from data preparation to assembling chained executions into customer-ready security assessments.
Data Preparation: The first step in using AI is to get your data ready. Our execution database consists of thousands of attack methods, each with its own input and output arguments. Initially, we save the executions and their metadata (name, description, running code) into a database and run a process called “Enrichment.” This process adds another layer of metadata representing the actual action being performed by the execution. For example, the “Nmap: Ping Sweep Scan” execution is translated into four different enrichment action types (a rough sketch of this step follows the list):
- A scan is performed to identify active devices by sending pings to a range of IP addresses
- Active hosts are detected by pinging a range of IP addresses to see who responds
- Pings are sent to multiple IP addresses to find out which devices are online
- The system checks which IP addresses reply to pings to identify active hosts
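As a rough sketch, the enrichment step can be thought of as asking an LLM to restate each execution’s behavior as several plain-language action statements. The dataclass, the `llm_client` object and its `complete` method below are hypothetical stand-ins, not Cymulate’s actual code.

```python
# Sketch of the enrichment step; Execution fields and llm_client are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Execution:
    name: str
    description: str
    code: str
    enrichment_actions: list[str] = field(default_factory=list)

def enrich(execution: Execution, llm_client) -> Execution:
    """Add plain-language descriptions of the action the execution performs."""
    prompt = (
        "Describe the action performed by this attack execution in four short, "
        "behavior-focused sentences, one per line.\n"
        f"Name: {execution.name}\nDescription: {execution.description}"
    )
    response = llm_client.complete(prompt)  # hypothetical LLM client call
    execution.enrichment_actions = [line.strip() for line in response.splitlines() if line.strip()]
    return execution
```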
Embedding: In the next stage, we run a process called embedding, which uses the “ADA 2” encoder to create a vector representation of each execution’s name, description and enrichment actions. These embeddings are later used by our cosine-similarity-based search to match a specific action or use case to a list of potential executions that exhibit that behavior.
For example, “scan network for a host” translates to ten different executions that scan the current network for hosts. This method is common and used by many vendors for text similarity search tasks. During our research, we found that most user prompts/requests focus on an action rather than a specific product (e.g., “dump credentials” rather than “use Mimikatz to dump credentials”). Focusing on actions significantly improved our search capabilities and our ability to match use cases to executions.
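Under the assumption that “ADA 2” refers to OpenAI’s text-embedding-ada-002 model, the search step can be sketched roughly as follows, with a simple in-memory index standing in for whatever vector store the real pipeline uses.

```python
# Rough sketch of embedding + cosine-similarity search; assumes OpenAI's
# text-embedding-ada-002 ("ADA 2") and an in-memory index of execution vectors.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding)

def top_k_executions(use_case: str, execution_vectors: dict[str, np.ndarray], k: int = 10) -> list[str]:
    """Return the k execution names whose embeddings are most similar to the use case."""
    query = embed(use_case)
    scores = {
        name: float(np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec)))
        for name, vec in execution_vectors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# e.g. top_k_executions("scan network for a host", vectors) should surface scanning executions
```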
AI Ranking: After developing an accurate search engine, we recognized room for improvement and decided to use AI. With a use case (“use Nmap to scan the network”) and ten execution options, we use an LLM to rank each execution based on how well its behavior fits the use case.
This approach yielded amazing results and was a crucial breakthrough in this project.
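The ranking step can be approximated by prompting an LLM for a fit score per candidate, along the lines of the sketch below. The model name, scoring scale and prompt wording are assumptions for illustration, not the exact ones we use.

```python
# Illustrative LLM-based ranking of candidate executions against a use case.
from openai import OpenAI

client = OpenAI()

def rank_executions(use_case: str, candidates: list[dict]) -> list[tuple[str, int]]:
    """Score each candidate 1-10 for how well its behavior fits the use case."""
    scored = []
    for candidate in candidates:
        prompt = (
            f"Use case: {use_case}\n"
            f"Execution: {candidate['name']} - {candidate['description']}\n"
            "On a scale of 1 to 10, how well does this execution's behavior fit "
            "the use case? Reply with a single integer."
        )
        reply = client.chat.completions.create(
            model="gpt-4o",  # assumed model, not necessarily the one we use in production
            messages=[{"role": "user", "content": prompt}],
        )
        scored.append((candidate["name"], int(reply.choices[0].message.content.strip())))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```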
Chaining Executions: Now comes the interesting part: creating a set of chained executions that represent an attack scenario built from the identified use cases. Essentially, we mimic the logical sequence of actions a real attacker would perform. For example, to execute lateral movement, one would need credentials/tokens/tickets. If a user’s prompt is “create an attack simulation that performs lateral movement using WMI,” we assume the need for a basic execution of “lateral movement using WMI” and chain it with prerequisites like credential dumping (to obtain user credentials) and finding a host with an open WMI port.
To solve this problem, we created a concept called stories. From the previous section, you might remember we already had single use cases ready. Now we use them in story form, utilizing an LLM to create a set of mini stories with logical connections between steps. For example, with the prompt “create an attack simulation that performs lateral movement using WMI,” we would have a story representing “credential dumping followed by lateral movement using WMI.” The LLM adds credential dumping as a prerequisite, since lateral movement requires credentials.
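A story can be pictured as an ordered list of use cases with the prerequisites filled in. The structure and field names below are hypothetical, purely to illustrate the idea.

```python
# Hypothetical "story" produced from the user's prompt: an ordered list of
# use cases with logical prerequisites added by the LLM.
story = {
    "prompt": "create an attack simulation that performs lateral movement using WMI",
    "steps": [
        {"use_case": "dump credentials", "reason": "lateral movement requires valid credentials"},
        {"use_case": "find a host with an open WMI port", "reason": "a reachable WMI target is needed"},
        {"use_case": "lateral movement using WMI", "reason": "the action the user asked for"},
    ],
}
```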
We upload all our execution data into a graph database, define the initial credential dumping execution as the start node and the lateral movement execution as the end node, and then ask the graph to find the “shortest path” between them. Finally, we use an LLM again to rate how precisely the chained scenario matches its original use case, filtering out attack chains that completely mismatch the user’s original prompt. This approach also lets us tune how creative the AI is allowed to be, which is a neat feature.
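As a toy version of this step, the sketch below uses networkx in place of the graph database. Nodes are executions, an edge means one execution can feed the next, and the node names are made up for illustration.

```python
# Toy version of the chaining step, with networkx standing in for the graph database.
import networkx as nx

graph = nx.DiGraph()
# Illustrative executions; an edge means the first execution can lead into the second.
graph.add_edge("Dump credentials with LSASS read", "Find host with open WMI port")
graph.add_edge("Find host with open WMI port", "Lateral movement via WMI")
graph.add_edge("Dump credentials with LSASS read", "Pass-the-hash over SMB")

# Shortest chain from the prerequisite (start node) to the requested action (end node).
chain = nx.shortest_path(
    graph,
    source="Dump credentials with LSASS read",
    target="Lateral movement via WMI",
)
print(chain)
# ['Dump credentials with LSASS read', 'Find host with open WMI port', 'Lateral movement via WMI']
```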
Assembling Advanced Scenario Templates: The final step is to assemble advanced scenario templates consisting of atomic breach and attack simulations and chained simulations that match the user’s request.
TL;DR: When a user inputs a text/article/specific request for an attack, the attack planner’s LLM breaks it down into single attack-related use cases. For each use case, we find one or two suitable executions. We then use a concept called stories to create a logical flow between single use cases and use a graph database to find the shortest path from one use case to another.
Just Getting Started
This is only the beginning. Stay tuned to learn more about our future plans and exciting developments that are on the horizon.
The Cymulate AI Copilot is now in beta, with general availability expected within the next 30 days. To see the power of the most advanced security and exposure management platform and its new AI-powered assists, click here to request a demo.