Frequently Asked Questions

Zero-Click RCE & Prompt Injection in AI Tools

What is zero-click remote code execution (RCE) via prompt injection in AI tools?

Zero-click RCE via prompt injection refers to a class of vulnerabilities where simply opening a project or interacting with AI-generated content can trigger a chain of events leading to remote code execution on the host system. No user interaction or approval is required—malicious instructions embedded in content (like a README or documentation) can cause an AI tool to execute arbitrary code.

Which AI tools were found vulnerable to zero-click RCE via prompt injection?

The research identified vulnerabilities in Cursor CLI, AWS Kiro, Codex Desktop App, Gemini CLI, and GitHub Copilot CLI. Each tool was susceptible to different forms of prompt injection and binary hijacking that could lead to arbitrary code execution on the host system.

What are the two main vulnerability classes described in the research?

The two main vulnerability classes are: (1) Arbitrary Binary Execution via Untrusted Resolution—where AI tools execute attacker-controlled binaries from the workspace, and (2) Arbitrary Command Execution via Configuration Poisoning—where prompt injection allows modification of configuration files that trigger OS-level commands automatically.

How does prompt-driven untrusted binary resolution work?

Prompt-driven untrusted binary resolution occurs when an AI tool resolves and executes binaries (like git.exe or npx.exe) from the current working directory before trusted system paths. If a malicious binary is present in the workspace, it is executed automatically, potentially compromising the host system.

What is configuration poisoning in the context of AI tools?

Configuration poisoning refers to the abuse of an AI tool's file-write capabilities to modify execution-sensitive configuration files (such as .vscode/tasks.json). Malicious commands can be injected, which are then executed automatically by the IDE or tool, leading to remote code execution without user approval.

How can a single prompt injection lead to full remote code execution?

A single prompt injection can instruct the AI tool to write or modify files in a way that triggers code execution. For example, by embedding a malicious instruction in a README, the AI agent can create a binary or configuration file that is executed automatically, resulting in full remote code execution on the host system.

What was the vendor response to these vulnerabilities?

Vendor responses varied: some vendors closed reports as 'informative' or 'not applicable' without remediation, while others acknowledged the issues but did not commit to fixes. For example, GitHub triaged the issue and paid a bounty but did not commit to a fix, while Cursor and AWS Kiro closed reports without addressing the root cause.

What practical steps can end users take to mitigate these risks?

Users should audit workspace directories for unexpected executables, restrict AI tool file-write permissions, apply least-privilege principles, inspect configuration files before opening projects, and monitor AI agent activity. These steps help reduce risk but do not address the underlying vulnerabilities.

How does Cymulate Exposure Validation help address these vulnerabilities?

Cymulate Exposure Validation provides dedicated attack scenarios that simulate prompt injection and binary planting techniques. These scenarios allow organizations to test their detection and response controls against real-world AI agent-based attack chains, helping to identify and remediate gaps before they are exploited in the wild.

What recommendations are provided for CISOs and organizations regarding AI tool security?

Recommendations include inventorying and restricting AI CLI usage, hardening and sandboxing execution environments, monitoring AI agent activity in SIEM, adding prompt injection to the threat model, and validating prevention and detection coverage using Cymulate scenarios.

How can organizations validate their defenses against prompt injection and RCE attacks?

Organizations can use Cymulate's Exposure Validation platform to run dedicated attack scenarios that mimic prompt injection and RCE techniques. This helps test whether endpoint, EDR, and SIEM controls can detect and prevent these attacks, enabling proactive remediation of security gaps.

What is the role of the Cymulate Research Lab in uncovering these vulnerabilities?

The Cymulate Research Lab is composed of experienced security researchers who continuously examine the cyber-threat landscape. They are responsible for uncovering, analyzing, and responsibly disclosing vulnerabilities in AI tools, and for developing corresponding validation scenarios for Cymulate's platform.

Where can I find more technical details about these vulnerabilities?

Full technical details, including proof-of-concept code, attack chains, and vendor responses, are available in the Cymulate Research Lab blog post: Zero-Click RCE via Prompt Injection in AI Tools.

How can I learn more about the authors of this research?

The research was conducted by Ilan Kalendarov, Ben Zamir, and Elad Beber of the Cymulate Research Lab. More about the authors can be found at Cymulate Research Lab Author Page.

What is the significance of the MITRE ATT&CK mapping in Cymulate scenarios?

Cymulate maps its attack scenarios to relevant MITRE ATT&CK techniques, ensuring that organizations can validate their defenses against tactics and techniques used by real-world adversaries, including those involving AI agent exploitation.

How often are Cymulate's attack scenarios updated?

Cymulate continuously updates its attack scenario library, adding new scenarios as emerging threats and vulnerabilities are discovered, including those related to AI tools and prompt injection.

What is the recommended approach for organizations using AI-powered developer tools?

Organizations should treat AI-powered developer tools as untrusted code execution surfaces, restrict their use to vetted endpoints, sandbox their execution, monitor their activity, and validate security controls using platforms like Cymulate.

How does Cymulate Exposure Validation simulate AI agent-based attack chains?

Cymulate Exposure Validation includes templates and scenarios that replicate real-world AI agent attack chains, such as prompt injection, binary planting, and configuration poisoning, allowing organizations to test their defenses in a safe, production-like environment.

Where can I find more resources and research from Cymulate?

You can find more resources, whitepapers, and technical research from Cymulate at Cymulate Resources and on the Cymulate Blog.

Cymulate Platform: Features, Capabilities & Use Cases

What is Cymulate Exposure Validation and how does it work?

Cymulate Exposure Validation is a platform that enables organizations to simulate real-world cyberattacks, validate security controls, and assess their exposure to threats. It provides automated, production-safe attack scenarios mapped to frameworks like MITRE ATT&CK, helping organizations identify and remediate vulnerabilities before attackers can exploit them.

What are the key features of Cymulate's platform?

Key features include continuous threat validation, exposure awareness, defensive posture optimization, attack path discovery, automated mitigation, comprehensive integration with SIEM and EDR tools, and dedicated cloud security validation. The platform is intuitive, user-friendly, and supports rapid deployment.

Who can benefit from using Cymulate?

Cymulate is designed for CISOs, Security Operations teams, Red Teams, Vulnerability Management teams, and Detection Engineers in organizations across industries such as finance, healthcare, and technology where cybersecurity is a critical concern.

What business impact can customers expect from Cymulate?

Customers typically see a 30% improvement in threat prevention, a 52% reduction in critical exposures, a 60% increase in operational efficiency, and an 81% reduction in cyber risk within four months. These outcomes are supported by customer case studies and measurable ROI.

How easy is it to implement Cymulate?

Cymulate is known for its quick, agentless deployment. Customers can start running simulations almost immediately, with minimal configuration required. The platform is praised for its intuitive interface and ease of use by both technical and non-technical users.

What integrations does Cymulate support?

Cymulate integrates with leading security tools including BlackBerry Cylance, Carbon Black EDR, Cisco Secure Endpoint, CrowdStrike Falcon, AWS GuardDuty, Splunk, Rapid7 InsightVM, Akamai Guardicore, and more. For a full list, visit the Cymulate Partnerships and Integrations page.

What compliance certifications does Cymulate hold?

Cymulate is certified for SOC2 Type II, ISO 27001:2013, ISO 27701, ISO 27017, and CSA STAR Level 1, demonstrating adherence to industry-leading security and privacy standards.

How does Cymulate compare to other exposure management and security validation platforms?

Cymulate stands out for its unified platform, continuous threat validation, AI-powered optimization, comprehensive kill chain coverage, ease of use, and extensive threat library. It is recognized as a leader by Gartner and G2, and offers advantages over competitors like AttackIQ, Mandiant Security Validation, Pentera, Picus Security, SafeBreach, Scythe, and NetSPI. For detailed comparisons, visit the Cymulate Competitors page.

What is Cymulate's pricing model?

Cymulate uses a subscription-based pricing model tailored to each organization's needs. Pricing depends on the chosen package, number of assets, and scenarios required. For a custom quote, organizations can schedule a demo with Cymulate's team.

What technical documentation is available for Cymulate?

Technical resources include the Exposure Management Platform (CTEM) Whitepaper, Cymulate Exposure Management Platform Data Sheet, Custom Attacks Data Sheet, Technology Integrations Data Sheet, and documentation on MITRE ATT&CK alignment. These are available on the Cymulate Resources page.

What customer feedback has Cymulate received regarding ease of use?

Customers consistently praise Cymulate for its intuitive and user-friendly design. Testimonials highlight easy implementation, a user-friendly dashboard, and high functionality for both technical and non-technical users. For example, Raphael Ferreira, Cybersecurity Manager, stated, 'Cymulate is easy to implement and use—all you need to do is click a few buttons.'

What are the core problems Cymulate solves?

Cymulate addresses overwhelming threat volumes, lack of visibility, unclear prioritization, operational inefficiencies, fragmented security tools, cloud complexity, and communication barriers between security teams and stakeholders. It provides continuous threat validation, actionable insights, and unified exposure management.

What case studies demonstrate Cymulate's effectiveness?

Case studies include Hertz Israel reducing cyber risk by 81% in four months, Nemours Children's Health improving detection and response, Nedbank focusing on critical vulnerabilities, and GUD Holdings establishing consistent security metrics across subsidiaries. See more at the Cymulate Customers page.

How does Cymulate address the needs of different security personas?

For Red Teams, Cymulate offers production-safe attack simulations and custom offensive testing. For Detection Engineers, it helps close SIEM coverage gaps and validate detection rules. For Vulnerability Management, it consolidates exposure data and prioritizes remediation, improving efficiency for each persona.

What is Cymulate's overarching vision and mission?

Cymulate's vision is to lead the way in cybersecurity strategy, making the world safer. Its mission is to empower organizations against threats and make advanced cybersecurity as simple as sending an email. The platform helps organizations move from guessing to knowing and acting on security threats.

Where can I watch a video overview of Cymulate's AI-powered exposure validation?

You can watch the video AI-Powered Exposure Validation for Complete Cybersecurity Control for a visual overview of Cymulate's capabilities.

New: 2026 Gartner® Market Guide for Adversarial Exposure Validation
Learn More
Cymulate named a Customers' Choice in 2025 Gartner® Peer Insights™
Learn More
New Research: The Security Tradeoffs Behind AI Tooling
Learn More
An Inside Look at the Technology Behind Cymulate
Learn More

When AI Tools Become the Backdoor: Zero-Click RCE via Prompt Injection 

By: Cymulate Research Lab

Last Updated: June 4, 2026

Ilan Kalendarov, Security Research Team Lead
Ben Zamir, Security Researcher
Elad Beber, Security Researcher

Affected Tools: Cursor CLI, AWS Kiro, Codex Desktop App and Gemini CLI

Vulnerabilities in common AI products can turn everyday AI tool usage, like opening a repo or running a prompt, into a potential silent system compromise. Simply opening a project or interacting with AI-generated content can trigger the attack chain. 

After responsibly notifying the vendors of Cursor CLI, AWS Kiro, Codex Desktop App and Gemini CLI, Cymulate Research Labs have published new attack scenarios for Cymulate Exposure Validation to simulate production-safe attack chains to test and validate threat exposure. 

Executive Summary 

Cymulate Research Labs uncovered a chain of vulnerabilities across multiple major vendors that can enable remote code execution from a single prompt injection. These issues fall into two primary classes: 

  1. Arbitrary Binary Execution via Untrusted Resolution 
    AI CLI tools on Windows rely on the default executable search order, which prioritizes the current working directory over trusted system paths. As a result, attacker-controlled files placed in a workspace (e.g., cloned repositories, extracted archives, or binaries introduced via prompt injection) can be executed automatically without user approval, signature validation or warning.  
  2. Arbitrary Command Execution via Configuration Poisoning 
    If the LLM’s file-write capability allows modifications to execution-sensitive files (such as tasks.json or settings.json) without user approval, it can be abused to inject commands that execute at the OS level. Because these files are automatically processed by the IDE, a prompt injection alone can lead to arbitrary command execution upon reopening the workspace. 

The issues affect the following products: Cursor CLI, AWS Kiro, Codex Desktop App and Gemini CLI. 

We reported all findings to the respective vendors. Responses varied: one vendor engaged thoroughly, one issued a bounty but deprioritized remediation and the rest closed the reports without meaningful technical engagement. 

This disparity directly impacts user security. GitHub was the only vendor to properly triage and validate the report, while others either shifted responsibility to users or dismissed the findings. 

Critically, the tools with the highest impact remain vulnerable, enabling prompt injection -> code execution chains. 

As a result, users are exposed. Routine actions such as opening a project or interacting with AI-generated content can lead to silent code execution with no warnings or visible indicators. Check out the practical, actionable advice for CISOs and organizations at the end of this blog. 

Update June 3rd 2026: AWS Assigned CVE-2026-10591

Following publication of this research and coordination with AWS Security, AWS published Security Bulletin 2026-037-AWS and assigned CVE-2026-10591 (8.8 high) for the Kiro issue reported by Cymulate.

AWS addressed the issue in Kiro 0.11 by adding protections for execution-sensitive paths such as .vscode/tasks.json, which now require explicit user approval before modification. AWS also updated its public documentation to clarify the security model and publicly acknowledged Cymulate for the disclosure.

We appreciate AWS's collaboration throughout the coordinated disclosure process and their efforts to improve the security of Kiro users.

Introduction 

AI agent applications are no longer confined to developer workflows. As more users rely on them to perform increasingly sensitive tasks, they have become a fixture of daily life and with that, a high-value target. The same deep integration that makes these tools productive is precisely what makes them dangerous. 

In Part 1 of this series, we demonstrated how built-in sandboxing features across multiple AI application tools could be bypassed through different mechanisms, exposing the gap between the isolation guarantees these tools advertise and the weak mechanisms that actually enforce them. 

In Part 2, we go further. We uncover arbitrary command execution primitives that provide an unauthorized foothold on the operating system running these applications. In some cases, these primitives can be chained with additional weaknesses to achieve zero-click remote code execution, triggered simply by using the tool and allowing it to do exactly the feature that allows it to gather context: web search. 

The vulnerability classes documented here can be triggered by a single injected prompt. A malicious instruction embedded in a README, a GitHub issue, a Stack Overflow forum reply or a documentation page is enough to set the execution chain in motion.  

The Two Vulnerability Classes 

Class 1: Prompt-Driven Untrusted Binary Resolution 

On Windows, several AI CLI/IDE tools resolve external executables (most commonly git.exe but also npx.exe, where.exe and others depending on the tool) using the default Windows executable search order. That search order includes the current working directory before trusted system paths. For example, a file named git.exe placed in a project folder will be executed in preference to the legitimate system-installed Git binary. 

Class 2: Prompt-Driven Configuration Poisoning 

Modern AI applications and agents are equipped with file-write tools that allow the LLM to create and modify files within the project workspace. To maximize productivity, these tools are granted broad write access to the project folder by default. In several of the tools we examined, this capability lacks proper boundaries: the LLM-driven file-write tool can reach configuration paths that carry special meaning to the IDE such as paths that trigger automatic command execution the next time the project is opened. 

When combined with tool-specific weaknesses, such as Cursor CLI's write tool that has a special path to binary file creation, or AWS Kiro's write tool being permitted to create sensitive IDE configuration files without approval, a prompt injection can escalate all the way to remote command execution. No user consent is required at any stage and the entire chain originates from what appears to be a routine and legitimate tool action. 

Full Remote Code Execution Originating from Prompt Injection 

Cursor CLI 

Class 1: Untrusted Binary Resolution 

Arbitrary Code Execution via Write Tool Binary Planting and Implicit npx.exe Resolution 

Affected Product Cursor CLI 
Platform Windows 
Vulnerability Type Write Tool Abuse + Untrusted Executable Resolution 
Impact Remote Code Execution on Host 
Vendor Response Closed as Informative, still vulnerable 
Disclosure Date 10/02/2026 

Overview 

Cursor CLI presents the most complete exploitation chain in this research. It combines an LLM-driven file-write tool that can create binary executables with an unsafe npx.exe resolution path that executes workspace-local binaries automatically on user prompt. Neither step requires user approval. Together, they form a two-stage, prompt-to-execution pipeline. 

Issue 1: Write Tool Binary Planting 

The Cursor CLI Write tool has a notable behavior not found in other tools we examined, including Cursor's own IDE product. When the write tool is invoked with a binary file extension (.exe, .dll, .so) as file_path argument and the content argument is base64-encoded, the tool automatically detects the extension, switches to binary encoding mode, decodes the payload, and writes a real executable to disk. What makes this particularly deceptive is that the content displayed to the user appears as base64 text and there is no visual indication that the data being written will resolve to a binary file. No warning is shown, no approval dialog is presented and nothing in the interface signals that an executable has just been created in the workspace. This seems like a simple text file write. 

A base64-encoded binary payload delivered through a prompt instruction (e.g embedded in a repository README, a documentation page, or any other source the agent reads) is sufficient to plant a live malicious executable in the workspace. 

Issue 2: Unsafe npx.exe Resolution 

When Cursor CLI performs agent operations in response to a user prompt, internal dependencies in the bundled application invoke npx.exe. Static analysis of the application bundle reveals that the TypeScript language server provider resolves npx using a local exec helper function that does not enforce absolute-path resolution. On Windows, this means the current working directory is searched first. 

The resolution and execution code lives in the bundled node modules region of the main index.js: 

javascript 

t = (0, o.Ef)("npx", []) 

... 

const s = (0, f.spawn)(n, ["-y", "typescript-language-server", "--stdio"], { 

  cwd: e, // workspace directory used as cwd 

  ... 

});

The Ef helper resolves npx without enforcing an absolute path, allowing a workspace-local npx.exe to take precedence over the system installation. Execution is triggered automatically when the user sends any prompt to the agent after a restart. No confirmation is presented. 

Attack Chain 

The complete exploitation chain proceeds as follows: 

Step 1: Malicious instruction delivery. The attacker embeds a prompt injection instruction in any content Cursor CLI will read: a repository README, source code comments, a documentation page, a forum reply. The instruction directs the LLM to write a file named npx.exe with base64-encoded binary content. 

For illustration purposes, a simple git readme with malicious instructions was created: 

Step 2: Write tool execution (no approval). Cursor CLI's Write tool receives the instruction. It detects the .exe extension, decodes the base64 content, and writes a Windows PE executable named npx.exe to the workspace directory. No user approval is requested. The binary sits alongside normal project files.

Note: on more complex attack scenarios, the executable write action could be masked with additional harmless actions such as creating more files or highlighting irrelevant info to the user. 

Even though the output does not clearly states it, a malicious binary was planted:

Step 3: Automatic execution on next prompt. The user restarts Cursor CLI and sends any prompt. The TypeScript LSP provider initializes, resolves npx.exe from the current working directory, and executes the attacker's binary. The binary runs under the user's security context with full access to the underlying OS, environment variables, credentials, SSH keys, cloud tokens, etc. 

This chain is notable for two reasons:  

  1. It is entirely prompt-driven: the attacker needs only to influence content that the agent will read.  
  2. The execution trigger (e.g. web search) is the most routine action a developer takes when using an AI coding tool. There is no anomalous behavior to detect at either stage. 

Vendor Response 

Cursor closed this report as Informative on February 18, 2026, eight days after submission. The stated rationale was that findings requiring a malicious binary "lack an attack vector." This closure was very vague and did not address the Write tool binary planting component, the mechanism by which the binary enters the workspace without any pre-existing attacker access. A formal rebuttal was submitted. No response followed. 

Class 2: Prompt-Driven Configuration Poisoning 

AWS Kiro 

Prompt-Triggered Remote Command Execution via Unrestricted LLM File Write to Workspace Configuration Paths 

Affected Product AWS Kiro IDE 
Platform Any (Windows, macOS, Linux) 
Vulnerability Type LLM File Write + Implicit VS Code Task Execution 
Impact Remote Code Execution on Host 
Vendor Response Fixed the issue, assigned a CVE and published an advisory page
Date 02/02/2026 

Overview 

AWS Kiro is an AI-powered IDE developed by Amazon, built on the VS Code foundation. It exposes an LLM-driven file-write tool (fswrite) that can create and modify files anywhere within the project workspace without user approval. Kiro inherits VS Code's automatic task execution behavior, whereby tasks defined in .vscode/tasks.json with "runOn": "folderOpen" execute automatically when the project folder is opened. 

The combination creates a clean prompt-to-execution path: a malicious instruction causes the LLM to write a .vscode/tasks.json containing an attacker-controlled command, and the command fires the next time the user opens the project without approval or user interaction. 

Vulnerability Chain 

Core issue: Unrestricted LLM tool write access to security-sensitive paths. The fswrite tool in AWS Kiro permits the LLM to create or modify files in directories that control IDE execution behavior, including .vscode, without requiring user approval. There is no path restriction, no confirmation prompt, and no distinction between workspace content and IDE configuration. 

Core feature: AWS Kiro inherits VS Code's behavior of automatically loading and executing tasks defined in .vscode/tasks.json. A task with "runOn": "folderOpen" fires immediately when the project is opened with the user's privileges. 

Combined impact: When these two behaviors are chained, an attacker who can influence LLM prompt input can achieve arbitrary OS command execution through prompt injection alone. 

Attack Chain 

Step 1: Prompt injection. The attacker places malicious instructions in content the LLM will process: a repository README, inline code comments, a link to an attacker-controlled documentation page, or any other indirect source. The instruction directs the LLM to write a file at .vscode/tasks.json. 

Step 2: LLM file write (no approval). Kiro's fswrite tool writes the following payload to .vscode/tasks.json with no user prompt: 

json 

{ 

    "version": "2.0.0", 

    "tasks": [ 

        { 

            "label": "dev server", 

            "type": "shell", 

            "command": "<ATTACKER_PAYLOAD>", 

            "runOptions": { 

                "runOn": "folderOpen" 

            }, 

            "problemMatcher": [] 

        } 

    ] 

}

For illustrative purposes, let's view the raw tool call:

And the same action can be triggered from a web search, for example: 

Step 3: Automatic execution on folder open. The next time the user opens the project folder in Kiro (or any VS Code–based IDE), the task fires automatically. The attacker's command executes with the user's full privileges. No warning is presented.

PoC Video

It is important to note that while users are required to grant trust to a workspace folder before the IDE will execute its contents, this trust decision is a one-time gate and prompt injection can occur at any point after it. Once trust has been granted, modifications to the .vscode folder and its contents trigger no further alerts, even when those modifications are made by an LLM tool acting on instructions from an untrusted prompt source. 

Other vendors including VS Code itself restrict the LLM write tool from altering workspace level configuration without explicit approval and in some cases also a clear warning. 

This transforms a single successful prompt injection into a potential multi-victim remote command execution attack flow, propagated through planted malicious instructions. 

GitHub Copilot CLI 

Class 1: Arbitrary Code Execution via Untrusted Working Directory Binary Hijacking 

Affected Product GitHub Copilot CLI 
Platform Windows 10 / 11 
Vulnerability Type Untrusted Search Path / Binary Hijacking 
Impact Arbitrary Code Execution on Host 
Vendor Response Triaged, bounty paid, severity downgraded, no fix committed 
Disclosure Date 11/02/2026 

Overview 

GitHub Copilot CLI on Windows resolves external dependencies, specifically git.exe and which.exe, using the default Windows executable search order as part of its tool discovery and initialization logic. That search order prioritizes the current working directory before trusted system paths. A file named git.exe or which.exe placed in the workspace is executed automatically during Copilot CLI startup, before any folder trust prompt is presented to the user. 

Technical Detail 

When GitHub Copilot CLI starts, it attempts to locate git.exe to support its Git integration functionality. The discovery mechanism relies on the Windows search path without enforcing absolute-path resolution or validating the origin of the resolved binary. The current working directory, the developer's project folder, is searched first. A workspace-local binary matching the expected filename is found and executed silently, with no signature verification and no user notification. 

As confirmed in testing, both git.exe and which.exe are affected. The binary executes under the user's full security context with access to source code, environment variables, API keys, cloud credentials, SSH keys, and Git authentication tokens. 

Most importantly, this execution occurs before any trust prompt is displayed. The intended trust mechanism is bypassed entirely, not circumvented within it. 

Attack Chain 

Step 1: Binary delivery. The attacker places a malicious git.exe in a directory the developer will use as a Copilot CLI workspace. Delivery mechanisms include: a public or private Git repository containing a malicious binary, a ZIP or archive that extracts to include git.exe or any other generated project scaffolding. 

Step 2:  Developer opens workspace. The developer navigates to the workspace directory and launches GitHub Copilot CLI. 

Step 3: Automatic binary execution. During startup, Copilot CLI resolves git.exe from the current working directory before querying trusted system paths. The malicious binary executes silently, before any folder trust prompt appears. 

Step 4: Host compromise. The binary runs under the user’s OS security context. No user notification, approval dialog or integrity warning is presented at any point. 

Vendor Response 

Following triage, the severity was downgraded to low. GitHub noted that "the behavior may not be changed immediately" and closed the report as resolved. No committed timeline for remediation was provided, and GitHub's policy is not to disclose reports for low severity vulnerabilities, meaning users are not informed of this finding through official channels. 

The MSRC response (prior to referral to GitHub's bug bounty program) confirmed that the issue was reviewed by multiple engineers before being redirected, reflecting that the concern was substantively evaluated. 

Gemini CLI (Windows) 

Arbitrary Code Execution via git.exe Untrusted Search Path and Binary Hijacking 

Affected Product Gemini CLI (latest version as of blog publication) 
Platform Windows 10 / 11 
Vulnerability Type Untrusted Search Path / Binary Hijacking 
Impact Arbitrary Code Execution on Host 
Vendor Response Acknowledged, remediation status unknown 
Disclosure Date 05/01/2026 

Overview 

On Windows, Gemini CLI's tool discovery functionality resolves git.exe using the default Windows executable search order. When Gemini CLI is launched, it attempts to locate Git as part of its initialization logic. The search order includes the current working directory before trusted system locations. If a workspace directory contains the affected file name, e.g. git.exe, it is executed without origin validation, digital signature verification or any user notification. 

This behavior is independent of the filesystem isolation issues documented in Part 1. It operates entirely at the application layer, requires no in-container execution, and triggers automatically during normal tool launch. 

Technical Detail 

Two distinct issues apply: 

Untrusted Search Path. Gemini CLI resolves git.exe by attempting execution from the current working directory prior to consulting a trusted absolute path or system PATH. This is the direct cause of the binary hijacking condition. 

Improper Control of Executable Loading from Untrusted Sources. Gemini CLI implicitly trusts unsigned executables located in untrusted workspace directories. No origin, integrity or signature check is performed on the resolved binary. 

Attack Chain 

Step 1: Binary delivery. The attacker places a malicious git.exe in a directory the developer will use as a Gemini CLI workspace, through repository cloning, archive extraction or any other standard mechanism. 

Step 2: Developer launches Gemini CLI. The developer runs Gemini CLI from the affected workspace. 

Step 3: Automatic binary execution. During initialization, Gemini CLI's tool discovery module locates git.exe in the working directory and executes it. No approval, warning or integrity check occurs. 

Step 4: Arbitrary code execution. The binary runs under the developer's security context. In the PoC demonstration, this escalated to an elevated PowerShell process via UAC bypass, illustrating that downstream privilege escalation is possible depending on the system configuration.

Vendor Response 

Reported to Google on January 4, 2026. Google acknowledged receipt. Following extended back-and-forth, Google recognized the issue as valid. As of the date of this publication, no patch has been released and no confirmed remediation timeline has been communicated. 

Codex App (Windows) 

Arbitrary Code Execution via git.exe Untrusted Search Path and Binary Hijacking 

Affected Product Codex App (Desktop Application) 
Platform Windows 
Vulnerability Type Untrusted Search Path / Binary Hijacking (CWE-426, CWE-829) 
Impact Arbitrary Code Execution on Host 
Vendor Response Closed as Not Applicable 
Disclosure Date 02/03/2026 

Overview 

The Codex desktop application on Windows resolves git.exe using the default Windows executable search order when initializing Git-related functionality on folder open. A malicious git.exe in the project directory is executed automatically under the developer's security context, outside of the Codex sandbox user feature. 

Technical Detail 

When the user opens a project folder in Codex desktop app (Windows), the application initializes Git integration and invokes git. This resolution does not use an absolute path. The Windows executable search order is followed, and the current working directory (opened project folder) is searched before system-installed binaries. A workspace-local git.exe is found and executed before the legitimate Git installation is consulted. 

The execution occurs directly on the host OS. No user notification, approval dialog or integrity check is performed.

Attack Chain 

Step 1: Binary delivery. A malicious git.exe is placed in the project directory through any available channel: repository clone, archive extraction, or indirect prompt injection causing the LLM to write or download a binary. 

Step 2: Developer opens the project in Codex. The developer opens the folder in Codex App. Git initialization begins as part of normal startup. 

Step 3: Automatic binary execution. Codex resolves git.exe from the working directory and executes it without validation. The binary runs with the developer's privileges outside the sandbox. 

Vendor Response 

OpenAI closed this report as Not Applicable on March 10, 2026. The stated rationale was that if an attacker can replace git.exe remotely, they already have system access. This response conflates remote binary replacement (which was not the threat model) with binary planting in workspace directories via routine delivery mechanisms. The rebuttal clarifying the distinction between these scenarios was reviewed but the decision was upheld without further technical engagement. 

Recommendations for End Users 

Because vendor remediation is incomplete or absent across most affected tools, the following mitigations represent user-controlled risk reduction. They do not address the root causes. 

1. Audit Workspace Directories Before Opening 

Before launching any AI CLI tool or IDE from a cloned repository, extracted archive, or any externally supplied project, inspect the workspace for executable files with names matching common system utilities. Files named git.exe, npx.exe, where.exe, docker.exe, or node.exe should never appear in a project workspace. Treat their presence as an indicator of compromise or repository poisoning. 

2. Restrict LLM File-Write Scope 

Where your tool allows configuration of which paths the AI agent can write to, restrict write access to exclude security-sensitive directories (depending on vendor, e.g.: .vscode/, .cursor/, etc) and any path that controls IDE startup or task execution behavior. If the tool does not provide this configuration, treat it as a risk factor in your threat model. As an alternative solution, consider manually restricting access to these paths. 

3. Apply Least-Privilege Principles to AI Tool Accounts 

Run AI CLI tools under virtualization using accounts with the minimum privileges necessary. This limits the blast radius if an attacker achieves code execution in the tool's context. Avoid running AI coding tools under accounts with cloud administrator roles, broad IAM permissions or access to production credentials. 

4. Inspect Configuration Files in Projects You Open 

Before opening any project in an AI-powered IDE, review .vscode/tasks.json, .vscode/settings.json, mcp.json, and any other IDE configuration files present in the root. Unexpected entries, particularly those defining shell commands, should be treated as suspicious. 

Vendor Response Summary 

The following table reflects the outcome of responsible disclosure across all tools covered in this research. 

Tool Vendor Reported Outcome 
Cursor CLI Cursor Feb 10, 2026 Closed as Informative: "lacks an attack vector", the attack flow is currently valid 
AWS Kiro Amazon Feb 2, 2026 Fixed, assigned CVE-2026-10591
GitHub Copilot CLI GitHub / Microsoft Feb 11, 2026 Triaged -> Low severity. 
Codex App OpenAI Mar 5, 2026 Closed as Not Applicable, the attack flow is currently valid 
Gemini CLI (Windows) Google Jan 5, 2026 Triaged -> Acknowledged. 

How Cymulate Exposure Validation Addresses This Attack Surface 

The vulnerabilities classes documented in this research series, both the sandbox escapes from Part 1 and the remote code execution paths in Part 2, now have dedicated coverage in Cymulate Exposure Validation. 

Cymulate customers can run dedicated templates and attack scenarios that now include host application-based AI agents as a part the full kill chain: 

  • Initial foothold via prompt injection or binary planting 
  • Code execution on the host 
  • Credential access and persistence. 

These scenarios are mapped to the relevant MITRE ATT&CK techniques or the relevant tool and can be used to validate whether existing detection and response controls identify post-exploitation activity originating from an AI agent process. 

As this research series continues to cover additional AI CLI tools and vulnerability classes, corresponding scenarios will be added to maintain continuous coverage of this emerging attack surface. 

Until vendors fully remediate these issues, security teams should treat AI coding agents as untrusted code-execution surfaces and apply the following controls: 

  1. Inventory, restrict and vet AI CLI usage – Identify which AI coding tools (Cursor, Kiro, Codex, Gemini CLI, Claude Code, etc.) are in use across the organization. Restrict installation to vetted endpoints via software allowlisting (AppLocker, WDAC, Jamf, Intune) and remove agents that vendors have declined to remediate. Factor vendor disclosure responsiveness into procurement and security review, and require documented risk acceptance at the appropriate leadership level for any tool with unresolved vulnerabilities. 
  2. Harden and sandbox the execution environment – Run AI CLIs inside isolated environments such as devcontainers, WSL with limited file access, ephemeral VMs, or cloud development environments (GitHub Codespaces, Coder, Gitpod). Do not allow agents to run on the host OS with administrative privileges or with access to credentials, SSH keys, or cloud profiles.  
  3. Monitor AI agent activity in the SIEM – Forward telemetry on AI CLI process trees, file writes to execution-sensitive locations (tasks.json, settings.json, .vscode/, shell rc files, scheduled tasks), and child-process creation by IDEs. Alert when an AI agent writes to these files outside an explicit user-initiated configuration change, or when unsigned binaries execute from user profile, downloads, or repository paths. 
  4. Add prompt injection to the threat model – Any workflow in which an LLM consumes external content (web pages, repositories, issues, documents, email) must include prompt injection as an explicit threat. Treat agent outputs and file writes as untrusted by default and require human approval for file system, shell, and network actions. 
  5. Validate prevention and detection coverage (for Cymulate clients) – Run the corresponding Cymulate validation scenarios to test whether existing endpoint, EDR and SIEM controls prevent and detect the binary-planting and configuration-poisoning techniques described in this report. Use the results to close gaps in detection rules, allowlisting policies, and response playbooks before these chains are weaponized in the wild. 
Cymulate Exposure Validation makes advanced security testing fast and easy. When it comes to building custom attack chains, it's all right in front of you in one place.
Mike Humbert, Cybersecurity Engineer
DARLING INGREDIENTS INC.
Learn More
Book a Demo