Google's Antigravity AI Agent Had a Sandbox-Escape Flaw Enabling Remote Code Execution

A Patched Flaw in Google's Antigravity Exposed the Limits of AI Agent Sandboxing

As enterprises increasingly adopt agentic AI systems within their business and IT infrastructure, security researchers are finding that these tools can dramatically expand an organization's attack surface. The latest example comes from Pillar Security, whose researchers this week disclosed a vulnerability in Antigravity, a Google-built, AI-powered developer tool designed to handle filesystem operations. The flaw, which has since been patched, allowed attackers to chain prompt injection with Antigravity's built-in file-creation capabilities to achieve remote code execution.

What Antigravity's Secure Mode Is Supposed to Do

At the heart of the issue is Antigravity's Secure Mode, which Google markets as its highest security configuration for the agent. When enabled, Secure Mode routes all command operations through a virtual sandbox environment, restricts network access, and prevents the agent from writing code outside of its designated working directory. The intent is to limit the agent's access to sensitive systems and block it from executing malicious or dangerous shell commands.

In practice, however, one critical component slipped through those controls entirely.

The 'find_by_name' Loophole

The vulnerability stemmed from one of Antigravity's file-searching utilities, a tool called "find_by_name." This particular tool is classified as a native system tool, meaning the agent can invoke it directly — and critically, it can do so before protections like Secure Mode have any opportunity to evaluate command-level operations.

Dan Lisichkin, an AI security researcher at Pillar Security, explained the implications clearly in his write-up:

"The security boundary that Secure Mode enforces simply never sees this call. This means an attacker achieves arbitrary code execution under the exact configuration a security-conscious user would rely on to prevent it."

In other words, the very setting users would trust to keep them safe was the setting under which the attack was fully operational.

How Prompt Injection Enables the Attack

Exploiting the vulnerability did not require elevated access or sophisticated intrusion techniques. The attack vector is prompt injection — a method by which malicious instructions are embedded in content that the AI agent reads and processes.

Antigravity's core weakness in this regard is its inability to reliably distinguish between data it ingests for contextual understanding and literal executable instructions. As a result, attackers can compromise the agent simply by getting it to read a malicious document or file. There are multiple delivery mechanisms for such an attack:

Compromised identity accounts that are already connected to the agent
Malicious instructions hidden inside open-source files the agent accesses
Clandestine prompt directives embedded within web content the agent ingests

None of these approaches require the attacker to have any privileged foothold in the target environment beforehand.

Disclosure Timeline and Bug Bounty

According to the disclosure timeline provided by Pillar Security, the vulnerability was reported to Google on January 6 and subsequently patched on February 28. Google awarded a bug bounty to the researchers for the discovery.

A Broader Pattern Across AI Coding Agents

Lisichkin noted that this type of exploit — prompt injection through unvalidated input — is not unique to Antigravity. He pointed out that the same pattern has been identified in other AI coding agents, including Cursor. The underlying issue is systemic: any unvalidated input in an agentic AI system can potentially become a malicious prompt capable of hijacking internal operations.

This raises serious questions about the trust assumptions embedded in current AI security architectures. As Lisichkin wrote:

"The trust model underpinning security assumptions, that a human will catch something suspicious, does not hold when autonomous agents follow instructions from external content."

Rethinking Security for the Agentic AI Era

The fact that this vulnerability was able to fully bypass Google's Secure Mode — the company's top-tier protection setting — is a signal that the cybersecurity industry needs to move well beyond sanitization-based controls when it comes to AI agents.

Lisichkin was direct about what this means for developers and security teams shipping agentic features:

"Every native tool parameter that reaches a shell command is a potential injection point. Auditing for this class of vulnerability is no longer optional, and it is a prerequisite for shipping agentic features safely."

As organizations continue deploying AI agents for increasingly sensitive tasks, incidents like this one serve as a reminder that conventional security perimeters and trust models must be fundamentally reassessed. The automation that makes these tools powerful is the same characteristic that makes unchecked input so dangerous.

Source: CyberScoop

Source: CyberScoop

Google's Antigravity AI Agent Had a Sandbox-Escape Flaw Enabling Remote Code Execution

A Patched Flaw in Google's Antigravity Exposed the Limits of AI Agent Sandboxing

What Antigravity's Secure Mode Is Supposed to Do

The 'find_by_name' Loophole

How Prompt Injection Enables the Attack

Disclosure Timeline and Bug Bounty

A Broader Pattern Across AI Coding Agents

Rethinking Security for the Agentic AI Era

Powered by ZeroBot

Related Articles

CVE-2026-1354 & CVE-2025-70994: Zero Motorcycles and Yadea Scooters Exposed to Physical Safety Hacking Risks

CrowdStrike and Tenable Patch Critical and High-Severity Flaws in Key Security Products

More Than 10,000 Zimbra Servers Remain Unpatched Amid Active XSS Exploitation