Google Discloses Critical AI Agent Vulnerability Exploiting Prompt Injection

SAN FRANCISCO — Google disclosed a critical security vulnerability in its Antigravity AI agent manager on Sunday, confirming that the flaw allowed attackers to escape the system's sandbox and achieve remote code execution through prompt injection.

The vulnerability, identified by researchers from Pillar Security, combined a prompt injection attack with the tool's file-creation capability to bypass security controls. The researchers reported the issue to Google, which has since addressed the flaw in a security update.

Antigravity is an AI agent management system designed to automate complex tasks by allowing artificial intelligence models to interact with external tools and files. The system operates within a restricted sandbox environment intended to prevent unauthorized access to the host operating system. However, the researchers demonstrated that an attacker could manipulate the AI agent's input to trick it into creating malicious files within the sandbox. Once created, these files could be executed, allowing the attacker to break out of the sandbox and gain control over the host system.

The attack vector relied on the AI agent's ability to interpret and execute instructions based on user prompts. By crafting a specific prompt, an attacker could instruct the agent to write a malicious script to a file and then execute it. This sequence of actions bypassed the intended security boundaries, leading to remote code execution.

Google's security team acknowledged the severity of the issue and worked with Pillar Security to develop a patch. The update restricts the file-creation capabilities of the AI agent within the Antigravity environment, preventing the execution of unauthorized code. The company stated that the vulnerability was fixed before it could be exploited in the wild.

The disclosure highlights the growing security challenges associated with AI agents that have the ability to interact with external systems. As AI models become more capable and are integrated into more complex workflows, the potential for prompt injection attacks increases. Security experts warn that similar vulnerabilities could exist in other AI agent systems that allow file creation or system interaction.

Pillar Security researchers emphasized the importance of rigorous testing and security controls for AI systems. They noted that the vulnerability was found during routine security research and that responsible disclosure practices were followed throughout the process.

The incident has raised questions about the security architecture of AI agent managers and the need for more robust sandboxing mechanisms. While Google has addressed the specific vulnerability in Antigravity, the broader implications for AI security remain a concern for developers and security professionals.

Google has not provided details on whether any other systems are affected by similar vulnerabilities. The company is expected to release further guidance on securing AI agents in the coming weeks. Researchers continue to monitor the landscape for new threats as AI technology evolves.