Introduction to OpenClaw AI Agent
The OpenClaw AI agent framework is an open-source platform that lets large language models (LLMs) interact with real-world systems and act autonomously. It can be deployed as an email agent that performs basic reasoning and operations on a mailbox.
Phishing Simulation on OpenClaw AI Agent
Researchers at security firm Varonis created an OpenClaw agent and connected it to a Gmail inbox, browser tools, Google Workspace APIs, and fabricated internal company data sources. They instructed the agent to monitor and process incoming emails. The environment contained synthetic enterprise data such as AWS credentials, database credentials, CRM exports, internal communications, and calendar invites.
The agent was tested in two configurations: a generic one with standard productivity instructions, and a strict one with specific instructions for phishing awareness and identity verification. The framework was tested with two models: Google Gemini 3.1 Pro and OpenAI GPT-5.4.
Simulated Phishing Attacks
The researchers ran four simulated phishing attacks, with mixed results:
- The attacker impersonated a team lead and requested access to the staging environment during a purported production issue. The agent located AWS IAM keys, database credentials, and SSH access details, and emailed them to an external Gmail account.
- The attacker requested a customer export under the pretext of working remotely on a presentation. The agent retrieved and sent a CRM export containing customer records, contact information, contract details, and revenue data without verifying the sender's identity.
- The agent received a fake gift card email containing a phishing link. Under the generic configuration, it visited the phishing site and attempted to redeem the gift card using fabricated credentials before eventually identifying the page as malicious.
- The researchers created a malicious Google OAuth application disguised as a timesheet platform. The agent inspected the OAuth flow, analyzed the destination, identified the application as suspicious, and refused to grant access.
Conclusion and Recommendations
Varonis concluded that AI agents are good at detecting suspicious URLs, identifying fake login pages, spotting malicious OAuth apps, and recognizing phishing indicators. They may still fail, however, because of missing identity verification, loss of context, and an inability to apply “zero trust” principles to social interactions.
Varonis recommends that agents be explicitly required to verify sender identities, be prevented from emailing new external recipients without approval, and be given only limited access to internal data. High-risk actions such as credential sharing, financial data requests, and first-time communications should require human approval.
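The recommendations above can be pictured as a check that runs before the agent is allowed to send mail. The following is a minimal Python sketch under stated assumptions, not part of OpenClaw or the Varonis report: the names `OutboundEmail`, `SendPolicy`, and `review`, the regular expressions, and the decision strings are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, deliberately small set of patterns for credential-like text.
# A real deployment would rely on a proper data-loss-prevention classifier.
SENSITIVE_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # shape of an AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"(?i)\b(password|passwd|secret)\s*[:=]"),
]


@dataclass
class OutboundEmail:
    recipient: str
    body: str
    attachments: list = field(default_factory=list)


@dataclass
class SendPolicy:
    internal_domain: str
    # Addresses the organization has already approved; assumed to be lowercase.
    known_recipients: set = field(default_factory=set)

    def review(self, email: OutboundEmail):
        """Return (decision, reasons); any reason forces human approval."""
        reasons = []
        recipient = email.recipient.strip().lower()
        domain = recipient.rpartition("@")[2]
        external = domain != self.internal_domain.lower()

        if external and recipient not in self.known_recipients:
            reasons.append("first-time external recipient")
        if any(p.search(email.body) for p in SENSITIVE_PATTERNS):
            reasons.append("body appears to contain credentials")
        if external and email.attachments:
            reasons.append("attachment sent outside the organization")

        decision = "needs_human_approval" if reasons else "allow"
        return decision, reasons
```

With this sketch, a reply carrying a credential-like string to an unfamiliar Gmail address is held for approval on two counts, while a plain message to a colleague on the internal domain is allowed. The gate deliberately sits outside the model's prompt, so it would still apply if the agent were talked into complying.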
“Varonis Threat Labs explored whether the same phishing techniques that have tricked humans for decades would also work on the AI agents working on their behalf,” reads the report.
The test results showed that strict mode also failed despite its additional safeguards, because the framework did not validate the sender’s identity. At the model level, Gemini showed greater willingness to interact, while GPT-5.4 took a more cautious posture.
Implications and Future Directions
The study highlights the importance of testing AI agents for vulnerabilities and implementing robust security measures to prevent phishing attacks. As AI agents become more prevalent in enterprise environments, it is crucial to ensure that they are designed and configured to withstand phishing attacks and protect sensitive user data.
Source: BleepingComputer