
Image by Solen Feyissa, from Unsplash
Major AI Agents Found Vulnerable to Hijacking, Study Finds
Some of the most widely used AI assistants from Microsoft, Google, OpenAI, and Salesforce can be hijacked by attackers with little or no user interaction, according to new research from Zenity Labs.
In a rush? Here are the quick facts:
- ChatGPT was hijacked to access connected Google Drive accounts.
- Microsoft Copilot Studio leaked CRM databases from over 3,000 agents.
- Google Gemini could be used to spread false information and phishing.
Presented at the Black Hat USA cybersecurity conference, the findings show that hackers could steal data, manipulate workflows, and even impersonate users. In some cases, attackers could gain “memory persistence,” allowing long-term access and control.
“They can manipulate instructions, poison knowledge sources, and completely alter the agent’s behavior,” Greg Zemlin, product marketing manager at Zenity Labs, told Cybersecurity Dive. “This opens the door to sabotage, operational disruption, and long-term misinformation, especially in environments where agents are trusted to make or support critical decisions.”
The researchers demonstrated full attack chains against several major enterprise AI platforms. In one case, OpenAI’s ChatGPT was hijacked through an email-based prompt injection, allowing access to connected Google Drive data.
Microsoft Copilot Studio was found leaking CRM databases, with more than 3,000 vulnerable agents identified online. Salesforce’s Einstein platform was manipulated to reroute customer communications to attacker-controlled email accounts.
Meanwhile, Google’s Gemini and Microsoft 365 Copilot could be transformed into insider threats, capable of stealing sensitive conversations and spreading false information.
Additionally, researchers were able to trick Google’s Gemini AI into controlling smart home devices. The hack turned off lights, opened shutters, and started a boiler without resident commands.
Zenity disclosed its findings, prompting some companies to issue patches. “We appreciate the work of Zenity in identifying and responsibly reporting these techniques,” a Microsoft spokesperson said to Cybersecurity Dive. Microsoft said the reported behavior “is no longer effective” and that Copilot agents have safeguards in place.
OpenAI confirmed it patched ChatGPT and runs a bug-bounty program. Salesforce said it fixed the reported issue. Google said it deployed “new, layered defenses” and stressed that “having a layered defense strategy against prompt injection attacks is crucial,” as reported by Cybersecurity Dive.
The report highlights rising security concerns as AI agents become more common in workplaces and are trusted to handle sensitive tasks.
In another recent investigation, it was reported that hackers can steal cryptocurrency from Web3 AI agents by planting fake memories that override normal safeguards.
The security flaw exists in ElizaOS and similar platforms because attackers can use compromised agents to transfer funds between different platforms. The permanent nature of blockchain transactions makes it impossible to retrieve stolen funds. A new tool, CrAIBench, aims to help developers strengthen defenses.