Image generated with ChatGPT
OPINION: “Vibe Hacking”—The New AI-Powered Cybersecurity Threat
Experts have been warning about the dangers of “vibe hacking” for months, but no significant case had been officially reported until now. Anthropic released a report revealing how malicious actors have been using its advanced AI models to automate cyberattacks.
Tech companies have been promoting generative artificial intelligence as a panacea for everything from daily life to cybersecurity. But advanced technology always cuts both ways: AI tools can be used for good—or for harm. And in both cases, they tend to surprise us.
It was only a matter of time before hackers began exploiting the powerful AI models and AI-driven tools released this year.
A few months ago, everyone was talking about “vibe coding” and how new AI systems allowed people with no coding experience to build websites and apps simply by writing effective prompts.
Now, we are confronting its evil twin: “vibe hacking.” Cybercriminals with little knowledge of software development are building malicious tools capable of massive societal impact.
Anthropic shared its first report on “vibe hacking” in its Threat Intelligence Report: August 2025, revealing how malicious actors have misused its most advanced AI models for sophisticated criminal operations.
From generating personalized ransom notes for victims, to building Ransomware-as-a-Service (RaaS) platforms, to guiding hackers step-by-step through complex cyberattacks—this is what people need to know about “vibe hacking.”
What Is “Vibe Hacking”?
The term “vibe hacking” has been recently adopted to refer to a new threat tactic: malicious actors exploiting advanced AI models to carry out sophisticated, large-scale cyberattacks. Even without deep technical knowledge, hackers are managing to bypass security measures and use powerful AI agents to execute complex operations on their behalf.
In June, WIRED reported that “vibe hacking” was a growing concern among AI experts. Tools like WormGPT and FraudGPT—AI systems built without ethical guardrails—have been circulating since 2023 and are already in the hands of malicious actors.
Experts also noted that jailbreaking frontier AI models had become part of hackers’ daily routines. Still, the idea of AI-driven mass attacks remained hypothetical just months ago. “I compare it to being on an emergency landing on an aircraft where it’s like ‘brace, brace, brace,’ but we still have yet to impact anything,” said Hayden Smith, the cofounder of security company Hunted Labs, in an interview with WIRED.
Now, the plane has landed.
The Vibe Hacking Era Has Arrived
In its latest report, Anthropic revealed that a single hacker with only basic coding skills managed to target at least 17 organizations worldwide, including government agencies, healthcare providers, religious institutions, and even emergency services.
The attacker relied on Anthropic’s agentic coding tool, Claude Code, to carry out the campaign. The AI system advised on which data to exfiltrate, drafted extortion messages, and even suggested ransom demands—sometimes recommending amounts of more than $500,000.
“The actor used AI to what we believe is an unprecedented degree,” wrote Anthropic in its announcement last week. “This represents an evolution in AI-assisted cybercrime.”
Autonomous AI Hacker Assistants
Cybercriminals have been experimenting with AI for years. What makes “vibe hacking” different is that the technology is now doing most of the hard work for them.
Anthropic’s investigation revealed that malicious actors have been using Claude Code in several ways: developing malware, guiding attackers step by step during live operations, organizing and analyzing massive troves of stolen data, and even automating extortion messages tailored to each victim’s vulnerabilities.
In one case, a user in the United Kingdom managed to get Claude to build software—and not just any software, but a commercial ransomware product. The AI model generated a RaaS platform designed to help the user sell ransomware through forums such as CryptBB, Dread, and Nulled, all known for enabling illegal activity.
The most shocking part? The user didn’t seem to fully understand what they were doing, as they frequently requested assistance from Anthropic’s AI system.
“The operation encompasses the development of multiple ransomware variants featuring ChaCha20 encryption, anti-EDR techniques, and Windows internals exploitation,” the report states. “Most concerning is the actor’s apparent dependency on AI—they appear unable to implement complex technical components or troubleshoot issues without AI assistance, yet are selling capable malware.”
Tasks that once required teams of skilled hackers months—or even years—to complete are now being handled by AI models, which can assist a lone cybercriminal through every stage of the process.
AI Systems Manipulated And Used As Weapons
The harmful impact of AI models on humans has already become a serious and urgent concern in recent months, from AI-linked psychosis and suicides to growing patterns of addiction. But while much attention has been focused on how AI harms people, far less has been paid to the reverse: how people can manipulate AI models and, in turn, use them to harm others.
A few days ago, researchers from the University of Pennsylvania published a study revealing that AI models are alarmingly vulnerable to persuasion and flattery. They found that models such as OpenAI’s GPT-4o mini can fall prey to social engineering tactics and exhibit “para-human” behavior—meaning that, because they are trained on human behavior, they also replicate human weaknesses when it comes to manipulation.
In the experiments, GPT-4o mini gave in to the same classic persuasion principles that work on humans, complying with requests it is supposed to refuse and revealing information that remained off-limits when asked through more conventional prompts.
Anthropic, for its part, did not disclose the specific prompts hackers used to jailbreak its AI agent, nor did it detail exactly how the system was manipulated into assisting sophisticated cyberattacks. Still, recent studies suggest these models may be far more vulnerable than most people assume. With luck, the vulnerabilities Anthropic has now documented will no longer be exploitable.
From Writing An Essay To Hacking International Organizations
Remember when the biggest worry about chatbots was students using them to cheat on essays? Well, the new era of AI misuse has officially arrived—one where these models can be weaponized for malicious activities with far greater impact.
Bad actors are now using AI models as co-pilots for sophisticated cyberattacks—no technical expertise required.
Anthropic has assured the public that it patched the vulnerabilities, reduced the risks, and strengthened safety measures to prevent similar abuse. Yet the company also admitted it cannot anticipate every way future users might exploit its models, or how other AI systems might be misused. The risk will always be there.
“This isn’t just Claude,” said an Anthropic employee in the video accompanying the announcement of the new vibe-hacking threat. “This is all LLMs presumably.”
We are still in the early stages of grappling with vibe hacking, and with each passing minute the risk of the trend spreading seems to grow. Some experts suggest the solution lies in deploying more AI for defense and pouring every effort into mitigation. But is that strategy really sustainable in the long term? A war of AIs against AIs seems to be beginning.