Agent web browsers that leverage artificial intelligence (AI) capabilities to autonomously execute tasks across multiple websites on behalf of a user can be trained and tricked into falling into phishing and scam traps.
The attack, at its core, takes advantage of AI browsers’ tendency to rationalize their actions and uses it against models to circumvent their security guardrails, Guardio said in a report shared with The Hacker News ahead of publication.
Security researcher Shaked Chen said, “AI now works in real-time, inside disorganized and dynamic pages, while constantly requesting information, making decisions and describing its actions along the way. Well, ‘describe’ is quite an understatement – it babbles, and very much!”
“This is what we call agent nonsense: The AI browser highlights what it sees, what it believes is happening, what it plans to do next, and which signals it considers suspicious or safe.”
By intercepting this traffic between the browser and AI services running on the vendor’s servers and feeding it as input into a generative adversarial network (GAN), Guardio said it was able to make Perplexity’s Comet AI browser a victim of a phishing scam in less than four minutes.
This research builds on prior techniques such as Vibescamming and Scamlexity, which found that Vibe-coding platforms and AI browsers can be tricked into creating scam pages or carrying out malicious actions through hidden quick injections. In other words, when the AI agent handles tasks without constant human supervision, there is a change in the attack surface in which the scam no longer has to deceive the user. Rather, its goal is to trick the AI model itself.
“If you can see what the agent considers suspicious, what he hesitates about, and more importantly, what he thinks and thinks about the page, you can use that as a training signal,” Chen explains. “The scam evolves until the AI browser reliably falls into the trap of another AI set for it.”
In short, the idea is to create a “scamming machine” that adapts and reproduces the phishing page until the agent browser stops complaining and moves on to do the threat actor’s bidding, such as entering the victim’s credentials on a fake web page designed to carry out a refund scam.
What makes this attack interesting and dangerous is that once the fraudster replicates it on a web page until it works against a specific AI browser, it works on all users who trust the same agent. In other words, the target has shifted from the human user to the AI browser.
Guardio said, “This reflects the unfortunate situation we are facing in the near future: Scams will not just be launched and adjusted in the wild, they will be trained offline, against exactly the same models that millions of people trust, until they work flawlessly on first contact.” “Because when your AI tells the browser why it shut down, it teaches attackers how to bypass it.”
The disclosure came after Trail of Bits demonstrated four rapid injection techniques against the Comet browser, exploiting the browser’s AI assistant to extract users’ private information from services like Gmail and sending the data to the attacker’s servers when the user asks to summarize a web page under their control.
Last week, Zenity Labs also detailed two zero-click attacks affecting Perplexity’s Comet, which used indirect signal injection within meeting invitations to infiltrate local files to an external server (aka PerplexedComet) or hijack a user’s 1Password account when the password manager extension is installed and unlocked. The issues, collectively codenamed PerplexedBrowser, have since been addressed by the AI company.
This is achieved through a rapid injection technique known as intent collision, which occurs “when the agent merges a benign user request with attacker-controlled instructions from untrusted web data into an execution plan, without a reliable way to distinguish between the two,” said security researcher Stav Cohen.
Prompt injection attacks remain a fundamental security challenge for large language models (LLMs) and for integrating them into organizational workflows, primarily because it may not be possible to completely eliminate these vulnerabilities. In December 2025, OpenAI noted that such vulnerabilities are “unlikely” to be fully resolved in agentive browsers, although related risks can be mitigated through automated attack detection, adversarial training, and new system-level security measures.