Indirect Prompt Injection Attack Lets Hackers Manipulate AI Agents via Hidden Web Content
Cybersecurity researchers have warned about a growing threat called Indirect Prompt Injection (IDPI), a technique that allows attackers to manipulate AI agents through hidden instructions embedded in normal web content. Unlike direct prompt injection—where a user intentionally inputs malicious commands—IDPI works silently in the background. Attackers hide instructions inside webpage elements such as HTML code, metadata, comments, or invisible text. When an AI tool processes that page for tasks like summarizing content or reviewing ads, it may unknowingly follow those hidden commands.


Researchers from Unit 42 confirmed that IDPI attacks are already occurring in real-world environments. Their analysis identified 22 different techniques used to create malicious payloads and revealed new attacker objectives, including the first known case of IDPI being used to bypass an AI-based advertisement review system. These attacks can lead to serious consequences, such as promoting phishing sites through SEO poisoning, exposing sensitive information, triggering unauthorized financial transactions, or even executing destructive server-side commands.


To avoid detection, attackers use multiple concealment methods like embedding commands in webpage footers, hiding them within HTML attributes, or making them invisible using CSS techniques such as zero font size or off-screen placement. Social engineering is also widely used to jailbreak AI systems by framing instructions as developer or administrator commands. Security experts recommend treating external web content as untrusted input, implementing strict validation, limiting AI privileges, and requiring user approval for sensitive actions to reduce the risk of such attacks.
NPAV offers a robust solution to combat cyber fraud. Protect yourself with our top-tier security product, FraudProtector.net