BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents
security challenges that go beyond traditional web application threat models. Prior work has identified prompt injection as a new attack vector for web agents, yet the resulting impact within real
Agent Privilege Separation in OpenClaw: A Structural Defense Against Prompt Injection
Prompt injection remains one of the most practical attack vectors against LLM-integrated applications. We replicate the Microsoft LLMail-Inject benchmark (Greshake et al., 2024) against current generation models running
JSONalyzeQueryEngine` in the run-llama/llama_index repository allows for SQL injection via prompt injection. This can lead to arbitrary file creation and Denial-of-Service (DoS) attacks. The vulnerability affects
A Framework for Formalizing LLM Agent Security
executes a user task. Using this framework, we reformalize existing attacks, such as indirect prompt injection, direct prompt injection, jailbreak, task drift, and memory poisoning, as violations
AlignSentinel: Alignment-Aware Detection of Prompt Injection Attacks
Prompt injection attacks insert malicious instructions into an LLM's input to steer it toward an attacker-chosen task instead of the intended one. Existing detection defenses typically classify
Securing AI Agents Against Prompt Injection Attacks
used for enhancing large language model capabilities, but they introduce significant security vulnerabilities through prompt injection attacks. We present a comprehensive benchmark for evaluating prompt injection risks in RAG-enabled
Large Language Models for Cyber Security
paper studies the architecture and functioning of LLMs, its integration into Encrypted prompts to prevent prompt injection attacks. It also studies the integration of LLMs into cybersecurity tools using
ChatInject: Abusing Chat Templates for Prompt Injection in LLM Agents
environments has created new attack surfaces for adversarial manipulation. One major threat is indirect prompt injection, where attackers embed malicious instructions in external environment output, causing agents to interpret
Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks
domains, it creates an increasingly complex agent supply chain, offering new surfaces for prompt injection attacks. We identify skill-based prompt injection as a significant threat and introduce SkillInject
Prompt Injection Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching
Prompt injection remains a central obstacle to the safe deployment of large language models, particularly in multi-agent settings where intermediate outputs can propagate or amplify malicious instructions. Building
Prompt Control-Flow Integrity: A Priority-Aware Runtime Defense Against Prompt Injection in LLM Systems
models (LLMs) deployed behind APIs and retrieval-augmented generation (RAG) stacks are vulnerable to prompt injection attacks that may override system policies, subvert intended behavior, and induce unsafe outputs. Existing
Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?
agents are vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external content or tool outputs cause unintended or harmful behavior. Inspired by the well-established concept
VeriGrey: Greybox Agent Validation
behavior. As mutation operators in the testing process, we mutate prompts to design pernicious injection prompts. This is carefully accomplished by linking the task of the agent to an injection
Defense Against Indirect Prompt Injection via Tool Result Parsing
malicious instructions via prompt engineering. Despite their flexibility, most current prompt-based defenses suffer from high Attack Success Rates (ASR), demonstrating limited robustness against sophisticated injection attacks. In this paper
Unified Threat Detection and Mitigation Framework (UTDMF): Combating Prompt Injection, Deception, and Bias in Enterprise-Scale Transformers
rapid adoption of large language models (LLMs) in enterprise systems exposes vulnerabilities to prompt injection attacks, strategic deception, and biased outputs, threatening security, trust, and fairness. Extending our adversarial activation
Extended to Reality: Prompt Injection in 3D Environments
objects in the environment to override MLLMs' intended task. While prior work has studied prompt injection in the text domain and through digitally edited 2D images, it remains unclear
Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions
text to power applications, but this integration introduces new vulnerabilities. We study Image-based Prompt Injection (IPI), a black-box attack in which adversarial instructions are embedded into natural images
In-Browser LLM-Guided Fuzzing for Real-Time Prompt Injection Testing in Agentic AI Browsers
browsers) offer powerful automation of web tasks. However, they are vulnerable to indirect prompt injection attacks, where malicious instructions hidden in a webpage deceive the agent into unwanted actions. These
Prompt Injection as an Emerging Threat: Evaluating the Resilience of Large Language Models
while powerful, also makes them vulnerable to a new class of attacks known as prompt injection. In these attacks, hidden or malicious instructions are inserted into user inputs or external
ceLLMate: Sandboxing Browser AI Agents
across pages. While these agents help automate repetitive online tasks, they are vulnerable to prompt injection attacks that trick an agent into performing undesired actions, such as leaking private information