Search: prompt injection | AI Threat Intelligence

Severity:

298 results in 116ms

Paper 2510.05244v1

2025-10-06

Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?

agents are vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external content or tool outputs cause unintended or harmful behavior. Inspired by the well-established concept

high relevance benchmark

Paper 2510.09023v1

2025-10-10

The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections

should we evaluate the robustness of language model defenses? Current defenses against jailbreaks and prompt injections (which aim to prevent an attacker from eliciting harmful knowledge or remotely triggering malicious

high relevance attack

Paper 2601.04666v1

2026-01-08

Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning

model (LLM)-integrated applications have become increasingly prevalent, yet face critical security vulnerabilities from prompt injection (PI) attacks. Defending against PI attacks faces two major issues: malicious instructions

high relevance attack

Paper 2603.10749v1

2026-03-11

AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations

agents are highly vulnerable to Indirect Prompt Injection (IPI), where adversaries embed malicious directives in untrusted tool outputs to hijack execution. Most existing defenses treat IPI as an input-level

high relevance tool

Paper 2602.07918v1

2026-02-08

CausalArmor: Efficient Indirect Prompt Injection Guardrails via Causal Attribution

agents equipped with tool-calling capabilities are susceptible to Indirect Prompt Injection (IPI) attacks. In this attack scenario, malicious commands hidden within untrusted content trick the agent into performing unauthorized

high relevance attack

Paper 2602.01795v1

2026-02-02

RedVisor: Reasoning-Aware Prompt Injection Defense via Zero-Copy KV Cache Reuse

Large Language Models (LLMs) are increasingly vulnerable to Prompt Injection (PI) attacks, where adversarial instructions hidden within retrieved contexts hijack the model's execution flow. Current defenses typically face

high relevance attack

Paper 2602.22242v1

2026-02-24

Analysis of LLMs Against Prompt Injection and Jailbreak Attacks

same time, LLMs are vulnerable to prompt-based attacks. Thus, analyzing this risk has become a critical security requirement. This work evaluates prompt-injection and jailbreak vulnerability using a large

high relevance attack

Paper 2512.05745v1

2025-12-05

ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior

Multimodal Large Language Models (MLLMs) are increasingly vulnerable to multimodal Indirect Prompt Injection (IPI) attacks, which embed malicious instructions in images, videos, or audio to hijack model behavior. Existing defenses

high relevance attack

Paper 2602.20708v1

2026-02-24

ICON: Indirect Prompt Injection Defense for Agents based on Inference-Time Correction

Large Language Model (LLM) agents are susceptible to Indirect Prompt Injection (IPI) attacks, where malicious instructions in retrieved content hijack the agent's execution. Existing defenses typically rely on strict

high relevance attack

Paper 2512.16307v1

2025-12-18

Beyond the Benchmark: Innovative Defenses Against Prompt Injection Attacks

fast-evolving area of LLMs, our paper discusses the significant security risk presented by prompt injection attacks. It focuses on small open-sourced models, specifically the LLaMA family of models

high relevance benchmark

Paper 2510.04261v1

2025-10-05

VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy

integrated applications? To address this question, we propose \textsc{VortexPIA}, a novel indirect prompt injection attack that induces privacy extraction in LLM-integrated applications under black-box settings. By injecting

high relevance attack

CVE CRITICAL CVE-2023-29374

2023-04-05

LangChain through 0.0.131, the LLMMathChain chain allows prompt injection attacks that can execute arbitrary code via the Python exec method

CVSS 9.8 langchain View details

Paper 2512.01326v1

2025-12-01

Securing Large Language Models (LLMs) from Prompt Injection Attacks

increasingly being deployed in real-world applications, but their flexibility exposes them to prompt injection attacks. These attacks leverage the model's instruction-following ability to make it perform malicious

high relevance attack

Paper 2601.19051v1

2026-01-27

Proactive Hardening of LLM Defenses with HASTE

enhance detection efficacy for prompt-based attack techniques. The framework is agnostic to synthetic data generation methods, and can be generalized to evaluate prompt-injection detection efficacy, with and without

medium relevance defense

Paper 2510.09093v1

2025-10-10

Exploiting Web Search Tools of AI Agents for Data Exfiltration

functionality and vulnerability to abuse. As LLMs increasingly interact with external data sources, indirect prompt injection emerges as a critical and evolving attack vector, enabling adversaries to exploit models through

high relevance tool

Paper 2510.09849v1

2025-10-10

Text Prompt Injection of Vision Language Models

vision language models has significantly raised safety concerns. In this project, we investigate text prompt injection, a simple yet effective method to mislead these models. We developed an algorithm

high relevance attack

Paper 2512.10104v2

2025-12-10

Phishing Email Detection Using Large Language Models

based framework to detect phishing email attacks across multiple attack vectors, including prompt injection, text refinement, and multilingual attacks. We evaluate three frontier LLMs (e.g., GPT-4o, Claude Sonnet

medium relevance defense

Paper 2602.20064v1

2026-02-23

The LLMbda Calculus: AI Agents, Conversations, and Information Flow

prompt, and parses the response as a new term. This calculus faithfully represents planner loops and their vulnerabilities, including the mechanisms by which prompt injection alters subsequent computation. The semantics

medium relevance attack

Paper 2602.22724v1

2026-02-26

AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification

retrieval systems to autonomously complete complex tasks. However, this design exposes agents to indirect prompt injection (IPI), where attacker-controlled context embedded in tool outputs or retrieved content silently steers

high relevance attack

Paper 2510.08829v1

2025-10-09

CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization

access to numerous tools and sensitive data significantly widens the attack surface for indirect prompt injections. Due to the context-dependent nature of attacks, however, current defenses are often

medium relevance benchmark

Previous Page 5 of 15 Next