AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total

2,077

Attack

809

Benchmark

603

Defense

272

Tool

226

Survey

113

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 181–200 of 226 papers

Clear filters

Tool HIGH

Sentra-Guard: A Multilingual Human-AI Framework for Real-Time Defense Against Adversarial LLM Jailbreaks

Md. Mehedi Hasan, Ziaur Rahman, Rafid Mostafiz +1 more

This paper presents a real-time modular defense system named Sentra-Guard. The system detects and mitigates jailbreak and prompt injection attacks...

5 months ago cs.CR cs.AI PDF

Tool MEDIUM

SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots

Adetayo Adebimpe, Helmut Neukirchen, Thomas Welsh

Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. Maximising attacker...

5 months ago cs.CR cs.CL cs.LG PDF

Tool MEDIUM

A Reinforcement Learning Framework for Robust and Secure LLM Watermarking

Li An, Yujian Liu, Yepeng Liu +3 more

Watermarking has emerged as a promising solution for tracing and authenticating text generated by large language models (LLMs). A common approach to...

5 months ago cs.CR PDF

Tool MEDIUM

Adversarially-Aware Architecture Design for Robust Medical AI Systems

Alyssa Gerhart, Balaji Iyangar

Adversarial attacks pose a severe risk to AI systems used in healthcare, capable of misleading models into dangerous misclassifications that can...

5 months ago cs.LG cs.CR PDF

Tool LOW

LLM-Augmented Symbolic NLU System for More Reliable Continuous Causal Statement Interpretation

Xin Lian, Kenneth D. Forbus

Despite the broad applicability of large language models (LLMs), their reliance on probabilistic inference makes them vulnerable to errors such as...

5 months ago cs.CL cs.AI PDF

Tool MEDIUM

AegisMCP: Online Graph Intrusion Detection for Tool-Augmented LLMs on Edge Devices

Zhonghao Zhan, Amir Al Sadi, Krinos Li +1 more

In this work, we study security of Model Context Protocol (MCP) agent toolchains and their applications in smart homes. We introduce AegisMCP, a...

5 months ago cs.CR PDF

Tool MEDIUM

OpenGuardrails: A Configurable, Unified, and Scalable Guardrails Platform for Large Language Models

Thomas Wang, Haowen Li

As large language models (LLMs) are increasingly integrated into real-world applications, ensuring their safety, robustness, and privacy compliance...

5 months ago cs.CR cs.CL PDF

Tool HIGH

HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models

Sidhant Narula, Javad Rafiei Asl, Mohammad Ghasemigol +2 more

Large Language Models (LLMs) remain vulnerable to multi-turn jailbreak attacks. We introduce HarmNet, a modular framework comprising ThoughtNet, a...

5 months ago cs.CR cs.AI PDF

Tool HIGH

The Trust Paradox in LLM-Based Multi-Agent Systems: When Collaboration Becomes a Security Vulnerability

Zijie Xu, Minfeng Qi, Shiqing Wu +4 more

Multi-agent systems powered by large language models are advancing rapidly, yet the tension between mutual trust and security remains underexplored....

5 months ago cs.CR PDF

Tool HIGH

VERA-V: Variational Inference Framework for Jailbreaking Vision-Language Models

Qilin Liao, Anamika Lochab, Ruqi Zhang

Vision-Language Models (VLMs) extend large language models with visual reasoning, but their multimodal design also introduces new, underexplored...

5 months ago cs.CR cs.CL cs.CV PDF

Tool MEDIUM

Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems

Rishi Jha, Harold Triedman, Justin Wagle +1 more

Control-flow hijacking attacks manipulate orchestration mechanisms in multi-agent systems into performing unsafe actions that compromise the system...

5 months ago cs.LG cs.CR eess.SY PDF

Tool MEDIUM

When AI Takes the Wheel: Security Analysis of Framework-Constrained Program Generation

Yue Liu, Zhenchang Xing, Shidong Pan +1 more

In recent years, the AI wave has grown rapidly in software development. Even novice developers can now design and generate complex...

5 months ago cs.SE cs.CR PDF

Tool MEDIUM

Toward Understanding Security Issues in the Model Context Protocol Ecosystem

Xiaofan Li, Xing Gao

The Model Context Protocol (MCP) is an emerging open standard that enables AI-powered applications to interact with external tools through structured...

5 months ago cs.CR cs.AI PDF

Tool HIGH

Prompt injections as a tool for preserving identity in GAI image descriptions

Kate Glazko, Jennifer Mankoff

Generative AI risks such as bias and lack of representation impact people who do not interact directly with GAI systems, but whose content does:...

5 months ago cs.CR cs.CY PDF

Tool LOW

MirrorFuzz: Leveraging LLM and Shared Bugs for Deep Learning Framework APIs Fuzzing

Shiwen Ou, Yuwei Li, Lu Yu +6 more

Deep learning (DL) frameworks serve as the backbone for a wide range of artificial intelligence applications. However, bugs within DL frameworks can...

5 months ago cs.SE cs.CR PDF

Tool HIGH

Active Honeypot Guardrail System: Probing and Confirming Multi-Turn LLM Jailbreaks

ChenYu Wu, Yi Wang, Yang Liao

Large language models (LLMs) are increasingly vulnerable to multi-turn jailbreak attacks, where adversaries iteratively elicit harmful behaviors that...

5 months ago cs.CR cs.AI PDF

Tool HIGH

A Hard-Label Black-Box Evasion Attack against ML-based Malicious Traffic Detection Systems

Zixuan Liu, Yi Zhao, Zhuotao Liu +4 more

Machine Learning (ML)-based malicious traffic detection is a promising security paradigm. It outperforms rule-based traditional detection by...

5 months ago cs.CR PDF

Tool MEDIUM

Formalizing the Safety, Security, and Functional Properties of Agentic AI Systems

Edoardo Allegrini, Ananth Shreekumar, Z. Berkay Celik

Agentic AI systems, which leverage multiple autonomous agents and Large Language Models (LLMs), are increasingly used to address complex, multi-step...

5 months ago cs.AI cs.CR cs.MA PDF

Tool MEDIUM

Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training

Yisen Wang, Yichuan Mo, Hongjun Wang +2 more

Despite the rapid progress of neural networks, they remain highly vulnerable to adversarial examples, for which adversarial training (AT) is...

5 months ago cs.LG cs.AI cs.CR PDF

Tool MEDIUM

Protect: Towards Robust Guardrailing Stack for Trustworthy Enterprise LLM Systems

Karthik Avinash, Nikhil Pareek, Rishav Hada

The increasing deployment of Large Language Models (LLMs) across enterprise and mission-critical domains has underscored the urgent need for robust...

5 months ago cs.CL cs.AI PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial