Attack MEDIUM
Ali Raza, Gurang Gupta, Nikolay Matyunin +1 more
Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting. Large...
2 weeks ago cs.CR cs.AI cs.LG
Attack HIGH
Quanchen Zou, Moyang Chen, Zonghao Ying +6 more
Large Vision-Language Models (LVLMs) undergo safety alignment to suppress harmful content. However, current defenses predominantly target explicit...
Attack MEDIUM
Shaswata Mitra, Raj Patel, Sudip Mittal +2 more
Multi-agent systems (MAS) powered by LLMs promise adaptive, reasoning-driven enterprise workflows, yet granting agents autonomous control over tools,...
2 weeks ago cs.CR cs.MA cs.SE
Attack HIGH
Pratyay Kumar, Abu Saleh Md Tayeen, Satyajayant Misra +4 more
Deep learning (DL)-based Network Intrusion Detection Systems (NIDS) have demonstrated great promise in detecting malicious network traffic. However,...
2 weeks ago cs.CR cs.AI
Attack HIGH
David Fernandez, Pedram MohajerAnsari, Amir Salarpour +3 more
Vision-language models are emerging for autonomous driving, yet their robustness to physical adversarial attacks remains largely unexplored. This paper...
Attack MEDIUM
Alexander Erlei, Lukas Meub
As AI agents increasingly act on behalf of human stakeholders in economic settings, understanding their behavior in complex market environments...
Attack HIGH
Junxian Li, Tu Lan, Haozhen Tan +2 more
Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to...
2 weeks ago cs.CR cs.CL cs.CV
Attack HIGH
Yonghong Deng, Zhen Yang, Ping Jian +3 more
With the rapid advancement of large language models (LLMs), the safety of LLMs has become a critical concern. Despite significant efforts in safety...
2 weeks ago cs.AI cs.LG
Attack MEDIUM
Eduard Hirsch, Kristina Raab, Tobias J. Bauer +1 more
IT systems face an increasing number of security threats, including advanced persistent attacks and future quantum-computing vulnerabilities....
2 weeks ago cs.CR cs.IR
Attack HIGH
Jialai Wang, Ya Wen, Zhongmou Liu +4 more
Targeted bit-flip attacks (BFAs) exploit hardware faults to manipulate model parameters, posing a significant security threat. While prior work...
2 weeks ago cs.CR cs.AI
Attack HIGH
Ondřej Lukáš, Jihoon Shin, Emilia Rivas +6 more
Autonomous offensive agents often fail to transfer beyond the networks on which they are trained. We isolate a minimal but fundamental shift --...
2 weeks ago cs.CR cs.LG
Attack MEDIUM
Donghwa Kang, Hojun Choe, Doohyun Kim +2 more
Deploying deep neural networks (DNNs) on edge devices exposes valuable intellectual property to model-stealing attacks. While TEE-shielded DNN...
Attack HIGH
Jinman Wu, Yi Xie, Shiqian Zhao +1 more
Open-sourced large language models (OSLLMs) have demonstrated remarkable generative performance. However, as their structure and weights...
2 weeks ago cs.CR cs.AI
Attack MEDIUM
Anatoly Belikov, Ilya Fedotov
Large Language Models (LLMs) are increasingly served on shared accelerators where an adversary with read access to device memory can observe KV...
2 weeks ago cs.CR cs.LG
Attack HIGH
Yuanbo Li, Tianyang Xu, Cong Hu +3 more
The rapid progress of Multi-Modal Large Language Models (MLLMs) has significantly advanced downstream applications. However, this progress also...
Attack MEDIUM
Geraldin Nanfack, Eugene Belilovsky, Elvis Dohmatob
Safety-aligned language models refuse harmful requests through learned refusal behaviors encoded in their internal representations. Recent...
3 weeks ago cs.LG cs.AI
Attack LOW
Cameron Bell, Timothy Johnston, Antoine Luciano +1 more
Theoretical and applied research into privacy encompasses an incredibly broad swathe of differing approaches, emphases, and aims. This work introduces...
3 weeks ago math.ST cs.CR cs.LG
Attack MEDIUM
Yizhe Xie, Congcong Zhu, Xinyue Zhang +5 more
Large Language Model-based Multi-Agent Systems (LLM-MAS) are increasingly applied to complex collaborative scenarios. However, their collaborative...
3 weeks ago cs.MA cs.AI
Attack HIGH
Junchen Li, Chao Qi, Rongzheng Wang +5 more
Retrieval-Augmented Generation (RAG) enhances the capabilities of large language models (LLMs) by incorporating external knowledge, but its reliance...