AI Security Research

2,104+ academic papers on AI security, attacks, and defenses

Total: 2,104
Attack: 820
Benchmark: 609
Defense: 276
Tool: 229
Survey: 116

Showing 1221–1240 of 2,104 papers

Attack HIGH

Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs

Yinan Zhong, Qianhao Miao, Yanjiao Chen +3 more

Large Language Models (LLMs) have been integrated into many applications (e.g., web agents) to perform more sophisticated tasks. However,...

3 months ago cs.CR PDF

Other MEDIUM

USCSA: Evolution-Aware Security Analysis for Proxy-Based Upgradeable Smart Contracts

Xiaoqi Li, Lei Xie, Wenkai Li +1 more

In the case of upgrading smart contracts on blockchain systems, it is essential to consider the continuity of upgrades and subsequent maintenance. In...

3 months ago cs.CR PDF

Survey MEDIUM

Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem

Shiva Gaire, Srijan Gyawali, Saroj Mishra +3 more

The Model Context Protocol (MCP) has emerged as the de facto standard for connecting Large Language Models (LLMs) to external data and tools,...

3 months ago cs.CR cs.AI PDF

Attack HIGH

MIRAGE: Misleading Retrieval-Augmented Generation via Black-box and Query-agnostic Poisoning Attacks

Tailun Chen, Yu He, Yan Wang +9 more

Retrieval-Augmented Generation (RAG) systems enhance LLMs with external knowledge but introduce a critical attack surface: corpus poisoning. While...

3 months ago cs.CR PDF

Attack HIGH

How a Bit Becomes a Story: Semantic Steering via Differentiable Fault Injection

Zafaryab Haider, Md Hafizur Rahman, Shane Moeykens +2 more

Hard-to-detect hardware bit flips, from either malicious circuitry or bugs, have already been shown to make transformers vulnerable in non-generative...

3 months ago cs.LG cs.AI PDF

Benchmark MEDIUM

Secure or Suspect? Investigating Package Hallucinations of Shell Command in Original and Quantized LLMs

Md Nazmul Haque, Elizabeth Lin, Lawrence Arkoh +2 more

Large Language Models for code (LLMs4Code) are increasingly used to generate software artifacts, including library and package recommendations in...

3 months ago cs.SE PDF

Tool HIGH

A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties

Jinghao Wang, Ping Zhang, Carter Yagemann

Medical Large Language Models (LLMs) are increasingly deployed for clinical decision support across diverse specialties, yet systematic evaluation of...

3 months ago cs.CR cs.AI PDF

Attack LOW

Universal Adversarial Suffixes for Language Models Using Reinforcement Learning with Calibrated Reward

Sampriti Soor, Suklav Ghosh, Arijit Sur

Language models are vulnerable to short adversarial suffixes that can reliably alter predictions. Previous works usually find such suffixes with...

3 months ago cs.CL PDF

Attack HIGH

Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies

Stephan Carney, Soham Hans, Sofia Hirschmann +4 more

Adversaries (hackers) attempting to infiltrate networks frequently face uncertainty in their operational environments. This research explores the...

3 months ago cs.CR cs.HC PDF

Benchmark MEDIUM

An Adaptive Multi-Layered Honeynet Architecture for Threat Behavior Analysis via Deep Learning

Lukas Johannes Möller

The escalating sophistication and variety of cyber threats have rendered static honeypots inadequate, necessitating adaptive, intelligence-driven...

3 months ago cs.CR cs.DC cs.LG PDF

Benchmark MEDIUM

Auditing Games for Sandbagging

Jordan Taylor, Sid Black, Dillon Bowen +10 more

Future AI systems could conceal their capabilities ('sandbagging') during evaluations, potentially misleading developers and auditors. We...

3 months ago cs.AI PDF

Attack HIGH

TROJail: Trajectory-Level Optimization for Multi-Turn Large Language Model Jailbreaks with Process Rewards

Xiqiao Xiong, Ouxiang Li, Zhuo Liu +5 more

Large language models have seen widespread adoption, yet they remain vulnerable to multi-turn jailbreak attacks, threatening their safe deployment....

3 months ago cs.AI cs.LG PDF

Benchmark LOW

SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

Sangha Park, Seungryong Yoo, Jisoo Mok +1 more

Although Multimodal Large Language Models (MLLMs) have advanced substantially, they remain vulnerable to object hallucination caused by language...

3 months ago cs.CV cs.AI PDF

Benchmark LOW

Privacy Practices of Browser Agents

Alisha Ukani, Hamed Haddadi, Ali Shahin Shamsabadi +1 more

This paper presents a systematic evaluation of the privacy behaviors and attributes of eight recent, popular browser agents. Browser agents are...

3 months ago cs.CR PDF

Benchmark MEDIUM

How Do LLMs Fail In Agentic Scenarios? A Qualitative Analysis of Success and Failure Scenarios of Various LLMs in Agentic Simulations

JV Roig

We investigate how large language models (LLMs) fail when operating as autonomous agents with tool-use capabilities. Using the Kamiwaza Agentic Merit...

3 months ago cs.AI cs.SE PDF

Defense LOW

Amulet: Fast TEE-Shielded Inference for On-Device Model Protection

Zikai Mao, Lingchen Zhao, Lei Xu +4 more

On-device machine learning (ML) introduces new security concerns about model privacy. Storing valuable trained ML models on user devices exposes them...

3 months ago cs.CR PDF

Attack LOW

AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing

Ziming Hong, Tianyu Huang, Runnan Chen +4 more

Recent studies have extended diffusion-based instruction-driven 2D image editing pipelines to 3D Gaussian Splatting (3DGS), enabling faithful...

3 months ago cs.CV cs.CR cs.LG PDF

Benchmark MEDIUM

Pay Less Attention to Function Words for Free Robustness of Vision-Language Models

Qiwei Tian, Chenhao Lin, Zhengyu Zhao +1 more

To address the trade-off between robustness and performance for robust VLM, we observe that function words could incur vulnerability of VLMs against...

3 months ago cs.LG cs.CL PDF

Attack HIGH

Response-Based Knowledge Distillation for Multilingual Jailbreak Prevention Unwittingly Compromises Safety

Max Zhang, Derek Liu, Kai Zhang +2 more

Large language models (LLMs) are increasingly deployed worldwide, yet their safety alignment remains predominantly English-centric. This allows for...

3 months ago cs.CL PDF

Tool MEDIUM

Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models

Fenghua Weng, Chaochao Lu, Xia Hu +2 more

As multimodal reasoning improves the overall capabilities of Large Vision Language Models (LVLMs), recent studies have begun to explore...

3 months ago cs.CV cs.CL PDF
