Adversarial Defense in Vision-Language Models: An Overview
Xiaowei Fu, Lei Zhang
The widespread use of Vision Language Models (VLMs, e.g. CLIP) has raised concerns about their vulnerability to sophisticated and imperceptible...
Lirui Zhang, Huishuai Zhang
As LLMs rapidly advance and enter real-world use, their privacy implications are increasingly important. We study an authorship de-anonymization...
Huanyi Ye, Jiale Guo, Ziyao Liu +1 more
RAG has emerged as a key technique for enhancing the response quality of LLMs without high computational cost. In traditional architectures, RAG services...
Yixuan Du, Chenxiao Yu, Haoyan Xu +3 more
Vision-Language Models (VLMs) are rapidly replacing unimodal encoders in modern retrieval and recommendation systems. While their capabilities are...
Xiaomei Zhang, Zhaoxi Zhang, Leo Yu Zhang +3 more
Visual token compression is widely adopted to improve the inference efficiency of Large Vision-Language Models (LVLMs), enabling their deployment in...
Zimo Ji, Daoyuan Wu, Wenyuan Jiang +5 more
Large Language Model (LLM)-based agent systems are increasingly deployed for complex real-world tasks but remain vulnerable to natural language-based...
Jun Liu, Leo Yu Zhang, Fengpeng Li +2 more
Hard-label black-box settings, where only top-1 predicted labels are observable, pose a fundamentally constrained yet practically important feedback...
János Kramár, Joshua Engels, Zheng Wang +4 more
Frontier language model capabilities are improving rapidly. We thus need stronger mitigations against bad actors misusing increasingly powerful...
Marco Arazzi, Antonino Nocera
Backdoored and privacy-leaking deep neural networks pose a serious threat to the deployment of machine learning systems in security-critical...
Kaiyu Zhou, Yongsen Zheng, Yicheng He +5 more
The agent–tool interaction loop is a critical attack surface for modern Large Language Model (LLM) agents. Existing denial-of-service (DoS) attacks...
Xinrui Zhang, Pincan Zhao, Jason Jaskolka +2 more
Machine Learning (ML) has emerged as a pivotal technology in the operation of large and complex systems, driving advancements in fields such as...
Christina Lu, Jack Gallagher, Jonathan Michala +2 more
Large language models can represent a variety of personas but typically default to a helpful Assistant identity cultivated during post-training. We...
Yi Liu, Weizhe Wang, Ruitao Feng +5 more
The rise of AI agent frameworks has introduced agent skills, modular packages containing instructions and executable code that dynamically extend...
Luoming Hu, Jingjie Zeng, Liang Yang +1 more
Enhancing the moral alignment of Large Language Models (LLMs) is a critical challenge in AI safety. Current alignment techniques often act as...
Yutao Mou, Zhangchi Xue, Lijun Li +4 more
While LLM-based agents can interact with environments by invoking external tools, their expanded capabilities also amplify security risks....
Jiawen Zhang, Yangfan Hu, Kejia Chen +7 more
Fine-tuning is an essential and pervasive functionality for applying large language models (LLMs) to downstream tasks. However, it has the potential...
Mohoshin Ara Tahera, Karamveer Singh Sidhu, Shuvalaxmi Dass +1 more
Large Language Models (LLMs) are increasingly adopted in healthcare to support clinical decision-making, summarize electronic health records (EHRs),...
Hanna Foerster, Tom Blanchard, Kristina Nikolić +6 more
AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss....
Greta Dolcetti, Giulio Zizzo, Sergio Maffeis
We present an experimental evaluation that assesses the robustness of four open source LLMs claiming function-calling capabilities against three...