Collaborative penetration testing suite for emerging generative AI algorithms
Petar Radanliev
Problem Space: AI Vulnerabilities and Quantum Threats. Generative AI vulnerabilities: model inversion, data poisoning, adversarial inputs. Quantum...
Thomas Wang, Haowen Li
As large language models (LLMs) are increasingly integrated into real-world applications, ensuring their safety, robustness, and privacy compliance...
Alexander Nemecek, Zebin Yun, Zahra Rahmani +4 more
As large language models (LLMs) become progressively more embedded in clinical decision-support, documentation, and patient-information systems,...
Marco Alecci, Jordan Samhi, Tegawendé F. Bissyandé +1 more
Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often...
Giovanni De Muri, Mark Vero, Robin Staab +1 more
LLMs are often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models....
Oleksandr Adamov, Anders Carlsson
This paper explores the challenges of cyberattack attribution, specifically APTs, applying the case study approach for the WhisperGate cyber...
Yixuan Liu, Xinlei Li, Yi Li
Phishing attacks in Web3 ecosystems are increasingly sophisticated, exploiting deceptive contract logic, malicious frontend scripts, and token...
Yushi Yang, Shreyansh Padarha, Andrew Lee +1 more
Agentic reinforcement learning (RL) trains large language models to autonomously call tools during reasoning, with search as the most common...
Rishi Jha, Harold Triedman, Justin Wagle +1 more
Control-flow hijacking attacks manipulate orchestration mechanisms in multi-agent systems into performing unsafe actions that compromise the system...
Runlin Lei, Lu Yi, Mingguo He +4 more
While Graph Neural Networks (GNNs) and Large Language Models (LLMs) are powerful approaches for learning on Text-Attributed Graphs (TAGs), a...
Elias Hossain, Swayamjit Saha, Somshubhra Roy +1 more
Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference...
Qiusi Zhan, Angeline Budiman-Chan, Abdelrahman Zayed +3 more
Large language model (LLM) based search agents iteratively generate queries, retrieve external information, and reason to answer open-domain...
Bo-Han Feng, Chien-Feng Liu, Yu-Hsuan Li Liang +9 more
Large audio-language models (LALMs) extend text-based LLMs with auditory understanding, offering new opportunities for multimodal applications. While...
Yue Liu, Zhenchang Xing, Shidong Pan +1 more
In recent years, the AI wave has grown rapidly in software development. Even novice developers can now design and generate complex...
Jie Zhang, Meng Ding, Yang Liu +2 more
We present a novel approach for attacking black-box large language models (LLMs) by exploiting their ability to express confidence in natural...
Asmita Mohanty, Gezheng Kang, Lei Gao +1 more
Large Language Models (LLMs) have demonstrated strong performance across diverse tasks, but fine-tuning them typically relies on cloud-based,...
Shivam Ratnakar, Sanjay Raghavendra
Integration of Large Language Models with search/retrieval engines has become ubiquitous, yet these systems harbor a critical vulnerability that...
Xiaofan Li, Xing Gao
The Model Context Protocol (MCP) is an emerging open standard that enables AI-powered applications to interact with external tools through structured...
David Peer, Sebastian Stabinger
Large Language Models (LLMs) have demonstrated impressive capabilities, yet their deployment in high-stakes domains is hindered by inherent...