AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 381–400 of 809 papers

Attack MEDIUM

Rectifying Adversarial Examples Using Their Vulnerabilities

Fumiya Morimoto, Ryuto Morita, Satoshi Ono

Deep neural network-based classifiers are prone to errors when processing adversarial examples (AEs). AEs are minimally perturbed input data...

2 months ago cs.CR cs.LG cs.NE PDF

Attack HIGH

Overlooked Safety Vulnerability in LLMs: Malicious Intelligent Optimization Algorithm Request and its Jailbreak

Haoran Gu, Handing Wang, Yi Mei +2 more

The widespread deployment of large language models (LLMs) has raised growing concerns about their misuse risks and associated safety issues. While...

2 months ago cs.CR cs.CL PDF

Attack MEDIUM

The Trojan in the Vocabulary: Stealthy Sabotage of LLM Composition

Xiaoze Liu, Weichen Yu, Matt Fredrikson +2 more

The open-weight language model ecosystem is increasingly defined by model composition techniques (such as weight merging, speculative decoding, and...

2 months ago cs.LG cs.CL cs.CR PDF

Attack HIGH

Large Empirical Case Study: Go-Explore adapted for AI Red Team Testing

Manish Bhatt, Adrian Wood, Idan Habler +1 more

Production LLM agents with tool-using capabilities require security testing despite their safety training. We adapt Go-Explore to evaluate...

2 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

GCG Attack On A Diffusion LLM

Ruben Neyroud, Sam Corley

While most LLMs are autoregressive, diffusion-based LLMs have recently emerged as an alternative method for generation. Greedy Coordinate Gradient...

2 months ago cs.LG cs.CL cs.CR PDF

Attack LOW

Privacy-Preserving Semantic Communications via Multi-Task Learning and Adversarial Perturbations

Yalin E. Sagduyu, Tugba Erpek, Aylin Yener +1 more

Semantic communications conveys task-relevant meaning rather than focusing solely on message reconstruction, improving bandwidth efficiency and...

2 months ago cs.NI cs.AI cs.CR PDF

Attack MEDIUM

RAGPart & RAGMask: Retrieval-Stage Defenses Against Corpus Poisoning in Retrieval-Augmented Generation

Pankayaraj Pathmanathan, Michael-Andrei Panaitescu-Liess, Cho-Yu Jason Chiang +1 more

Retrieval-Augmented Generation (RAG) has emerged as a promising paradigm to enhance large language models (LLMs) with external knowledge, reducing...

2 months ago cs.IR PDF

Attack HIGH

Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race?

Yuan Xin, Dingfan Chen, Linyi Yang +2 more

As large language models (LLMs) are increasingly deployed, ensuring their safe use is paramount. Jailbreaking, adversarial prompts that bypass model...

2 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

RepetitionCurse: Measuring and Understanding Router Imbalance in Mixture-of-Experts LLMs under DoS Stress

Ruixuan Huang, Qingyue Wang, Hantao Huang +4 more

Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To...

2 months ago cs.CR cs.LG PDF

Attack HIGH

Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack

Roee Ziv, Raz Lapid, Moshe Sipper

Audio-language models combine audio encoders with large language models to enable multimodal reasoning, but they also introduce new security...

2 months ago cs.SD cs.AI cs.CR PDF

Attack HIGH

RobustMask: Certified Robustness against Adversarial Neural Ranking Attack via Randomized Masking

Jiawei Liu, Zhuo Chen, Rui Zhu +4 more

Neural ranking models have achieved remarkable progress and are now widely deployed in real-world applications such as Retrieval-Augmented Generation...

2 months ago cs.CR cs.IR PDF

Attack HIGH

EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion

Zhen Liang, Hai Huang, Zhengkui Chen

Large language models (LLMs), such as ChatGPT, have achieved remarkable success across a wide range of fields. However, their trustworthiness remains...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Adaptive Trust Consensus for Blockchain IoT: Comparing RL, DRL, and MARL Against Naive, Collusive, Adaptive, Byzantine, and Sleeper Attacks

Soham Padia, Dhananjay Vaidya, Ramchandra Mangrulkar

Securing blockchain-enabled IoT networks against sophisticated adversarial attacks remains a critical challenge. This paper presents a trust-based...

2 months ago cs.CR cs.LG cs.MA PDF

Attack HIGH

Backdoor Attacks on Prompt-Driven Video Segmentation Foundation Models

Zongmin Zhang, Zhen Sun, Yifan Liao +5 more

Prompt-driven Video Segmentation Foundation Models (VSFMs) such as SAM2 are increasingly deployed in applications like autonomous driving and digital...

3 months ago cs.CV cs.CR PDF

Attack LOW

Look Closer! An Adversarial Parametric Editing Framework for Hallucination Mitigation in VLMs

Jiayu Hu, Beibei Li, Jiangwei Xia +3 more

While Vision-Language Models (VLMs) have garnered increasing attention in the AI community due to their promising practical applications, they...

3 months ago cs.CV cs.LG PDF

Attack HIGH

Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models

Mengqi He, Xinyu Tian, Xin Shen +4 more

Vision-language models (VLMs) achieve remarkable performance but remain vulnerable to adversarial attacks. Entropy, a measure of model uncertainty,...

3 months ago cs.CV cs.LG PDF

Attack MEDIUM

Learning from Negative Examples: Why Warning-Framed Training Data Teaches What It Warns Against

Tsogt-Ochir Enkhbayar

Warning-framed content in training data (e.g., "DO NOT USE - this code is vulnerable") does not, it turns out, teach language models to avoid the...

3 months ago cs.LG cs.CL cs.CR PDF

Attack MEDIUM

Exploring the Security Threats of Retriever Backdoors in Retrieval-Augmented Code Generation

Tian Li, Bo Lin, Shangwen Wang +1 more

Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development, yet its security...

3 months ago cs.CR cs.SE PDF

Attack HIGH

Analysis of LLM Vulnerability to GPU Soft Errors: An Instruction-Level Fault Injection Study

Duo Chai, Zizhen Liu, Shuhuai Wang +4 more

Large language models (LLMs) are highly compute- and memory-intensive, posing significant demands on high-performance GPUs. At the same time,...

3 months ago cs.AR cs.AI cs.CR PDF

Attack HIGH

LLM-Driven Feature-Level Adversarial Attacks on Android Malware Detectors

Tianwei Lan, Farid Naït-Abdesselam

The rapid growth in both the scale and complexity of Android malware has driven the widespread adoption of machine learning (ML) techniques for...

3 months ago cs.CR cs.AI PDF
