AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077 | Attack: 809 | Benchmark: 603 | Defense: 272 | Tool: 226 | Survey: 113

Showing 501–520 of 809 papers

Attack HIGH

Medusa: Cross-Modal Transferable Adversarial Attacks on Multimodal Medical Retrieval-Augmented Generation

Yingjia Shang, Yi Liu, Huimin Wang +4 more

With the rapid advancement of retrieval-augmented vision-language models, multimodal medical retrieval-augmented generation (MMed-RAG) systems are...

4 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

FedPoisonTTP: A Threat Model and Poisoning Attack for Federated Test-Time Personalization

Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Amin Ahsan Ali +3 more

Test-time personalization in federated learning enables models at clients to adjust online to local domain shifts, enhancing robustness and...

4 months ago cs.CR cs.CV PDF

Attack HIGH

Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization

Xurui Li, Kaisong Song, Rui Zhu +2 more

Large Language Models (LLMs) have developed rapidly in web services, delivering unprecedented capabilities while amplifying societal risks. Existing...

4 months ago cs.CR cs.AI PDF

Attack HIGH

AttackPilot: Autonomous Inference Attacks Against ML Services With LLM-Based Agents

Yixin Wu, Rui Wen, Chi Cui +2 more

Inference attacks have been widely studied and offer a systematic risk assessment of ML services; however, their implementation and the attack...

4 months ago cs.CR cs.AI PDF

Attack HIGH

Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations

Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma +2 more

Large Language Models (LLMs) remain susceptible to jailbreak exploits that bypass safety filters and induce harmful or unethical behavior. This work...

4 months ago cs.CR cs.AI PDF

Attack MEDIUM

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

Adarsh Kumarappan, Ayushi Mehrotra

The SmoothLLM defense provides a certification guarantee against jailbreaking attacks, but it relies on a strict "k-unstable" assumption that rarely...

4 months ago cs.LG cs.AI PDF

Attack HIGH

Automating Deception: Scalable Multi-Turn LLM Jailbreaks

Adarsh Kumarappan, Ananya Mujoo

Multi-turn conversational attacks, which leverage psychological principles like Foot-in-the-Door (FITD), where a small initial request paves the way...

4 months ago cs.LG cs.AI PDF

Attack HIGH

Semantics as a Shield: Label Disguise Defense (LDD) against Prompt Injection in LLM Sentiment Classification

Yanxi Li, Ruocheng Shan

Large language models are increasingly used for text classification tasks such as sentiment analysis, yet their reliance on natural language prompts...

4 months ago cs.CL cs.AI PDF

Attack HIGH

TASO: Jailbreak LLMs via Alternative Template and Suffix Optimization

Yanting Wang, Runpeng Geng, Jinghui Chen +2 more

Many recent studies showed that LLMs are vulnerable to jailbreak attacks, where an attacker can perturb the input of an LLM to induce it to generate...

4 months ago cs.CR PDF

Attack HIGH

Exploiting the Experts: Unauthorized Compression in MoE-LLMs

Pinaki Prasad Guha Neogi, Ahmad Mohammadshirazi, Dheeraj Kulshrestha +1 more

Mixture-of-Experts (MoE) architectures are increasingly adopted in large language models (LLMs) for their scalability and efficiency. However, their...

4 months ago cs.LG cs.AI PDF

Attack HIGH

Vulnerability-Aware Robust Multimodal Adversarial Training

Junrui Zhang, Xinyu Zhao, Jie Peng +3 more

Multimodal learning has shown significant superiority on various tasks by integrating multiple modalities. However, the interdependencies among...

4 months ago cs.LG cs.CR PDF

Attack MEDIUM

ASTRA: Agentic Steerability and Risk Assessment Framework

Itay Hazan, Yael Mathov, Guy Shtar +2 more

Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional...

4 months ago cs.CR PDF

Attack HIGH

Federated Anomaly Detection and Mitigation for EV Charging Forecasting Under Cyberattacks

Oluleke Babayomi, Dong-Seong Kim

Electric Vehicle (EV) charging infrastructure faces escalating cybersecurity threats that can severely compromise operational efficiency and grid...

4 months ago cs.LG cs.CR PDF

Attack HIGH

Beyond Jailbreak: Unveiling Risks in LLM Applications Arising from Blurred Capability Boundaries

Yunyi Zhang, Shibo Cui, Baojun Liu +4 more

LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional...

4 months ago cs.CR PDF

Attack HIGH

Steering in the Shadows: Causal Amplification for Activation Space Attacks in Large Language Models

Zhiyuan Xu, Stanislav Abaimov, Joseph Gardiner +1 more

Modern large language models (LLMs) are typically secured by auditing data, prompts, and refusal policies, while treating the forward pass as an...

4 months ago cs.CR PDF

Attack MEDIUM

MURMUR: Using cross-user chatter to break collaborative language agents in groups

Atharv Singh Patlan, Peiyao Sheng, S. Ashwin Hebbar +2 more

Language agents are rapidly expanding from single-user assistants to multi-user collaborators in shared workspaces and groups. However, today's...

4 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

Evaluating Adversarial Vulnerabilities in Modern Large Language Models

Tom Perel

The recent boom and rapid integration of Large Language Models (LLMs) into a wide range of applications warrants a deeper understanding of their...

4 months ago cs.CR cs.AI PDF

Attack HIGH

"To Survive, I Must Defect": Jailbreaking LLMs via the Game-Theory Scenarios

Zhen Sun, Zongmin Zhang, Deqi Liang +8 more

As LLMs become more common, non-expert users can pose risks, prompting extensive research into jailbreak attacks. However, most existing black-box...

4 months ago cs.CR cs.AI PDF

Attack MEDIUM

PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization

Huseein Jawad, Nicolas Brunel

System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive...

4 months ago cs.CR cs.CL PDF

Attack HIGH

Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models

Yijun Yang, Lichao Wang, Jianping Zhang +3 more

The growing misuse of Vision-Language Models (VLMs) has led providers to deploy multiple safeguards, including alignment tuning, system prompts, and...

4 months ago cs.CR PDF
