Defense MEDIUM
Ruoxi Cheng, Haoxuan Ma, Teng Ma +1 more
Large Vision-Language Models (LVLMs) exhibit powerful reasoning capabilities but suffer from sophisticated jailbreak vulnerabilities. Fundamentally,...
Defense HIGH
Biagio Boi, Christian Esposito
Smart contracts have emerged as key components within decentralized environments, enabling the automation of transactions through self-executing...
Attack MEDIUM
Farhad Abtahi, Fernando Seoane, Iván Pau +1 more
Healthcare AI systems face major vulnerabilities to data poisoning that current defenses and regulations cannot adequately address. We analyzed eight...
4 months ago cs.CR cs.AI
Benchmark MEDIUM
Zichao Wei, Jun Zeng, Ming Wen +8 more
Software vulnerabilities are increasing at an alarming rate. However, manual patching is both time-consuming and resource-intensive, while existing...
4 months ago cs.CR cs.SE
Benchmark MEDIUM
Feilong Wang, Fuqiang Liu
The integration of large language models (LLMs) into automated driving systems has opened new possibilities for reasoning and decision-making by...
4 months ago cs.LG cs.AI cs.CR
Benchmark MEDIUM
Guangke Chen, Yuhui Wang, Shouling Ji +2 more
Modern text-to-speech (TTS) systems, particularly those built on Large Audio-Language Models (LALMs), generate high-fidelity speech that faithfully...
4 months ago cs.SD cs.AI cs.CR
Tool MEDIUM
Dennis Wei, Ronny Luss, Xiaomeng Hu +6 more
Large Language Models (LLMs) have become ubiquitous in everyday life and are entering higher-stakes applications ranging from summarizing meeting...
4 months ago cs.CL cs.LG
Benchmark MEDIUM
Fred Heiding, Simon Lermen
We present an end-to-end demonstration of how attackers can exploit AI safety failures to harm vulnerable populations: from jailbreaking LLMs to...
4 months ago cs.CR cs.AI cs.CY
Attack HIGH
Runpeng Geng, Yanting Wang, Chenlong Yin +3 more
Long-context LLMs are vulnerable to prompt injection, where an attacker can inject an instruction in a long context to induce an LLM to generate an...
4 months ago cs.CR cs.AI cs.CL
Attack HIGH
Srikant Panda, Avinash Rai
Large Language Models (LLMs) are commonly evaluated for robustness against paraphrased or semantically equivalent jailbreak prompts, yet little...
4 months ago cs.CL cs.AI
Attack HIGH
Shuaitong Liu, Renjue Li, Lijia Yu +3 more
Recent advances in Chain-of-Thought (CoT) prompting have substantially improved the reasoning capabilities of large language models (LLMs), but have...
4 months ago cs.CR cs.AI
Benchmark LOW
Yuping Yan, Yuhan Xie, Yuanshuai Li +3 more
As Multimodal Large Language Models (MLLMs) are increasingly integrated into everyday tools and intelligent agents, growing concerns have...
4 months ago cs.LG cs.CL
Attack HIGH
Yudong Yang, Xuezhen Zhang, Zhifeng Han +6 more
Recent progress in LLMs has enabled understanding of audio signals, but has also exposed new safety risks arising from complex audio inputs that are...
4 months ago cs.SD cs.AI
Attack HIGH
Zihan Wang, Guansong Pang, Wenjun Miao +2 more
Recent advances in Large Visual Language Models (LVLMs) have demonstrated impressive performance across various vision-language tasks by leveraging...
Benchmark LOW
Francis Rhys Ward, Teun van der Weij, Hanna Gábor +6 more
AI systems are increasingly able to autonomously conduct realistic software engineering tasks, and may soon be deployed to automate machine learning...
Defense MEDIUM
Jialin Wu, Kecen Li, Zhicong Huang +3 more
Many machine learning models are fine-tuned from large language models (LLMs) to achieve high performance in specialized domains like code...
4 months ago cs.CL cs.CR
Benchmark MEDIUM
Catherine Xia, Manar H. Alalfi
AI programming assistants have demonstrated a tendency to generate code containing basic security vulnerabilities. While developers are ultimately...
4 months ago cs.CR cs.AI
Survey MEDIUM
James Jin Kang, Dang Bui, Thanh Pham +1 more
The growing use of large language models in sensitive domains has exposed a critical weakness: the inability to ensure that private information can...
Survey MEDIUM
Gabrielle M Gauthier, Eesha Ali, Amna Asim +2 more
Human content moderators (CMs) routinely review distressing digital content at scale. Beyond exposure, the work context (e.g., workload, team...
Benchmark LOW
Yuankai He, Weisong Shi
CAR-Scenes is a frame-level dataset for autonomous driving that enables training and evaluation of vision-language models (VLMs) for interpretable,...
4 months ago cs.CV cs.RO