AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 861–880 of 976 papers

Attack MEDIUM

Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity

Zaixi Zhang, Souradip Chakraborty, Amrit Singh Bedi +16 more

The rapid adoption of generative artificial intelligence (GenAI) in the biosciences is transforming biotechnology, medicine, and synthetic biology....

5 months ago cs.CR q-bio.BM PDF

Attack MEDIUM

Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts

Tiarnaigh Downey-Webb, Olamide Jogunola, Oluwaseun Ajao

This paper presents a systematic security assessment of four prominent Large Language Models (LLMs) against diverse adversarial attack vectors. We...

5 months ago cs.CR cs.AI cs.CY PDF

Benchmark MEDIUM

One Token Embedding Is Enough to Deadlock Your Large Reasoning Model

Mohan Zhang, Yihua Zhang, Jinghan Jia +3 more

Modern large reasoning models (LRMs) exhibit impressive multi-step problem-solving via chain-of-thought (CoT) reasoning. However, this iterative...

5 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

PrediQL: Automated Testing of GraphQL APIs with LLMs

Shaolun Liu, Sina Marefat, Omar Tsai +4 more

GraphQL's flexible query model and nested data dependencies expose APIs to complex, context-dependent vulnerabilities that are difficult to uncover...

5 months ago cs.CR cs.SE PDF

Benchmark MEDIUM

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

Zonghao Ying, Yangguang Shao, Jianle Gan +9 more

Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in...

5 months ago cs.CR cs.CV PDF

Defense MEDIUM

Path Drift in Large Reasoning Models: How First-Person Commitments Override Safety

Yuyi Huang, Runzhe Zhan, Lidia S. Chao +2 more

As large language models (LLMs) are increasingly deployed for complex reasoning tasks, Long Chain-of-Thought (Long-CoT) prompting has emerged as a...

5 months ago cs.CL PDF

Benchmark MEDIUM

Getting Your Indices in a Row: Full-Text Search for LLM Training Data for Real World

Ines Altemir Marinas, Anastasiia Kucherenko, Alexander Sternfeld +1 more

The performance of Large Language Models (LLMs) is determined by their training data. Despite the proliferation of open-weight LLMs, access to LLM...

5 months ago cs.CL PDF

Benchmark MEDIUM

Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models

Yongding Tao, Tian Wang, Yihong Dong +4 more

Data contamination poses a significant threat to the reliable evaluation of Large Language Models (LLMs). This issue arises when benchmark samples...

5 months ago cs.CL cs.AI cs.LG PDF

Defense MEDIUM

VisuoAlign: Safety Alignment of LVLMs with Multimodal Tree Search

MingSheng Li, Guangze Zhao, Sichen Liu

Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal perception and generation, yet their safety alignment remains a...

5 months ago cs.AI cs.CR PDF

Other MEDIUM

Repairing Regex Vulnerabilities via Localization-Guided Instructions

Sicheol Sung, Joonghyuk Hahn, Yo-Sub Han

Regular expressions (regexes) are foundational to modern computing for critical tasks like input validation and data parsing, yet their ubiquity...

5 months ago cs.AI cs.PL PDF

Benchmark MEDIUM

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Xiaonan Si, Meilin Zhu, Simeng Qin +7 more

Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) with external knowledge but are vulnerable to corpus poisoning and...

5 months ago cs.CL cs.AI PDF

Attack MEDIUM

"I know it's not right, but that's what it said to do": Investigating Trust in AI Chatbots for Cybersecurity Policy

Brandon Lit, Edward Crowder, Daniel Vogel +1 more

AI chatbots are an emerging security attack vector, vulnerable to threats such as prompt injection and rogue chatbot creation. When deployed in...

5 months ago cs.HC PDF

Benchmark MEDIUM

CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization

Debeshee Das, Luca Beurer-Kellner, Marc Fischer +1 more

The increasing adoption of LLM agents with access to numerous tools and sensitive data significantly widens the attack surface for indirect prompt...

5 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

The Model's Language Matters: A Comparative Privacy Analysis of LLMs

Abhishek K. Mishra, Antoine Boutet, Lucas Magnana

Large Language Models (LLMs) are increasingly deployed across multilingual applications that handle sensitive data, yet their scale and linguistic...

5 months ago cs.CL cs.CR PDF

Attack MEDIUM

VisualDAN: Exposing Vulnerabilities in VLMs with Visual-Driven DAN Commands

Aofan Liu, Lulu Tang

Vision-Language Models (VLMs) have garnered significant attention for their remarkable ability to interpret and generate multimodal content. However,...

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

Chain-of-Trigger: An Agentic Backdoor that Paradoxically Enhances Agentic Robustness

Jiyang Qiu, Xinbei Ma, Yunqing Xu +2 more

The rapid deployment of large language model (LLM)-based agents in real-world applications has raised serious concerns about their trustworthiness....

5 months ago cs.AI PDF

Defense MEDIUM

From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses

Xiangtao Meng, Tianshuo Cong, Li Wang +4 more

Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in real-world settings faces several...

5 months ago cs.CR PDF

Benchmark MEDIUM

Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

Eric Hanchen Jiang, Weixuan Ou, Run Liu +8 more

Safety alignment of large language models currently faces a central challenge: existing alignment techniques often prioritize mitigating responses to...

5 months ago cs.LG cs.AI cs.CL PDF

Survey MEDIUM

Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs

Man Hu, Xinyi Wu, Zuofeng Suo +5 more

With the rise of advanced reasoning capabilities, large language models (LLMs) are receiving increasing attention. However, although reasoning...

5 months ago cs.CR cs.AI PDF

Survey MEDIUM

LLM Unlearning Under the Microscope: A Full-Stack View on Methods and Metrics

Chongyu Fan, Changsheng Wang, Yancheng Huang +2 more

Machine unlearning for large language models (LLMs) aims to remove undesired data, knowledge, and behaviors (e.g., for safety, privacy, or copyright)...

5 months ago cs.LG cs.CL PDF
