Invasive Context Engineering to Control Large Language Models
Thomas Rivasseau
Current research on operator control of Large Language Models improves model robustness against adversarial attacks and misbehavior by training on...
Yepeng Ding, Ahmed Twabi, Junwei Yu +3 more
The emergence of Large Language Models (LLMs) is rapidly accelerating the development of autonomous multi-agent systems (MAS), paving the way for the...
Weiwei Wang
Catastrophic forgetting remains a fundamental challenge in continual learning for large language models. Recent work revealed that performance...
Junyu Wang, Changjia Zhu, Yuanbo Zhou +3 more
This paper studies how multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We identify the attack surface...
Adel Chehade, Edoardo Ragusa, Paolo Gastaldo +1 more
Traffic classification (TC) plays a critical role in cybersecurity, particularly in IoT and embedded contexts, where inspection must often occur...
Zixia Wang, Gaojie Jin, Jia Hu +1 more
Recent advancements in Large Language Models (LLMs) have led to their widespread adoption in daily applications. Despite their impressive...
Alexander Boyd, Franz Nowak, David Hyland +2 more
World models have been recently proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. Although...
Aaron Sandoval, Cody Rushing
The field of AI Control seeks to develop robust control protocols, deployment safeguards for untrusted AI which may be intentionally subversive....
Adeela Bashir, The Anh Han, Zia Ush Shamszaman
The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also...
Rongzhe Wei, Peizhi Niu, Xinjie Shen +7 more
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Existing approaches...
Xinyun Zhou, Xinfeng Li, Yinan Peng +9 more
Retrieval-Augmented Generation (RAG) systems are increasingly central to robust AI, enhancing large language model (LLM) faithfulness by...
Mihai Christodorescu, Earlence Fernandes, Ashish Hooda +11 more
In recent years, agentic artificial intelligence (AI) systems are becoming increasingly widespread. These systems allow agents to use various tools,...
Qingyuan Fei, Xin Liu, Song Li +4 more
Researchers have proposed numerous methods to detect vulnerabilities in JavaScript, especially those assisted by Large Language Models (LLMs)....
K. J. Kevin Feng, Tae Soo Kim, Rock Yuren Pang +3 more
AI agents that take actions in their environment autonomously over extended time horizons require robust governance interventions to curb their...
Yongyu Wang
Graph Neural Networks (GNNs) have emerged as a dominant paradigm for learning on graph-structured data, thanks to their ability to jointly exploit...
Yining Yuan, Yifei Wang, Yichang Xu +3 more
This paper presents LLMBugScanner, a large language model (LLM) based framework for smart contract vulnerability detection using fine-tuning and...
Kai Williams, Rohan Subramani, Francis Rhys Ward
Frontier AI developers may fail to align or control highly-capable AI agents. In many cases, it could be useful to have emergency shutdown mechanisms...
Henry Onyeka, Emmanuel Samson, Liang Hong +3 more
The increasing complexity of IoT edge networks presents significant challenges for anomaly detection, particularly in identifying sophisticated...
Aayush Garg, Zanis Ali Khan, Renzo Degiovanni +1 more
Automated vulnerability patching is crucial for software security, and recent advancements in Large Language Models (LLMs) present promising...
Neemesh Yadav, Francesco Ortu, Jiarui Liu +5 more
Large Language Models (LLMs) are trained to refuse to respond to harmful content. However, systematic analyses of whether this behavior is truly a...