Attack HIGH
Nasim Soltani, Shayan Nejadshamsi, Zakaria Abou El Houda +4 more
Adversarial examples can represent a serious threat to machine learning (ML) algorithms. If used to manipulate the behaviour of ML-based Network...
2 weeks ago cs.CR cs.AI
Benchmark MEDIUM
Manit Baser, Alperen Yildiz, Dinil Mon Divakaran +1 more
The static knowledge representations of large language models (LLMs) inevitably become outdated or incorrect over time. While model-editing...
Tool MEDIUM
Zhengyang Shan, Jiayun Xin, Yue Zhang +1 more
Code agents powered by large language models can execute shell commands on behalf of users, introducing severe security vulnerabilities. This paper...
Attack HIGH
Scott Thornton
Retrieval-Augmented Generation (RAG) systems extend large language models (LLMs) with external knowledge sources but introduce new attack surfaces...
2 weeks ago cs.CR cs.AI cs.LG
Tool MEDIUM
Shriti Priya, Julian James Stephen, Arjun Natarajan
Enterprises and organizations today increasingly deploy in-house, cloud-based applications and APIs for internal operations or external customers....
Attack MEDIUM
Pratyay Kumar, Miguel Antonio Guirao Aguilera, Srikathyayani Srikanteswara +2 more
Model Context Protocol (MCP) servers have rapidly emerged over the past year as a widely adopted way to enable Large Language Model (LLM) agents to...
2 weeks ago cs.CR cs.AI
Attack HIGH
Nanzi Yang, Weiheng Bai, Kangjie Lu
The Model Context Protocol (MCP) is a recently proposed interoperability standard that unifies how AI agents connect with external tools and data...
2 weeks ago cs.CR cs.AI
Tool LOW
Eeham Khan, Luis Rodriguez, Marc Queudot
Retrieval-Augmented Generation (RAG) significantly improves the factuality of Large Language Models (LLMs), yet standard pipelines often lack...
Attack HIGH
Ailiya Borjigin, Igor Stadnyk, Ben Bilski +2 more
OpenClaw-style agent stacks turn language into privileged execution: LLM intents flow through tool interception, policy gates, and a local executor....
2 weeks ago cs.CR cs.AI
Attack HIGH
Fan Yang
The widespread adoption of thinking mode in large language models (LLMs) has significantly enhanced complex task processing capabilities while...
2 weeks ago cs.CR cs.AI
Attack MEDIUM
Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro, Peter Kairouz
As AI assistants become widely used, privacy-aware platforms like Anthropic's Clio have been introduced to generate insights from real-world AI use....
Attack MEDIUM
Jia Hu, Youcheng Sun, Pierre Olivier
Software compartmentalization breaks down an application into compartments isolated from each other: an attacker taking over a compartment will be...
Attack MEDIUM
Ali Raza, Gurang Gupta, Nikolay Matyunin +1 more
Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting. Large...
2 weeks ago cs.CR cs.AI cs.LG
Attack HIGH
Quanchen Zou, Moyang Chen, Zonghao Ying +6 more
Large Vision-Language Models (LVLMs) undergo safety alignment to suppress harmful content. However, current defenses predominantly target explicit...
Benchmark MEDIUM
Amir Al-Maamari
Large Language Models (LLMs) show promise for Automated Program Repair (APR), yet their effectiveness on security vulnerabilities remains poorly...
2 weeks ago cs.CR cs.AI
Attack MEDIUM
Shaswata Mitra, Raj Patel, Sudip Mittal +2 more
Multi-agent systems (MAS) powered by LLMs promise adaptive, reasoning-driven enterprise workflows, yet granting agents autonomous control over tools,...
2 weeks ago cs.CR cs.MA cs.SE
Defense MEDIUM
Harry Owiredu-Ashley
Most adversarial evaluations of large language model (LLM) safety assess single prompts and report binary pass/fail outcomes, which fails to capture...
2 weeks ago cs.CR cs.AI cs.CL
Tool MEDIUM
Yinpeng Wu, Yitong Chen, Lixiang Wang +3 more
Device-side Large Language Models (LLMs) have seen explosive growth, offering better privacy and availability than cloud-side LLMs....
2 weeks ago cs.CR cs.LG cs.OS
Attack HIGH
Pratyay Kumar, Abu Saleh Md Tayeen, Satyajayant Misra +4 more
Deep learning (DL)-based Network Intrusion Detection System (NIDS) has demonstrated great promise in detecting malicious network traffic. However,...
2 weeks ago cs.CR cs.AI
Attack HIGH
David Fernandez, Pedram MohajerAnsari, Amir Salarpour +3 more
Vision-language models are emerging for autonomous driving, yet their robustness to physical adversarial attacks remains unexplored. This paper...