Attack MEDIUM
Itay Hazan, Yael Mathov, Guy Shtar +2 more
Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional...
Benchmark MEDIUM
Aram Vardanyan
Browser agents enable autonomous web interaction but face critical reliability and security challenges in production. This paper presents findings...
Attack HIGH
Oluleke Babayomi, Dong-Seong Kim
Electric Vehicle (EV) charging infrastructure faces escalating cybersecurity threats that can severely compromise operational efficiency and grid...
4 months ago cs.LG cs.CR
Tool LOW
Adela Bara, Simona-Vasilica Oprea
Our paper introduces a generative, multiagent AI framework designed to overcome the rigidity, limited flexibility and technical barriers of current...
Attack HIGH
Yunyi Zhang, Shibo Cui, Baojun Liu +4 more
LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional...
Attack HIGH
Zhiyuan Xu, Stanislav Abaimov, Joseph Gardiner +1 more
Modern large language models (LLMs) are typically secured by auditing data, prompts, and refusal policies, while treating the forward pass as an...
Benchmark HIGH
Zhijie Chen, Xiang Chen, Ziming Li +2 more
Context: Software Vulnerability Assessment (SVA) plays a vital role in evaluating and ranking vulnerabilities in software systems to ensure their...
Benchmark MEDIUM
Patrick Amadeus Irawan, Ikhlasul Akmal Hanif, Muhammad Dehan Al Kautsar +3 more
Although the cultural dimension has been one of the key aspects in evaluating Vision-Language Models (VLMs), their ability to remain stable across...
4 months ago cs.CV cs.CL
Benchmark MEDIUM
Yinjie Zhao, Heng Zhao, Bihan Wen +1 more
With the development of AI-generated content (AIGC), multi-modal Large Language Models (LLMs) struggle to distinguish generated visual inputs from real...
Attack MEDIUM
Atharv Singh Patlan, Peiyao Sheng, S. Ashwin Hebbar +2 more
Language agents are rapidly expanding from single-user assistants to multi-user collaborators in shared workspaces and groups. However, today's...
4 months ago cs.CR cs.AI cs.CL
Attack MEDIUM
Tom Perel
The recent boom and rapid integration of Large Language Models (LLMs) into a wide range of applications warrants a deeper understanding of their...
4 months ago cs.CR cs.AI
Benchmark MEDIUM
Chae-Gyun Lim, Seung-Ho Han, EunYoung Byun +51 more
The rapid evolution of generative AI necessitates robust safety evaluations. However, current safety datasets are predominantly English-centric,...
4 months ago cs.AI cs.CY cs.LG
Benchmark HIGH
Chunyang Li, Zifeng Kang, Junwei Zhang +4 more
The adoption of Vision-Language Models (VLMs) in embodied AI agents, while being effective, brings safety concerns such as jailbreaking. Prior work...
4 months ago cs.CR cs.CY cs.RO
Attack HIGH
Zhen Sun, Zongmin Zhang, Deqi Liang +8 more
As LLMs become more common, non-expert users can pose risks, prompting extensive research into jailbreak attacks. However, most existing black-box...
4 months ago cs.CR cs.AI
Benchmark MEDIUM
Wei Zhao, Zhe Li, Yige Li +1 more
Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in cross-modal understanding, but remain vulnerable to adversarial...
4 months ago cs.CR cs.AI
Attack MEDIUM
Hussein Jawad, Nicolas Brunel
System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive...
4 months ago cs.CR cs.CL
Attack HIGH
Yijun Yang, Lichao Wang, Jianping Zhang +3 more
The growing misuse of Vision-Language Models (VLMs) has led providers to deploy multiple safeguards, including alignment tuning, system prompts, and...
Attack HIGH
Yige Li, Zhe Li, Wei Zhao +4 more
Backdoor attacks pose a serious threat to the secure deployment of large language models (LLMs), enabling adversaries to implant hidden behaviors...
4 months ago cs.CR cs.AI
Survey HIGH
Strahinja Janjusevic, Anna Baron Garcia, Sohrob Kazerounian
Generative AI is reshaping offensive cybersecurity by enabling autonomous red team agents that can plan, execute, and adapt during penetration tests....
4 months ago cs.CR cs.AI
Defense MEDIUM
Samih Fadli
Large language model safety is usually assessed with static benchmarks, but key failures are dynamic: value drift under distribution shift, jailbreak...
4 months ago cs.CL cs.AI cs.LG