AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 461–480 of 809 papers

Attack MEDIUM

Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI

George Mikros

Large language models (LLMs) present a dual challenge for forensic linguistics. They serve as powerful analytical tools enabling scalable corpus...

3 months ago cs.CL cs.CY PDF

Attack MEDIUM

From Description to Score: Can LLMs Quantify Vulnerabilities?

Sima Jafarikhah, Daniel Thompson, Eva Deans +2 more

Manual vulnerability scoring, such as assigning Common Vulnerability Scoring System (CVSS) scores, is a resource-intensive process that is often...

3 months ago cs.CR cs.AI cs.PL PDF

Attack MEDIUM

Look Twice before You Leap: A Rational Agent Framework for Localized Adversarial Anonymization

Donghang Duan, Xu Zheng, Yuefeng He +3 more

Current LLM-based text anonymization frameworks usually rely on remote API services from powerful LLMs, which creates an inherent privacy paradox:...

3 months ago cs.CR cs.CL PDF

Attack HIGH

RunawayEvil: Jailbreaking the Image-to-Video Generative Models

Songping Wang, Rufan Qian, Yueming Lyu +5 more

Image-to-Video (I2V) generation synthesizes dynamic visual content from image and text inputs, providing significant creative control. However, the...

3 months ago cs.CV PDF

Attack HIGH

Metaphor-based Jailbreaking Attacks on Text-to-Image Models

Chenyu Zhang, Yiwen Ma, Lanjun Wang +3 more

Text-to-image (T2I) models commonly incorporate defense mechanisms to prevent the generation of sensitive images. Unfortunately, recent jailbreaking...

3 months ago cs.CR cs.AI cs.CV PDF

Attack HIGH

VRSA: Jailbreaking Multimodal Large Language Models through Visual Reasoning Sequential Attack

Shiji Zhao, Shukun Xiong, Yao Huang +7 more

Multimodal Large Language Models (MLLMs) are widely used in various fields due to their powerful cross-modal comprehension and generation...

3 months ago cs.CV PDF

Attack HIGH

ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior

Weikai Lu, Ziqian Zeng, Kehua Zhang +5 more

Multimodal Large Language Models (MLLMs) are increasingly vulnerable to multimodal Indirect Prompt Injection (IPI) attacks, which embed malicious...

3 months ago cs.CR cs.MM PDF

Attack HIGH

Safe2Harm: Semantic Isomorphism Attacks for Jailbreaking Large Language Models

Fan Yang

Large Language Models (LLMs) have demonstrated exceptional performance across various tasks, but their security vulnerabilities can be exploited by...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs

Jinbo Liu, Defu Cao, Yifei Wei +6 more

Graph topology is a fundamental determinant of memory leakage in multi-agent LLM systems, yet its effects remain poorly quantified. We introduce MAMA...

3 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

In-Context Representation Hijacking

Itay Yona, Amir Sarid, Michael Karasik +1 more

We introduce Doublespeak, a simple in-context representation hijacking attack against large language models (LLMs). The attack works by...

3 months ago cs.CL cs.AI cs.CR PDF

Attack MEDIUM

SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting

Hanxiu Zhang, Yue Zheng

The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While...

3 months ago cs.CR cs.AI cs.CL PDF

Attack HIGH

From static to adaptive: immune memory-based jailbreak detection for large language models

Jun Leng, Yu Liu, Litian Zhang +3 more

Large Language Models (LLMs) serve as the backbone of modern AI systems, yet they remain susceptible to adversarial jailbreak attacks. Consequently,...

3 months ago cs.CR PDF

Attack MEDIUM

Invasive Context Engineering to Control Large Language Models

Thomas Rivasseau

Current research on operator control of Large Language Models improves model robustness against adversarial attacks and misbehavior by training on...

3 months ago cs.AI PDF

Attack HIGH

Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities

Yuan Xiong, Ziqi Miao, Lijun Li +3 more

While Multimodal Large Language Models (MLLMs) show remarkable capabilities, their safety alignments are susceptible to jailbreak attacks. Existing...

3 months ago cs.CV cs.CL cs.CR PDF

Attack HIGH

When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models

Afshin Khadangi, Hanna Marxen, Amir Sartipi +2 more

Frontier large language models (LLMs) such as ChatGPT, Grok and Gemini are increasingly used for mental-health support with anxiety, trauma and...

3 months ago cs.CY cs.AI PDF

Attack HIGH

Lost in Modality: Evaluating the Effectiveness of Text-Based Membership Inference Attacks on Large Multimodal Models

Ziyi Tong, Feifei Sun, Le Minh Nguyen

Large Multimodal Language Models (MLLMs) are emerging as one of the foundational tools in an expanding range of applications. Consequently,...

3 months ago cs.CR cs.AI PDF

Attack HIGH

LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems

Yuanhe Zhang, Weiliu Wang, Zhenhong Zhou +5 more

Large Language Model (LLM)-based agents have demonstrated remarkable capabilities in reasoning, planning, and tool usage. The recently proposed Model...

3 months ago cs.CR cs.CL PDF

Attack MEDIUM

Adversarial Robustness of Traffic Classification under Resource Constraints: Input Structure Matters

Adel Chehade, Edoardo Ragusa, Paolo Gastaldo +1 more

Traffic classification (TC) plays a critical role in cybersecurity, particularly in IoT and embedded contexts, where inspection must often occur...

3 months ago cs.NI cs.CR cs.LG PDF

Attack MEDIUM

CluCERT: Certifying LLM Robustness via Clustering-Guided Denoising Smoothing

Zixia Wang, Gaojie Jin, Jia Hu +1 more

Recent advancements in Large Language Models (LLMs) have led to their widespread adoption in daily applications. Despite their impressive...

3 months ago cs.LG cs.AI PDF

Attack MEDIUM

From monoliths to modules: Decomposing transducers for efficient world modelling

Alexander Boyd, Franz Nowak, David Hyland +2 more

World models have been recently proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. Although...

3 months ago cs.AI PDF
