Attack HIGH
Sanket Badhe
We present LegalSim, a modular multi-agent simulation of adversarial legal proceedings that explores how AI systems can exploit procedural weaknesses...
5 months ago cs.MA cs.AI cs.CR
PDF
Attack HIGH
Xinzhe Huang, Wenjing Hu, Tianhang Zheng +5 more
Existing gradient-based jailbreak attacks on Large Language Models (LLMs) typically optimize adversarial suffixes to align the LLM output with...
5 months ago cs.CR cs.AI
PDF
Attack HIGH
Yu He, Yifei Chen, Yiming Li +5 more
In recent years, RAG has emerged as a key paradigm for enhancing large language models (LLMs). By integrating externally retrieved information, RAG...
Attack HIGH
Zhixin Xie, Xurui Song, Jun Luo
Despite substantial efforts in safety alignment, recent research indicates that Large Language Models (LLMs) remain highly susceptible to jailbreak...
Attack HIGH
Chinthana Wimalasuriya, Spyros Tragoudas
Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect...
5 months ago cs.CR cs.CV cs.LG
PDF
Attack HIGH
Zhaorun Chen, Xun Liu, Mintong Kang +4 more
As vision-language models (VLMs) gain prominence, their multimodal interfaces also introduce new safety vulnerabilities, making the safety evaluation...
5 months ago cs.AI cs.LG
PDF
Attack HIGH
Ruohao Guo, Afshin Oroojlooy, Roshan Sridhar +3 more
Despite recent rapid progress in AI safety, current large language models remain vulnerable to adversarial attacks in multi-turn interaction...
5 months ago cs.LG cs.AI cs.CL
PDF
Attack HIGH
Kedong Xiu, Churui Zeng, Tianhang Zheng +6 more
Existing gradient-based jailbreak attacks typically optimize an adversarial suffix to induce a fixed affirmative response, e.g., ``Sure, here...
5 months ago cs.CR cs.AI
PDF
Attack HIGH
Milad Nasr, Yanick Fratantonio, Luca Invernizzi +7 more
As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level...
5 months ago cs.CR cs.LG
PDF
Attack HIGH
John Hawkins, Aditya Pramar, Rodney Beard +1 more
Large Language Models (LLMs) suffer from a range of vulnerabilities that allow malicious users to solicit undesirable responses through manipulation...
5 months ago cs.CL cs.AI cs.CY
PDF
Attack HIGH
Isha Gupta, Rylan Schaeffer, Joshua Kazdan +2 more
The field of adversarial robustness has long established that adversarial examples can successfully transfer between image classifiers and that text...
5 months ago cs.LG cs.AI
PDF
Attack HIGH
Xiangfang Li, Yu Wang, Bo Li
With the rapid advancement of large language models (LLMs), ensuring their safe use becomes increasingly critical. Fine-tuning is a widely used...
Attack HIGH
Alexandrine Fortier, Thomas Thebaud, Jesús Villalba +2 more
Large Language Models (LLMs) and their multimodal extensions are becoming increasingly popular. One common approach to enable multimodality is to...
5 months ago cs.CL cs.CR cs.SD
PDF
Attack HIGH
Raik Dankworth, Gesina Schwalbe
Deep neural networks (NNs) for computer vision are vulnerable to adversarial attacks, i.e., miniscule malicious changes to inputs may induce...
5 months ago cs.CR cs.LG
PDF
Attack HIGH
Chenxiang Luo, David K. Y. Yau, Qun Song
Federated learning (FL) enables collaborative model training without sharing raw data but is vulnerable to gradient inversion attacks (GIAs), where...
5 months ago cs.CR cs.LG
PDF
Attack HIGH
Qinjian Zhao, Jiaqi Wang, Zhiqiang Gao +3 more
Large Language Models (LLMs) have achieved impressive performance across diverse natural language processing tasks, but their growing power also...
Attack HIGH
Xiaobao Wang, Ruoxiao Sun, Yujun Zhang +4 more
Graph Neural Networks (GNNs) have demonstrated strong performance across tasks such as node classification, link prediction, and graph...
5 months ago cs.LG cs.CR
PDF
Attack HIGH
Yein Park, Jungwoo Park, Jaewoo Kang
Large language models (LLMs), despite being safety-aligned, exhibit brittle refusal behaviors that can be circumvented by simple linguistic changes....
Attack HIGH
Yuepeng Hu, Zhengyuan Jiang, Mengyuan Li +4 more
Large language models (LLMs) are often modified after release through post-processing such as post-training or quantization, which makes it...
5 months ago cs.CR cs.CL
PDF
Attack HIGH
Yupei Liu, Yanting Wang, Yuqi Jia +2 more
Prompt injection attacks pose a pervasive threat to the security of Large Language Models (LLMs). State-of-the-art prevention-based defenses...
5 months ago cs.CR cs.AI
PDF
Track AI security vulnerabilities in real time
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act),
and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial