Attack HIGH
Peng Ding, Jun Kuang, Wen Sun +5 more
Large language models (LLMs) remain vulnerable to jailbreaking attacks despite their impressive capabilities. Investigating these weaknesses is...
Attack HIGH
Phil Blandfort, Robert Graham
Activation probes are attractive monitors for AI systems due to low cost and latency, but their real-world robustness remains underexplored. We ask:...
4 months ago cs.LG cs.AI
Attack HIGH
Ruofan Liu, Yun Lin, Zhiyong Huang +1 more
Large language models (LLMs) are increasingly integrated into IT infrastructures, where they process user data according to predefined instructions....
4 months ago cs.CR cs.AI
Attack HIGH
Xin Yao, Haiyang Zhao, Yimin Chen +3 more
The Contrastive Language-Image Pretraining (CLIP) model has significantly advanced vision-language modeling by aligning image-text pairs from...
4 months ago cs.CV cs.CR cs.LG
Attack HIGH
Kayua Oleques Paim, Rodrigo Brandao Mansilha, Diego Kreutz +2 more
The rapid proliferation of Large Language Models (LLMs) has raised significant concerns about their security against adversarial attacks. In this...
4 months ago cs.CR cs.AI cs.LG
Attack HIGH
Alex Irpan, Alexander Matt Turner, Mark Kurzeja +2 more
An LLM's factuality and refusal training can be compromised by simple changes to a prompt. Models often adopt user beliefs (sycophancy) or satisfy...
4 months ago cs.LG cs.AI
Attack HIGH
David Schmotz, Sahar Abdelnabi, Maksym Andriushchenko
Enabling continual learning in LLMs remains a key unresolved research challenge. In a recent announcement, a frontier LLM company took a step towards...
Attack HIGH
Zirui Cheng, Jikai Sun, Anjun Gao +4 more
Large language models (LLMs) have transformed natural language processing (NLP), enabling applications from content generation to decision support....
4 months ago cs.CR cs.IR cs.LG
Attack HIGH
Ziyao Cui, Minxing Zhang, Jian Pei
Privacy concerns have become increasingly critical in modern AI and data science applications, where sensitive information is collected, analyzed,...
4 months ago cs.CR cs.LG
Attack HIGH
Yufan Liu, Wanqian Zhang, Huashan Chen +4 more
Despite rapid advancements in text-to-image (T2I) models, their safety mechanisms are vulnerable to adversarial prompts, which maliciously generate...
Attack HIGH
Yuchong Xie, Zesen Liu, Mingyu Luo +7 more
Modern coding agents integrated into IDEs orchestrate powerful tools and high-privilege system access, creating a high-stakes attack surface. Prior...
4 months ago cs.CR cs.AI
Attack HIGH
Zesen Liu, Zhixiang Zhang, Yuchong Xie +1 more
LLM-powered agents often use prompt compression to reduce inference costs, but this introduces a new security risk. Compression modules, which are...
5 months ago cs.CR cs.AI
Attack HIGH
Dongyi Liu, Jiangtong Li, Dawei Cheng +1 more
Graph Neural Networks (GNNs) are vulnerable to backdoor attacks, where adversaries implant malicious triggers to manipulate model predictions....
5 months ago cs.CR cs.LG
Attack HIGH
Anum Paracha, Junaid Arshad, Mohamed Ben Farah +1 more
Data poisoning attacks are a potential threat to machine learning (ML) models, aiming to manipulate training datasets to disrupt their performance....
5 months ago cs.CR cs.LG
Attack HIGH
Pavlos Ntais
Large language models (LLMs) remain vulnerable to sophisticated prompt engineering attacks that exploit contextual framing to bypass safety...
5 months ago cs.CR cs.AI cs.CL
Attack HIGH
Havva Alizadeh Noughabi, Julien Serbanescu, Fattane Zarrinkalam +1 more
Despite recent advances, Large Language Models remain vulnerable to jailbreak attacks that bypass alignment safeguards and elicit harmful outputs....
5 months ago cs.CL cs.AI
Attack HIGH
Kieu Dang, Phung Lai, NhatHai Phan +3 more
Large language models (LLMs) demonstrate remarkable capabilities across various tasks. However, their deployment introduces significant risks related...
Attack HIGH
Mahavir Dabas, Tran Huynh, Nikhil Reddy Billa +8 more
Large language models remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Defending against novel...
Attack HIGH
Xingwei Zhong, Kar Wai Fok, Vrizlynn L. L. Thing
Multimodal large language models (MLLMs) combine visual and textual modalities to process vision-language tasks. However, MLLMs are...
Attack HIGH
Mingrui Liu, Sixiao Zhang, Cheng Long +1 more
As Large Language Models (LLMs) become integral to computing infrastructure, safety alignment serves as the primary security control preventing the...