Abrar Shahid, Ibteeker Mahir Ishum, AKM Tahmidul Haque +2 more
This paper presents a controlled study of adversarial reinforcement learning in network security through a custom OpenAI Gym environment that models...
Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect...
As vision-language models (VLMs) gain prominence, their multimodal interfaces also introduce new safety vulnerabilities, making the safety evaluation...
Milad Nasr, Yanick Fratantonio, Luca Invernizzi +7 more
As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level...
Large Language Models (LLMs) suffer from a range of vulnerabilities that allow malicious users to solicit undesirable responses through manipulation...
Deterministic pseudo-random number generators (PRNGs) used in generative artificial intelligence (GAI) models produce predictable patterns vulnerable...
As large language models (LLMs) advance, ensuring AI safety and alignment is paramount. One popular approach is prompt guards, lightweight mechanisms...
Isha Gupta, Rylan Schaeffer, Joshua Kazdan +2 more
The field of adversarial robustness has long established that adversarial examples can successfully transfer between image classifiers and that text...