ToxSearch: Evolving Prompts for Toxicity Search in Large Language Models
Onkar Shelar, Travis Desell
Large Language Models remain vulnerable to adversarial prompts that elicit toxic content even after safety alignment. We present ToxSearch, a...
Jiaji Ma, Puja Trivedi, Danai Koutra
Text-attributed graphs (TAGs), which combine structural and textual node information, are ubiquitous across many domains. Recent work integrates...
Yuting Tan, Yi Huang, Zhuo Li
Backdoor attacks on large language models (LLMs) typically couple a secret trigger to an explicit malicious output. We show that this explicit...
Yikun Li, Matteo Grella, Daniel Nahmias +5 more
In recent years, Infrastructure as Code (IaC) has emerged as a critical approach for managing and provisioning IT infrastructure through code and...
Hasini Jayathilaka
Prompt injection attacks are an emerging threat to large language models (LLMs), enabling malicious users to manipulate outputs through carefully...
Rui Wang, Zeming Wei, Xiyue Zhang +1 more
Deep Neural Networks (DNNs) are known to be vulnerable to various adversarial perturbations. To address the safety concerns arising from these...
Gil Goren, Shahar Katz, Lior Wolf
Large Language Models (LLMs) are vulnerable to adversarial attacks that bypass safety guidelines and generate harmful content. Mitigating these...
Jie Chen, Liangmin Wang
Fuzzing is a widely used technique for detecting vulnerabilities in smart contracts; it generates transaction sequences to explore the execution...
Thong Bach, Dung Nguyen, Thao Minh Le +1 more
Large language models exhibit systematic vulnerabilities to adversarial attacks despite extensive safety alignment. We provide a mechanistic analysis...
Jiayu Li, Yunhan Zhao, Xiang Zheng +4 more
Vision-Language-Action (VLA) models enable robots to interpret natural-language instructions and perform diverse tasks, yet their integration of...
Sajad U P
Phishing and related cyber threats are becoming more varied and technologically advanced. Among these, email-based phishing remains the most dominant...
Shaowei Guan, Yu Zhai, Zhengyu Zhang +2 more
Large Language Models (LLMs) are increasingly vulnerable to adversarial attacks that can subtly manipulate their outputs. While various defense...
Shanmin Wang, Dongdong Zhao
Knowledge Distillation (KD) is essential for compressing large models, yet relying on pre-trained "teacher" models downloaded from third-party...
Hao Li, Jiajun He, Guangshuo Wang +3 more
Retrieval-Augmented Generation (RAG) enhances large language models by integrating external knowledge, but reliance on proprietary or sensitive...
Lucas Fenaux, Christopher Srinivasa, Florian Kerschbaum
Transparency and security are both central to Responsible AI, but they may conflict in adversarial settings. We investigate the strategic effect of...
Xingshuang Lin, Binbin Zhao, Jinwen Wang +3 more
Smart Contract Reusable Components (SCRs) play a vital role in accelerating the development of business-specific contracts by promoting modularity and...
Gioliano de Oliveira Braga, Pedro Henrique dos Santos Rocha, Rafael Pimenta de Mattos Paixão +3 more
Wi-Fi Channel State Information (CSI) has been repeatedly proposed as a biometric modality, often with reports of high accuracy and operational...
Yanbo Dai, Zongjie Li, Zhenlan Ji +1 more
Large language models (LLMs) have achieved remarkable success across a wide range of natural language processing tasks, demonstrating human-level...
Lama Sleem, Jerome Francois, Lujun Li +3 more
Jailbreak attacks designed to bypass safety mechanisms pose a serious threat by prompting LLMs to generate harmful or inappropriate content, despite...
Shaowei Guan, Hin Chi Kwok, Ngai Fong Law +3 more
Retrieval-augmented generation (RAG) has rapidly emerged as a transformative approach for integrating large language models into clinical and...