Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in real-world settings faces several...
Safety alignment of large language models currently faces a central challenge: existing alignment techniques often prioritize mitigating responses to...
Chongyu Fan, Changsheng Wang, Yancheng Huang +2 more
Machine unlearning for large language models (LLMs) aims to remove undesired data, knowledge, and behaviors (e.g., for safety, privacy, or copyright)...
Large Language Model (LLM)-based Multi-Agent Systems (MAS) have emerged as a powerful paradigm for tackling complex, multi-step tasks across diverse...
Christos Ziakas, Nicholas Loo, Nishita Jain +1 more
Automated red-teaming has emerged as a scalable approach for auditing Large Language Models (LLMs) prior to deployment, yet existing approaches lack...
Abhishek Anand, Matthias C. Caro, Ari Karchmer +1 more
Quantum learning from remotely accessed quantum compute and data must address two key challenges: verifying the correctness of data and ensuring the...
Although Large Language Models (LLMs) offer promising capabilities for automated code generation, they often produce insecure code that threatens software...
This paper presents the vision, scientific contributions, and technical details of RedTWIZ: an adaptive and diverse multi-turn red teaming framework,...
Models are susceptible to adversarial out-of-distribution (OOD) data despite large training-compute investments in their robustification. Zaremba...
Computer-use agent (CUA) frameworks, powered by large language models (LLMs) or multimodal LLMs (MLLMs), are rapidly maturing as assistants that can...
Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis
Jailbreaking large language models (LLMs) has emerged as a pressing concern with the increasing prevalence and accessibility of conversational LLMs....