SHIELD: Classifier-Guided Prompting for Robust and Safer LVLMs
Juan Ren, Mark Dras, Usman Naseem
Large Vision-Language Models (LVLMs) unlock powerful multimodal reasoning but also expand the attack surface, particularly through adversarial inputs...
João A. Leite, Arnav Arora, Silvia Gargova +5 more
Large Language Models (LLMs) can generate human-like disinformation, yet their ability to personalise such content across languages and demographics...
Ruben Belo, Marta Guimaraes, Claudia Soares
Large Language Models are susceptible to jailbreak attacks that bypass built-in safety guardrails (e.g., by tricking the model with adversarial...
Siyuan Li, Aodu Wulianghai, Xi Lin +4 more
With the increasing integration of large language models (LLMs) into open-domain writing, detecting machine-generated text has become a critical task...
Daniel Pulido-Cortázar, Daniel Gibert, Felip Manyà
Over the last decade, machine learning has been extensively applied to identify malicious Android applications. However, such approaches remain...
Blazej Manczak, Eric Lin, Francisco Eiras +2 more
Large language models (LLMs) are rapidly transitioning into medical clinical use, yet their reliability under realistic, multi-turn interactions...
Han Zhu, Juntao Dai, Jiaming Ji +8 more
With the widespread use of Multi-modal Large Language Models (MLLMs), safety issues have become a growing concern. Multi-turn dialogues, which are...
Zhenyu Mao, Jacky Keung, Fengji Zhang +3 more
The increasing demand for software development has driven interest in automating software engineering (SE) tasks using Large Language Models (LLMs)....
Lipeng He, Vasisht Duddu, N. Asokan
Chatbot providers (e.g., OpenAI) rely on tiered subscription schemes to generate revenue, offering basic models for free users, and advanced models...
Deeksha Hareesha Kulal, Chidozie Princewill Arannonu, Afsah Anwar +2 more
Phishing remains a critical cybersecurity threat, especially with the advent of large language models (LLMs) capable of generating highly convincing...
Shuo Chen, Zonggen Li, Zhen Han +7 more
Deep Research (DR) agents built on Large Language Models (LLMs) can perform complex, multi-step research by decomposing tasks, retrieving online...
Dominik Schwarz
The security of Large Language Model (LLM) applications is fundamentally challenged by "form-first" attacks like prompt injection and jailbreaking,...
Sarah Ball, Andreas Haupt
Generative models are increasingly paired with safety classifiers that filter harmful or undesirable outputs. A common strategy is to fine-tune the...
Jiayu Ding, Lei Cui, Li Dong +2 more
Recent advances in Large Language Models (LLMs) show that extending the length of reasoning chains significantly improves performance on complex...
Sean Oesch, Jack Hutchins, Luke Koch +1 more
In living off the land attacks, malicious actors use legitimate tools and processes already present on a system to avoid detection. In this paper, we...
Rui Xu, Jiawei Chen, Zhaoxia Yin +2 more
The widespread use of large language models (LLMs) and open-source code has raised ethical and security concerns regarding the distribution and...
Jiahao Liu, Bonan Ruan, Xianglin Yang +5 more
LLM-based agents have demonstrated promising adaptability in real-world applications. However, these agents remain vulnerable to a wide range of...
Alexander Sternfeld, Andrei Kucharavy, Ljiljana Dolamic
Large Language Models (LLMs) have shown remarkable proficiency in code generation tasks across various programming languages. However, their outputs...
Zhuochen Yang, Kar Wai Fok, Vrizlynn L. L. Thing
Large language models have gained widespread attention recently, but their potential security vulnerabilities, especially privacy leakage, are also...
Qizhou Peng, Yang Zheng, Yu Wen +2 more
Reinforcement learning (RL) has been an important machine learning paradigm for solving long-horizon sequential decision-making problems under...