Defense MEDIUM
Enrico Ahlers, Daniel Passon, Yannic Noller +1 more
Machine learning models are increasingly present in our everyday lives; as a result, they become targets of adversarial attackers seeking to...
1 month ago cs.LG cs.AI cs.CR
Defense MEDIUM
Zijing Xu, Ziwei Ning, Tiancheng Hu +4 more
The rapid evolution of cyber threats has highlighted significant gaps in security knowledge integration. Cybersecurity Knowledge Graphs (CKGs)...
Defense MEDIUM
Weichen Yu, Ravi Mangal, Yinyi Luo +4 more
Large Language Models are rapidly becoming core components of modern software development workflows, yet ensuring code security remains challenging....
1 month ago cs.CR cs.SE
Defense LOW
Jayesh Choudhari, Piyush Kumar Singh
Domain fine-tuning is a common path to deploy small instruction-tuned language models as customer-support assistants, yet its effects on...
1 month ago cs.CR cs.LG
Defense MEDIUM
Kun Wang, Zherui Li, Zhenhong Zhou +8 more
Omni-modal Large Language Models (OLLMs) greatly expand LLMs' multimodal capabilities but also introduce cross-modal safety risks. However, a...
1 month ago cs.CR cs.AI cs.CL
Defense MEDIUM
Oliver Daniels, Perusha Moodley, Benjamin M. Marlin +1 more
Alignment audits aim to robustly identify hidden goals from strategic, situationally aware misaligned models. Despite this threat model, existing...
Defense MEDIUM
Yu Fu, Haz Sameen Shahgir, Huanli Gong +3 more
Large language models (LLMs) increasingly combine long-context processing with advanced reasoning, enabling them to retrieve and synthesize...
1 month ago cs.CL cs.CR
Defense MEDIUM
Yukun Jiang, Hai Huang, Mingjie Li +3 more
By introducing routers to selectively activate experts in Transformer layers, the mixture-of-experts (MoE) architecture significantly reduces...
1 month ago cs.LG cs.AI cs.CR
Defense MEDIUM
Shayan Ali Hassan, Tao Ni, Zafar Ayyub Qazi +1 more
Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language understanding, reasoning, and generation. However, these...
1 month ago cs.LG cs.CR
Defense LOW
Gautam Siddharth Kashyap, Mark Dras, Usman Naseem
Aligning Large Language Models (LLMs) with human values, that is, being helpful, harmless, and honest (HHH), is important for safe deployment....
Defense MEDIUM
Yunbei Zhang, Kai Mei, Ming Liu +5 more
We present the first large-scale empirical study of Moltbook, an AI-only social platform where 27,269 agents produced 137,485 posts and 345,580...
1 month ago cs.SI cs.AI
Defense MEDIUM
Chen Chen, Yuchen Sun, Jiaxin Gao +4 more
Large language models (LLMs) are increasingly deployed in security-sensitive applications, yet remain vulnerable to backdoor attacks. However,...
Defense MEDIUM
Hema Karnam Surendrababu, Nithin Nagaraj
Machine Learning (ML) models, including Large Language Models (LLMs), are characterized by a range of system-level attributes such as security and...
Defense LOW
Daniel Fein, Max Lamparth, Violet Xiang +2 more
Reward Models (RMs) are crucial for online alignment of language models (LMs) with human preferences. However, RM-based preference-tuning is...
1 month ago cs.CL cs.AI
Defense MEDIUM
Rohan Subramanian Thomas, Shikhar Shiromani, Abdullah Chaudhry +4 more
Prompt design significantly impacts the moral competence and safety alignment of large language models (LLMs), yet empirical comparisons remain...
1 month ago cs.AI cs.CL
Defense MEDIUM
Zhenxiong Yu, Zhi Yang, Zhiheng Jin +19 more
As large language models (LLMs) evolve into autonomous agents, their real-world applicability has expanded significantly, accompanied by new security...
1 month ago cs.CR cs.AI
Defense MEDIUM
Jiacheng Liang, Yuhui Wang, Tanqiu Jiang +1 more
Mixture-of-Experts (MoE) language models introduce unique challenges for safety alignment due to their sparse routing mechanisms, which can enable...
1 month ago cs.LG cs.AI cs.CR
Defense MEDIUM
Guang Yang, Xing Hu, Xiang Chen +1 more
Large language models (LLMs) for Verilog code generation are increasingly adopted in hardware design, yet remain vulnerable to backdoor attacks where...
1 month ago cs.SE cs.CR
Defense MEDIUM
Sidahmed Benabderrahmane, Petko Valtchev, James Cheney +1 more
Detecting rare and diverse anomalies in highly imbalanced datasets, such as Advanced Persistent Threats (APTs) in cybersecurity, remains a fundamental...
1 month ago cs.LG cs.AI cs.CR
Defense MEDIUM
Rohan Saxena
Fine-tuning language models on narrowly harmful data causes emergent misalignment (EM) -- behavioral failures extending far beyond training...
1 month ago cs.CL cs.AI