Auto-Tuning Safety Guardrails for Black-Box Large Language Models
Perry Abdulkadir
Large language models (LLMs) are increasingly deployed behind safety guardrails such as system prompts and content filters, especially in settings...
Dang-Khoa Nguyen, Gia-Thang Ho, Quang-Minh Pham +5 more
Software supply chain attacks targeting the npm ecosystem have become increasingly sophisticated, leveraging obfuscation and complex logic to evade...
Andrew Adiletta, Kathryn Adiletta, Kemal Derya +1 more
The rapid deployment of Large Language Models (LLMs) has created an urgent need for enhanced security and privacy measures in Machine Learning (ML)....
Manon Kempermann, Sai Suresh Macharla Vasu, Mahalakshmi Raveenthiran +2 more
Safety evaluations of large language models (LLMs) typically focus on universal risks like dangerous capabilities or undesirable propensities....
Najmul Hasan, Prashanth BusiReddyGari, Haitao Zhao +3 more
Email phishing is one of the most prevalent and globally consequential vectors of cyber intrusion. As systems increasingly deploy Large Language...
Sohely Jahan, Ruimin Sun
As medical large language models (LLMs) become increasingly integrated into clinical workflows, concerns around alignment robustness and safety are...
Mohamed Elmahallawy, Sanjay Madria, Samuel Frimpong
Underground mining operations depend on sensor networks to monitor critical parameters such as temperature, gas concentration, and miner movement,...
Wenjie Zhang, Yun Lin, Chun Fung Amos Kwok +5 more
Detecting anomalies in web applications, which are important infrastructure for running modern companies and governments, is crucial for providing...
Xiaoqi Li, Hailu Kuang, Wenkai Li +2 more
Traditional approaches for smart contract analysis often rely on intermediate representations such as abstract syntax trees, control-flow graphs, or...
Jehyeok Yeon, Federico Cinus, Yifan Wu +1 more
Large language models (LLMs) face critical safety challenges, as they can be manipulated to generate harmful content through adversarial prompts and...
Sheng Liu, Panos Papadimitratos
Federated Learning (FL) has drawn the attention of the Intelligent Transportation Systems (ITS) community. FL can train various models for ITS tasks,...
Jason Vega, Gagandeep Singh
A frustratingly easy technique known as the prefilling attack has been shown to effectively circumvent the safety alignment of frontier LLMs by...
Jiale Zhao, Xing Mou, Jinlin Wu +7 more
Medical Multimodal Large Language Models (Medical MLLMs) have achieved remarkable progress in specialized medical tasks; however, research into their...
Biagio Montaruli, Luca Compagna, Serena Elisa Ponta +1 more
The rise of supply chain attacks via malicious Python packages demands robust detection solutions. Current approaches, however, overlook two critical...
Weiwei Wang
Catastrophic forgetting remains a fundamental challenge in continual learning for large language models. Recent work revealed that performance...
Rongzhe Wei, Peizhi Niu, Xinjie Shen +7 more
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Existing approaches...
Henry Onyeka, Emmanuel Samson, Liang Hong +3 more
The increasing complexity of IoT edge networks presents significant challenges for anomaly detection, particularly in identifying sophisticated...
Neemesh Yadav, Francesco Ortu, Jiarui Liu +5 more
Large Language Models (LLMs) are trained to refuse to respond to harmful content. However, systematic analyses of whether this behavior is truly a...
Junbo Zhang, Ran Chen, Qianli Zhou +2 more
Large language models demonstrate powerful capabilities across various natural language processing tasks, yet they also harbor safety...
Onat Gungor, Roshan Sood, Jiasheng Zhou +1 more
Large Language Models (LLMs) are highly effective for cybersecurity question answering (QA) but are difficult to deploy on edge devices due to their...