AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 481–500 of 986 papers

Benchmark MEDIUM

It's a TRAP! Task-Redirecting Agent Persuasion Benchmark for Web Agents

Karolina Korgul, Yushi Yang, Arkadiusz Drohomirecki +7 more

Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking. Their reliance...

2 months ago cs.HC cs.AI cs.MA PDF

Attack MEDIUM

Learning from Negative Examples: Why Warning-Framed Training Data Teaches What It Warns Against

Tsogt-Ochir Enkhbayar

Warning-framed content in training data (e.g., "DO NOT USE - this code is vulnerable") does not, it turns out, teach language models to avoid the...

3 months ago cs.LG cs.CL cs.CR PDF

Attack MEDIUM

Exploring the Security Threats of Retriever Backdoors in Retrieval-Augmented Code Generation

Tian Li, Bo Lin, Shangwen Wang +1 more

Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development, yet its security...

3 months ago cs.CR cs.SE PDF

Attack MEDIUM

CoTDeceptor: Adversarial Code Obfuscation Against CoT-Enhanced LLM Code Agents

Haoyang Li, Mingjin Li, Jinxin Zuo +5 more

LLM-based code agents (e.g., ChatGPT Codex) are increasingly deployed as detectors for code review and security auditing tasks. Although CoT-enhanced...

3 months ago cs.CR cs.MA PDF

Benchmark MEDIUM

Casting a SPELL: Sentence Pairing Exploration for LLM Limitation-breaking

Yifan Huang, Xiaojun Jia, Wenbo Guo +4 more

Large language models (LLMs) have revolutionized software development through AI-assisted coding tools, enabling developers with limited programming...

3 months ago cs.CR cs.AI cs.SE PDF

Attack MEDIUM

Beyond Context: Large Language Models Failure to Grasp Users Intent

Ahmed M. Hussain, Salahuddin Salahuddin, Panos Papadimitratos

Current Large Language Models (LLMs) safety approaches focus on explicitly harmful content while overlooking a critical vulnerability: the inability...

3 months ago cs.AI cs.CL cs.CR PDF

Benchmark MEDIUM

LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics

Jiashuo Liu, Jiayun Wu, Chunjie Wu +5 more

The rapid proliferation of Large Language Models (LLMs) and diverse specialized benchmarks necessitates a shift from fragmented, task-specific...

3 months ago cs.LG cs.AI cs.PF PDF

Attack MEDIUM

The Imitation Game: Using Large Language Models as Chatbots to Combat Chat-Based Cybercrimes

Yifan Yao, Baojuan Wang, Jinhao Duan +4 more

Chat-based cybercrime has emerged as a pervasive threat, with attackers leveraging real-time messaging platforms to conduct scams that rely on...

3 months ago cs.CR PDF

Defense MEDIUM

Safety Alignment of LMs via Non-cooperative Games

Anselm Paulus, Ilia Kulikov, Brandon Amos +4 more

Ensuring the safety of language models (LMs) while maintaining their usefulness remains a critical challenge in AI alignment. Current approaches rely...

3 months ago cs.AI PDF

Benchmark MEDIUM

Evasion-Resilient Detection of DNS-over-HTTPS Data Exfiltration: A Practical Evaluation and Toolkit

Adam Elaoumari

The purpose of this project is to assess how well defenders can detect DNS-over-HTTPS (DoH) file exfiltration, and which evasion strategies can be...

3 months ago cs.CR cs.AI cs.NI PDF

Survey MEDIUM

ChatGPT: Excellent Paper! Accept It. Editor: Imposter Found! Review Rejected

Kanchon Gharami, Sanjiv Kumar Sarkar, Yongxin Liu +1 more

Large Language Models (LLMs) like ChatGPT are now widely used in writing and reviewing scientific papers. While this trend accelerates publication...

3 months ago cs.CR PDF

Survey MEDIUM

AprielGuard

Jaykumar Kasundra, Anjaneya Praharaj, Sourabh Surana +11 more

Safeguarding large language models (LLMs) against unsafe or adversarial behavior is critical as they are increasingly deployed in conversational and...

3 months ago cs.CL PDF

Benchmark MEDIUM

Optimistic TEE-Rollups: A Hybrid Architecture for Scalable and Verifiable Generative AI Inference on Blockchain

Aaron Chan, Alex Ding, Frank Chen +3 more

The rapid integration of Large Language Models (LLMs) into decentralized physical infrastructure networks (DePIN) is currently bottlenecked by the...

3 months ago cs.CR PDF

Attack MEDIUM

AI Security Beyond Core Domains: Resume Screening as a Case Study of Adversarial Vulnerabilities in Specialized LLM Applications

Honglin Mu, Jinghao Liu, Kaiyang Wan +4 more

Large Language Models (LLMs) excel at text comprehension and generation, making them ideal for automated tasks like code review and content...

3 months ago cs.CL cs.AI PDF

Other MEDIUM

On the Effectiveness of Instruction-Tuning Local LLMs for Identifying Software Vulnerabilities

Sangryu Park, Gihyuk Ko, Homook Cho

Large Language Models (LLMs) show significant promise in automating software vulnerability analysis, a critical task given the impact of security...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

IoT-based Android Malware Detection Using Graph Neural Network With Adversarial Defense

Rahul Yumlembam, Biju Issac, Seibu Mary Jacob +1 more

Since the Internet of Things (IoT) is widely adopted using Android applications, detecting malicious Android apps is essential. In recent years,...

3 months ago cs.CR cs.AI cs.LG PDF

Tool MEDIUM

ReGAIN: Retrieval-Grounded AI Framework for Network Traffic Analysis

Shaghayegh Shajarian, Kennedy Marsh, James Benson +2 more

Modern networks generate vast, heterogeneous traffic that must be continuously analyzed for security and performance. Traditional network traffic...

3 months ago cs.LG cs.AI cs.CR PDF

Attack MEDIUM

Conditional Adversarial Fragility in Financial Machine Learning under Macroeconomic Stress

Samruddhi Baviskar

Machine learning models used in financial decision systems operate in nonstationary economic environments, yet adversarial robustness is typically...

3 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

GuardEval: A Multi-Perspective Benchmark for Evaluating Safety, Fairness, and Robustness in LLM Moderators

Naseem Machlovi, Maryam Saleki, Ruhul Amin +5 more

As large language models (LLMs) become deeply embedded in daily life, the urgent need for safer moderation systems, distinguishing between naive from...

3 months ago cs.CL cs.AI cs.HC PDF

Benchmark MEDIUM

A Multi-Perspective Benchmark and Moderation Model for Evaluating Safety and Adversarial Robustness

Naseem Machlovi, Maryam Saleki, Ruhul Amin +5 more

As large language models (LLMs) become deeply embedded in daily life, the urgent need for safer moderation systems that distinguish between naive and...

3 months ago cs.CL cs.AI cs.HC PDF
