AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 481–500 of 598 papers

Benchmark MEDIUM

Monitoring LLM-based Multi-Agent Systems Against Corruptions via Node Evaluation

Chengcan Wu, Zhixin Zhang, Mingqian Xu +2 more

Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS...

5 months ago cs.CR cs.AI cs.LG PDF

Benchmark LOW

Plural Voices, Single Agent: Towards Inclusive AI in Multi-User Domestic Spaces

Joydeep Chandra, Satyam Kumar Navneet

Domestic AI agents face ethical, autonomy, and inclusion challenges, particularly for overlooked groups like children, the elderly, and neurodivergent...

5 months ago cs.HC cs.AI cs.LG PDF

Benchmark LOW

Dynamic Evaluation for Oversensitivity in LLMs

Sophia Xiao Pu, Sitao Cheng, Xin Eric Wang +1 more

Oversensitivity occurs when language models defensively reject prompts that are actually benign. This behavior not only disrupts user interactions...

5 months ago cs.CL PDF

Benchmark MEDIUM

Exploring Membership Inference Vulnerabilities in Clinical Large Language Models

Alexander Nemecek, Zebin Yun, Zahra Rahmani +4 more

As large language models (LLMs) become progressively more embedded in clinical decision-support, documentation, and patient-information systems,...

5 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Evaluating Large Language Models in detecting Secrets in Android Apps

Marco Alecci, Jordan Samhi, Tegawendé F. Bissyandé +1 more

Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often...

5 months ago cs.CR cs.SE PDF

Benchmark MEDIUM

Pay Attention to the Triggers: Constructing Backdoors That Survive Distillation

Giovanni De Muri, Mark Vero, Robin Staab +1 more

LLMs are often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models....

5 months ago cs.LG cs.AI cs.CR PDF

Benchmark HIGH

Prompting the Priorities: A First Look at Evaluating LLMs for Vulnerability Triage and Prioritization

Osama Al Haddad, Muhammad Ikram, Ejaz Ahmed +1 more

Security analysts face increasing pressure to triage large and complex vulnerability backlogs. Large Language Models (LLMs) offer a potential aid by...

5 months ago cs.CR PDF

Benchmark LOW

Grounding or Guessing? Visual Signals for Detecting Hallucinations in Sign Language Translation

Yasser Hamidullah, Koel Dutta Chowdhury, Yusser Al Ghussin +4 more

Hallucination, where models generate fluent text unsupported by visual evidence, remains a major flaw in vision-language models and is particularly...

5 months ago cs.CL PDF

Benchmark MEDIUM

DeepTx: Real-Time Transaction Risk Analysis via Multi-Modal Features and LLM Reasoning

Yixuan Liu, Xinlei Li, Yi Li

Phishing attacks in Web3 ecosystems are increasingly sophisticated, exploiting deceptive contract logic, malicious frontend scripts, and token...

5 months ago cs.CR PDF

Benchmark LOW

From Retrieval to Generation: Unifying External and Parametric Knowledge for Medical Question Answering

Lei Li, Xiao Zhou, Yingying Zhang +1 more

Medical question answering (QA) requires extensive access to domain-specific knowledge. A promising direction is to enhance large language models...

5 months ago cs.CL cs.AI PDF

Benchmark LOW

RESCUE: Retrieval Augmented Secure Code Generation

Jiahao Shi, Tianyi Zhang

Despite recent advances, Large Language Models (LLMs) still generate vulnerable code. Retrieval-Augmented Generation (RAG) has the potential to...

5 months ago cs.CR cs.LG cs.SE PDF

Benchmark HIGH

Black-Box Evasion Attacks on Data-Driven Open RAN Apps: Tailored Design and Experimental Evaluation

Pranshav Gajjar, Molham Khoja, Abiodun Ganiyu +4 more

The impending adoption of Open Radio Access Network (O-RAN) is fueling innovation in the RAN towards data-driven operation. Unlike traditional RAN...

5 months ago cs.CR cs.NI PDF

Benchmark HIGH

BlueCodeAgent: A Blue Teaming Agent Enabled by Automated Red Teaming for CodeGen AI

Chengquan Guo, Yuzhou Nie, Chulin Xie +3 more

As large language models (LLMs) are increasingly used for code generation, concerns over the security risks have grown substantially. Early research...

5 months ago cs.SE PDF

Benchmark LOW

Who's Asking? Simulating Role-Based Questions for Conversational AI Evaluation

Navreet Kaur, Hoda Ayad, Hayoung Jung +3 more

Language model users often embed personal and social context in their questions. The asker's role -- implicit in how the question is framed --...

5 months ago cs.CL cs.AI cs.CY PDF

Benchmark MEDIUM

The Chameleon Nature of LLMs: Quantifying Multi-Turn Stance Instability in Search-Enabled Language Models

Shivam Ratnakar, Sanjay Raghavendra

Integration of Large Language Models with search/retrieval engines has become ubiquitous, yet these systems harbor a critical vulnerability that...

5 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

ATA: A Neuro-Symbolic Approach to Implement Autonomous and Trustworthy Agents

David Peer, Sebastian Stabinger

Large Language Models (LLMs) have demonstrated impressive capabilities, yet their deployment in high-stakes domains is hindered by inherent...

5 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

EditMark: Watermarking Large Language Models based on Model Editing

Shuai Li, Kejiang Chen, Jun Jiang +5 more

Large Language Models (LLMs) have demonstrated remarkable capabilities, but their training requires extensive data and computational resources,...

5 months ago cs.CR PDF

Benchmark LOW

When Intelligence Fails: An Empirical Study on Why LLMs Struggle with Password Cracking

Mohammad Abdul Rehman, Syed Imad Ali Shah, Abbas Anwar +2 more

The remarkable capabilities of Large Language Models (LLMs) in natural language understanding and generation have sparked interest in their potential...

5 months ago cs.CR cs.AI cs.LG PDF

Benchmark LOW

DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios

Yao Huang, Yitong Sun, Yichi Zhang +3 more

Despite the remarkable advances of Large Language Models (LLMs) across diverse cognitive tasks, the rapid enhancement of these capabilities also...

5 months ago cs.CL cs.AI cs.LG PDF

Benchmark LOW

VERA-MH Concept Paper

Luca Belli, Kate Bentley, Will Alexander +5 more

We introduce VERA-MH (Validation of Ethical and Responsible AI in Mental Health), an automated evaluation of the safety of AI chatbots used in mental...

5 months ago cs.CY cs.AI cs.HC PDF
