AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total

2,077

Attack

809

Benchmark

603

Defense

272

Tool

226

Survey

113

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 1861–1880 of 2,077 papers

Benchmark MEDIUM

Text-to-Image Models Leave Identifiable Signatures: Implications for Leaderboard Security

Ali Naseh, Anshuman Suri, Yuefeng Peng +3 more

Generative AI leaderboards are central to evaluating model capabilities, but remain vulnerable to manipulation. Among key adversarial objectives is...

5 months ago cs.LG cs.CR PDF

Benchmark LOW

MathRobust-LV: Evaluation of Large Language Models' Robustness to Linguistic Variations in Mathematical Reasoning

Neeraja Kirtane, Yuvraj Khanna, Peter Relan

Large language models excel on math benchmarks, but their math reasoning robustness to linguistic variation is underexplored. While recent work...

5 months ago cs.CL PDF

Tool LOW

Leveraging Large Language Models for Cybersecurity Risk Assessment -- A Case from Forestry Cyber-Physical Systems

Fikret Mert Gultekin, Oscar Lilja, Ranim Khojah +3 more

In safety-critical software systems, cybersecurity activities become essential, with risk assessment being one of the most critical. In many software...

5 months ago cs.SE cs.AI cs.CR PDF

Survey LOW

The Safety Challenge of World Models for Embodied AI Agents: A Review

Lorenzo Baraldi, Zifan Zeng, Chongzhe Zhang +9 more

The rapid progress in embodied artificial intelligence has highlighted the necessity for more advanced and integrated models that can perceive,...

5 months ago cs.AI cs.CV cs.RO PDF

Attack HIGH

Mitigating Premature Exploitation in Particle-based Monte Carlo for Inference-Time Scaling

Giorgio Giannone, Guangxuan Xu, Nikhil Shivakumar Nayak +4 more

Inference-Time Scaling (ITS) improves language models by allocating more computation at generation time. Particle Filtering (PF) has emerged as a...

5 months ago cs.LG cs.AI cs.CL PDF

Benchmark MEDIUM

DP-SNP-TIHMM: Differentially Private, Time-Inhomogeneous Hidden Markov Models for Synthesizing Genome-Wide Association Datasets

Shadi Rahimian, Mario Fritz

Single nucleotide polymorphism (SNP) datasets are fundamental to genetic studies but pose significant privacy risks when shared. The correlation of...

5 months ago cs.LG cs.CR q-bio.GN PDF

Attack HIGH

Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks

Nouar Aldahoul, Yasir Zaki

The rapid spread of misinformation on digital platforms threatens public discourse, emotional stability, and decision-making. While prior work has...

5 months ago cs.CL cs.AI cs.CR PDF

Attack HIGH

LatentBreak: Jailbreaking Large Language Models through Latent Space Feedback

Raffaele Mura, Giorgio Piras, Kamilė Lukošiūtė +3 more

Jailbreaks are adversarial attacks designed to bypass the built-in safety mechanisms of large language models. Automated jailbreaks typically...

5 months ago cs.CL cs.AI cs.LG PDF

Benchmark MEDIUM

Towards Reliable and Practical LLM Security Evaluations via Bayesian Modelling

Mary Llewellyn, Annie Gray, Josh Collyer +1 more

Before adopting a new large language model (LLM) architecture, it is critical to understand vulnerabilities accurately. Existing evaluations can be...

5 months ago cs.CR cs.AI cs.CL PDF

Attack HIGH

Membership Inference Attacks on Tokenizers of Large Language Models

Meng Tong, Yuntao Du, Kejiang Chen +2 more

Membership inference attacks (MIAs) are widely used to assess the privacy risks associated with machine learning models. However, when these attacks...

5 months ago cs.CR cs.AI PDF

Tool MEDIUM

AutoPentester: An LLM Agent-based Framework for Automated Pentesting

Yasod Ginige, Akila Niroshan, Sajal Jain +1 more

Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale...

5 months ago cs.CR cs.AI PDF

Survey MEDIUM

The Role of Federated Learning in Improving Financial Security: A Survey

Cade Houston Kennedy, Amr Hilal, Morteza Momeni

With the growth of digital financial systems, robust security and privacy have become a concern for financial institutions. Even though traditional...

5 months ago cs.CR cs.AI PDF

Defense LOW

Evaluating LLM Safety Across Child Development Stages: A Simulated Agent Approach

Abhejay Murali, Saleh Afroogh, Kevin Chen +3 more

Current safety alignment for Large Language Models (LLMs) implicitly optimizes for a "modal adult user," leaving models vulnerable to distributional...

5 months ago cs.CY PDF

Other HIGH

Vul-R2: A Reasoning LLM for Automated Vulnerability Repair

Xin-Cheng Wen, Zirui Lin, Yijun Yang +2 more

The exponential increase in software vulnerabilities has created an urgent need for automatic vulnerability repair (AVR) solutions. Recent research...

5 months ago cs.AI cs.SE PDF

Attack MEDIUM

Adversarial Reinforcement Learning for Large Language Model Agent Safety

Zizhao Wang, Dingcheng Li, Vaishakh Keshava +4 more

Large Language Model (LLM) agents can leverage tools such as Google Search to complete complex tasks. However, this tool usage introduces the risk of...

5 months ago cs.LG cs.AI cs.CL PDF

Attack HIGH

AutoDAN-Reasoning: Enhancing Strategies Exploration based Jailbreak Attacks with Test-Time Scaling

Xiaogeng Liu, Chaowei Xiao

Recent advancements in jailbreaking large language models (LLMs), such as AutoDAN-Turbo, have demonstrated the power of automated strategy discovery....

5 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

WeatherArchive-Bench: Benchmarking Retrieval-Augmented Reasoning for Historical Weather Archives

Yongan Yu, Xianda Du, Qingchen Hu +7 more

Historical archives on weather events are collections of enduring primary source records that offer rich, untapped narratives of how societies have...

5 months ago cs.CL cs.AI PDF

Defense LOW

RAG Makes Guardrails Unsafe? Investigating Robustness of Guardrails under RAG-style Contexts

Yining She, Daniel W. Peterson, Marianne Menglin Liu +4 more

With the increasing adoption of large language models (LLMs), ensuring the safety of LLM systems has become a pressing concern. External LLM-based...

5 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

DP-Adam-AC: Privacy-preserving Fine-Tuning of Localizable Language Models Using Adam Optimization with Adaptive Clipping

Ruoxing Yang

Large language models (LLMs) such as ChatGPT have evolved into powerful and ubiquitous tools. Fine-tuning on small datasets allows LLMs to acquire...

5 months ago cs.LG cs.AI cs.CR PDF

Benchmark HIGH

Indirect Prompt Injections: Are Firewalls All You Need, or Stronger Benchmarks?

Rishika Bhagwatkar, Kevin Kasa, Abhay Puri +5 more

AI agents are vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external content or tool outputs cause...

5 months ago cs.CR PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial