Benchmark HIGH
Woorim Han, Yeongjun Kwak, Miseon Yu +4 more
Learning-based automated vulnerability repair (AVR) techniques that utilize fine-tuned language models have shown promise in generating vulnerability...
Benchmark HIGH
Chinmay Pushkar, Sanchit Kabra, Dhruv Kumar +1 more
Large Language Models (LLMs) have demonstrated significant potential in automated software security, particularly in vulnerability detection....
2 months ago cs.CR cs.AI
Benchmark HIGH
Zhenlei Ye, Xiaobing Sun, Sicong Cao +2 more
The advances of large language models (LLMs) have paved the way for automated software vulnerability repair approaches, which iteratively refine the...
Benchmark HIGH
Liming Lu, Xiang Gu, Junyu Huang +5 more
Large Language Models (LLMs) are increasingly used in agentic systems, where their interactions with diverse tools and environments create complex,...
Benchmark HIGH
Zhang Wei, Peilu Hu, Zhenyuan Wei +16 more
The increasing deployment of large language models (LLMs) in safety-critical applications raises fundamental challenges in systematically evaluating...
3 months ago cs.CR cs.CL
Benchmark HIGH
Safwan Shaheer, G. M. Refatul Islam, Mohammad Rafid Hamid +1 more
In this fast-evolving area of LLMs, our paper discusses the significant security risk presented by prompt injection attacks. It focuses on small...
3 months ago cs.CR cs.AI
Benchmark HIGH
Chaomeng Lu, Bert Lagaisse
Vulnerability detection methods based on deep learning (DL) have shown strong performance on benchmark datasets, yet their real-world effectiveness...
3 months ago cs.CR cs.LG cs.SE
Benchmark HIGH
Devanshu Sahoo, Vasudev Majhi, Arjun Neekhra +3 more
The use of Large Language Models (LLMs) as automatic judges for code evaluation is becoming increasingly prevalent in academic environments. But...
3 months ago cs.SE cs.AI
Benchmark HIGH
Futa Waseda, Shojiro Yamabe, Daiki Shiono +2 more
Large vision-language models (LVLMs) are vulnerable to typographic attacks, where misleading text within an image overrides visual understanding....
Benchmark HIGH
Xiaojun Jia, Jie Liao, Qi Guo +11 more
Recent advances in multi-modal large language models (MLLMs) have enabled unified perception-reasoning capabilities, yet these systems remain highly...
3 months ago cs.CR cs.CV
Benchmark HIGH
Caleb Gross
Security research is fundamentally a problem of resource constraint and consequent prioritization. There is simply too much attack surface and too...
3 months ago cs.CR cs.IR
Benchmark HIGH
Xiuyuan Chen, Jian Zhao, Yuxiang He +10 more
While the deployment of large language models (LLMs) in high-value industries continues to expand, the systematic assessment of their safety against...
Benchmark HIGH
Songwen Zhao, Danqing Wang, Kexun Zhang +3 more
Vibe coding is a new programming paradigm in which human engineers instruct large language model (LLM) agents to complete complex coding tasks with...
3 months ago cs.SE cs.CL
Benchmark HIGH
Jiawei Chen, Yang Yang, Chao Yu +6 more
Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical...
3 months ago cs.CR cs.AI
Benchmark HIGH
Juncheng Li, Yige Li, Hanxun Huang +5 more
Backdoor attacks undermine the reliability and trustworthiness of machine learning systems by injecting hidden behaviors that can be maliciously...
Benchmark HIGH
Zhijie Chen, Xiang Chen, Ziming Li +2 more
Context: Software Vulnerability Assessment (SVA) plays a vital role in evaluating and ranking vulnerabilities in software systems to ensure their...
Benchmark HIGH
Chunyang Li, Zifeng Kang, Junwei Zhang +4 more
The adoption of Vision-Language Models (VLMs) in embodied AI agents, while being effective, brings safety concerns such as jailbreaking. Prior work...
4 months ago cs.CR cs.CY cs.RO
Benchmark HIGH
Henry Wong, Clement Fung, Weiran Lin +3 more
To autonomously control vehicles, driving agents use outputs from a combination of machine-learning (ML) models, controller logic, and custom...
4 months ago cs.CR cs.CV cs.LG
Benchmark HIGH
Jiayu Li, Yunhan Zhao, Xiang Zheng +4 more
Vision-Language-Action (VLA) models enable robots to interpret natural-language instructions and perform diverse tasks, yet their integration of...
4 months ago cs.CR cs.AI cs.CV
Benchmark HIGH
Zhishen Sun, Guang Dai, Haishan Ye
LLMs demonstrate performance comparable to human abilities in complex tasks such as mathematical reasoning, but their robustness in mathematical...