Rectifying Adversarial Examples Using Their Vulnerabilities
Fumiya Morimoto, Ryuto Morita, Satoshi Ono
Deep neural network-based classifiers are prone to errors when processing adversarial examples (AEs). AEs are minimally perturbed input data...
Xiaoze Liu, Weichen Yu, Matt Fredrikson +2 more
The open-weight language model ecosystem is increasingly defined by model composition techniques (such as weight merging, speculative decoding, and...
Pankayaraj Pathmanathan, Michael-Andrei Panaitescu-Liess, Cho-Yu Jason Chiang +1 more
Retrieval-Augmented Generation (RAG) has emerged as a promising paradigm to enhance large language models (LLMs) with external knowledge, reducing...
Ruixuan Huang, Qingyue Wang, Hantao Huang +4 more
Mixture-of-Experts architectures have become the standard for scaling large language models due to their superior parameter efficiency. To...
Tsogt-Ochir Enkhbayar
Warning-framed content in training data (e.g., "DO NOT USE - this code is vulnerable") does not, it turns out, teach language models to avoid the...
Tian Li, Bo Lin, Shangwen Wang +1 more
Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development, yet its security...
Haoyang Li, Mingjin Li, Jinxin Zuo +5 more
LLM-based code agents (e.g., ChatGPT Codex) are increasingly deployed as detectors for code review and security auditing tasks. Although CoT-enhanced...
Ahmed M. Hussain, Salahuddin Salahuddin, Panos Papadimitratos
Current Large Language Model (LLM) safety approaches focus on explicitly harmful content while overlooking a critical vulnerability: the inability...
Yifan Yao, Baojuan Wang, Jinhao Duan +4 more
Chat-based cybercrime has emerged as a pervasive threat, with attackers leveraging real-time messaging platforms to conduct scams that rely on...
Honglin Mu, Jinghao Liu, Kaiyang Wan +4 more
Large Language Models (LLMs) excel at text comprehension and generation, making them ideal for automated tasks like code review and content...
Rahul Yumlembam, Biju Issac, Seibu Mary Jacob +1 more
Since the Internet of Things (IoT) is widely adopted through Android applications, detecting malicious Android apps is essential. In recent years,...
Samruddhi Baviskar
Machine learning models used in financial decision systems operate in nonstationary economic environments, yet adversarial robustness is typically...
A. A. Gde Yogi Pramana, Jason Ray, Anthony Jaya +1 more
Vision-Language Models (VLMs) show significant promise for Medical Visual Question Answering (VQA), yet their deployment in clinical settings is...
Tung-Ling Li, Yuhao Wu, Hongliang Liu
Reward models and LLM-as-a-Judge systems are central to modern post-training pipelines such as RLHF, DPO, and RLAIF, where they provide scalar...
Yidong Chai, Yi Liu, Mohammadreza Ebrahimi +2 more
Social media platforms are plagued by harmful content such as hate speech, misinformation, and extremist rhetoric. Machine learning (ML) models are...
Zhexi Lu, Hongliang Chi, Nathalie Baracaldo +3 more
Membership inference attacks (MIAs) pose a critical privacy threat to fine-tuned large language models (LLMs), especially when models are adapted to...
Seok-Hyun Ga, Chun-Yen Chang
The rapid development of Generative AI is bringing innovative changes to education and assessment. As the prevalence of students utilizing AI for...
Piercosma Bisconti, Marcello Galisai, Matteo Prandi +6 more
Safety mechanisms in LLMs remain vulnerable to attacks that reframe harmful requests through culturally coded structures. We introduce Adversarial...
David Lindner, Charlie Griffin, Tomek Korbak +4 more
Automated control monitors could play an important role in overseeing highly capable AI agents that we do not fully trust. Prior work has explored...
Samruddhi Baviskar
We evaluate adversarial robustness in tabular machine learning models used in financial decision making. Using credit scoring and fraud detection...