AI Agent Systems: Architectures, Applications, and Evaluation
Bin Xu
AI agents -- systems that combine foundation models with reasoning, planning, memory, and tool use -- are rapidly becoming a practical interface...
2,077+ academic papers on AI security, attacks, and defenses
Jinwei Hu, Xinmiao Huang, Youcheng Sun +2 more
As large language models (LLMs) transition to autonomous agents synthesizing real-time information, their reasoning capabilities introduce an...
Junyu Liu, Zirui Li, Qian Niu +7 more
As Large Language Models (LLMs) are increasingly deployed in the healthcare field, it becomes essential to carefully evaluate their medical safety before...
Songyang Liu, Chaozhuo Li, Rui Pu +5 more
Jailbreak attacks present a significant challenge to the safety of Large Language Models (LLMs), yet current automated evaluation methods largely...
Muntasir Adnan, Carlos C. N. Kuhn
Large Language Models have become integral to software development, yet they frequently generate vulnerable code. Existing code vulnerability...
Zhuoran Tan, Run Hao, Jeremy Singer +2 more
Tool-augmented LLM agents raise new security risks: tool executions can introduce runtime-only behaviors, including prompt injection and unintended...
Milad Rahmati, Nima Rahmati
The proliferation of Internet of Things devices in critical infrastructure has created unprecedented cybersecurity challenges, necessitating...
Muhammad Bilal, Omer Tariq, Hasan Ahmed
Timing and burst patterns can leak through encryption, and an adaptive adversary can exploit them. This undermines metadata-only detection in a...
Sixue Xing, Xuanye Xia, Kerui Wu +3 more
Clinical trial failure remains a central bottleneck in drug development, where minor protocol design flaws can irreversibly compromise outcomes...
Md Hasan Saju, Maher Muhtadi, Akramul Azim
The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in...
Yiming Liang, Yizhi Li, Yantao Du +14 more
Benchmarks play a crucial role in tracking the rapid advancement of large language models (LLMs) and identifying their capability boundaries....
Bohan Liang, Zijian Chen, Qi Jia +3 more
Stock prediction, a subject closely related to people's investment activities in fully dynamic and live environments, has been widely studied....
Muhammad Abdullahi Said, Muhammad Sammani Sani
As Large Language Models (LLMs) integrate into critical global infrastructure, the assumption that safety alignment transfers zero-shot from English...
Jingyu Zhang
Customer-service LLM agents increasingly make policy-bound decisions (refunds, rebooking, billing disputes), but the same "helpful" interaction...
Zhe Huang, Hao Wen, Aiming Hao +6 more
Multimodal Large Language Models (MLLMs) have made remarkable progress in video understanding. However, they suffer from a critical vulnerability: an...
Heba Osama, Omar Elebiary, Youssef Qassim +4 more
Web applications increasingly face evasive and polymorphic attack payloads, yet traditional web application firewalls (WAFs) based on static rule...
Manu, Yi Guo, Kanchana Thilakarathna +5 more
Large Language Models (LLMs) can be driven into over-generation, emitting thousands of tokens before producing an end-of-sequence (EOS) token. This...
Karolina Korgul, Yushi Yang, Arkadiusz Drohomirecki +7 more
Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking. Their reliance...
Kerem Zaman, Shashank Srivastava
Recent work, using the Biasing Features metric, labels a CoT as unfaithful if it omits a prompt-injected hint that affected the prediction. We argue...
Woorim Han, Yeongjun Kwak, Miseon Yu +4 more
Learning-based automated vulnerability repair (AVR) techniques that utilize fine-tuned language models have shown promise in generating vulnerability...