
Targeted Bit-Flip Attacks on LLM-Based Agents

Jialai Wang, Ya Wen, Zhongmou Liu, Yuxiao Wu, Bingyi He, Zongpeng Li, Ee-Chien Chang

cs.CR cs.AI

Published: March 7, 2026

Updated: March 7, 2026

Abstract

Targeted bit-flip attacks (BFAs) exploit hardware faults to manipulate model parameters, posing a significant security threat. While prior work targets single-step inference models (e.g., image classifiers), LLM-based agents with multi-stage pipelines and external tools present new attack surfaces, which remain unexplored. This work introduces Flip-Agent, the first targeted BFA framework for LLM-based agents, manipulating both final outputs and tool invocations. Our experiments show that Flip-Agent significantly outperforms existing targeted BFAs on real-world agent tasks, revealing a critical vulnerability in LLM-based agent systems.
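
To make the phrase "manipulate model parameters" concrete, the following is a minimal sketch of what a single bit flip does to one stored weight. It assumes the weight is held as a signed 8-bit integer in two's complement form, which is a common quantized representation but is not stated in the abstract. The function name `flip_bit_int8` is hypothetical, and the sketch does not represent the Flip-Agent method, which the abstract does not describe in detail.

```python
def flip_bit_int8(value: int, bit: int) -> int:
    """Flip one bit of a signed 8-bit integer (two's complement).

    Illustrative assumption: the weight is an int8. Real deployments may
    use other formats (fp16, int4, ...), where the effect differs.
    """
    if not -128 <= value <= 127:
        raise ValueError("value must fit in a signed 8-bit integer")
    if not 0 <= bit <= 7:
        raise ValueError("bit index must be in 0..7")
    raw = value & 0xFF              # view the weight as an unsigned byte
    raw ^= 1 << bit                 # flip the chosen bit
    return raw - 256 if raw >= 128 else raw  # back to a signed value


weight = 3                          # bit pattern 0b00000011
print(flip_bit_int8(weight, 0))     # flips the lowest bit: 2
print(flip_bit_int8(weight, 7))     # flips the sign bit: -125
```

Under this assumption, the damage depends heavily on which bit is hit: flipping the lowest bit nudges the weight by 1, while flipping the sign bit moves it by 128. This is one reason a targeted attack would need to choose which bits to flip rather than flipping at random.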

Metadata

Comment: To appear in DAC 2026 (Design Automation Conference)
