Blue Teaming Function-Calling Agents
Greta Dolcetti Giulio Zizzo Sergio Maffeis
Abstract
We present an experimental evaluation of the robustness of four open-source LLMs with claimed function-calling capabilities against three different attacks, and we measure the effectiveness of eight different defences. Our results show that these models are not safe by default, and that the defences are not yet deployable in real-world scenarios.
Metadata
- Comment
- This work has been accepted to appear at the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent)