LLM Agent Reasoning Robotics Architecture

Say the Mission, Execute the Swarm: Agent-Enhanced LLM Reasoning in the Web-of-Drones

Andrea Iannoli, Lorenzo Gigli, Luca Sciullo, Angelo Trotta, Marco Di Felice

Published May 7, 2026


Editorial review: 6.8

Relevance: 0.525

Freshness: 0.000

Why It Matters

What makes this one worth your time

This research explores the potential of LLMs in real-time UAV swarm management, a growing area of interest in autonomous systems and robotics, highlighting both capabilities and current limitations.

A framework for using LLMs in UAV swarm control shows promise but highlights execution challenges.

Summary

The paper presents a framework for UAV swarm control using large language models (LLMs) to interpret natural language mission objectives and execute them autonomously. It integrates an LLM-based Agent Core with a Model Context Protocol gateway and a Web-of-Drones abstraction to enable structured interactions and safe actuation. The framework is evaluated using simulations, revealing that while LLMs have strong reasoning abilities, they struggle with reliable execution without explicit grounding and support.
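To make the "Things as tools" idea concrete, the sketch below shows one way a drone described in the W3C WoT style could be turned into tool descriptors that an LLM agent can call. It is an illustration only, not the paper's implementation: the drone name, the `takeoff` and `land` actions, the altitude bounds, and the `td_to_tools` helper are all assumptions, and the dictionary is a simplified fragment rather than a complete, valid Thing Description (it omits security definitions and forms).

```python
# Hypothetical, simplified Thing Description fragment for one drone.
# All names and limits here are illustrative assumptions.
DRONE_TD = {
    "@context": "https://www.w3.org/2022/wot/td/v1.1",
    "title": "drone-1",
    "properties": {
        # Read-only state the agent can observe.
        "position": {"type": "object", "readOnly": True},
    },
    "actions": {
        "takeoff": {
            "description": "Climb to a target altitude in metres.",
            "input": {
                "type": "object",
                "properties": {
                    "altitude_m": {"type": "number", "minimum": 1, "maximum": 50}
                },
                "required": ["altitude_m"],
            },
        },
        "land": {"description": "Land at the current position."},
    },
}


def td_to_tools(td):
    """Map each action of a Thing Description to a tool descriptor.

    The output shape (name, description, inputSchema) mirrors the fields
    commonly used for tool definitions; the mapping itself is an assumption.
    """
    tools = []
    for name, action in sorted(td.get("actions", {}).items()):
        tools.append(
            {
                "name": f"{td['title']}.{name}",
                "description": action.get("description", ""),
                # Actions without input still get an empty object schema.
                "inputSchema": action.get(
                    "input", {"type": "object", "properties": {}}
                ),
            }
        )
    return tools


TOOLS = td_to_tools(DRONE_TD)
```

Because the agent only sees typed tool descriptors, it can act on the swarm by choosing a tool and filling in arguments, which matches the abstract's point that actuation does not rely on code generation.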

Key contributions

Development of a mission-agnostic, agent-enhanced LLM framework for UAV swarm control.
Integration of LLMs with a Model Context Protocol gateway and Web-of-Drones abstraction.
Evaluation of the framework using simulations with multiple LLMs and swarm missions.

Notable insights

Token consumption is not a reliable indicator of execution quality in LLM-driven swarm control.
Task-specific planning tools and runtime guardrails significantly enhance the robustness of LLM-based systems.
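The abstract does not describe how the runtime guardrails work, so the following is only a minimal sketch of what such a check could look like: a function that validates an LLM-proposed waypoint before it reaches the autopilot. The `Limits` values, the `check_goto` name, and the two rules (altitude ceiling, minimum separation between drones) are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical safety limits; the numbers are illustrative."""
    max_altitude_m: float = 50.0
    min_separation_m: float = 5.0


def check_goto(drone_id, target, positions, limits=Limits()):
    """Validate a proposed (x, y, z) waypoint for one drone.

    positions maps drone ids to their current (x, y, z) in metres.
    Returns (ok, reason) so the agent can be told why a call was rejected.
    """
    _, _, z = target
    if not 0.0 < z <= limits.max_altitude_m:
        return False, "altitude outside allowed range"
    for other_id, pos in positions.items():
        if other_id == drone_id:
            continue  # a drone cannot conflict with itself
        if math.dist(target, pos) < limits.min_separation_m:
            return False, f"too close to {other_id}"
    return True, "ok"
```

Returning a reason string rather than raising lets the rejection be fed back to the model as a tool result, so it can replan instead of failing the mission outright.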

Possible limitations

Current LLMs struggle with reliable execution without explicit grounding and execution support.

Abstract

arXiv:2605.03788v1

Large Language Models (LLMs) are increasingly explored as high-level reasoning engines for cyber-physical systems, yet their application to real-time UAV swarm management remains challenging due to heterogeneous interfaces, limited grounding, and the need for long-running closed-loop execution. This paper presents a mission-agnostic, agent-enhanced LLM framework for UAV swarm control, where users express mission objectives in natural language and the system autonomously executes them through grounded, real-time interactions. The proposed architecture combines an LLM-based Agent Core with a Model Context Protocol (MCP) gateway and a Web-of-Drones abstraction based on W3C Web of Things (WoT) standards. By exposing drones, sensors, and services as standardized WoT Things, the framework enables structured tool-based interaction, continuous state observation, and safe actuation without relying on code generation. We evaluate the framework using ArduPilot-based simulation across four swarm missions and six state-of-the-art LLMs. Results show that, despite strong reasoning abilities, current general-purpose LLMs still struggle to achieve reliable execution, even for simple swarm tasks, when operating without explicit grounding and execution support. Task-specific planning tools and runtime guardrails substantially improve robustness, while token consumption alone is not indicative of execution quality or reliability.