Say the Mission, Execute the Swarm: Agent-Enhanced LLM Reasoning in the Web-of-Drones
Andrea Iannoli, Lorenzo Gigli, Luca Sciullo, Angelo Trotta, Marco Di Felice
Why It Matters
What makes this one worth your time
This research explores the potential of LLMs in real-time UAV swarm management, a growing area of interest in autonomous systems and robotics, highlighting both capabilities and current limitations.
A framework for using LLMs in UAV swarm control shows promise but highlights execution challenges.
Summary
The paper presents a framework for UAV swarm control in which large language models (LLMs) interpret mission objectives stated in natural language and execute them autonomously. It combines an LLM-based Agent Core with a Model Context Protocol (MCP) gateway and a Web-of-Drones abstraction based on W3C Web of Things (WoT) standards, so that drones, sensors, and services are exposed as standardized WoT Things, enabling structured tool-based interaction and safe actuation without code generation. The framework is evaluated in ArduPilot-based simulation across four swarm missions and six LLMs. The results show that, despite strong reasoning abilities, current general-purpose LLMs struggle to execute reliably without explicit grounding and execution support.
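To make the idea of exposing a drone as a WoT Thing and offering it to an agent as tools more concrete, here is a minimal sketch. It is not the paper's implementation: the top-level keys (`title`, `properties`, `actions`) follow the W3C WoT Thing Description vocabulary, but the drone name, the property and action names, and the `td_to_tools` helper are all hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical, simplified Thing Description for one drone. The drone-specific
# names and fields are illustrative assumptions, not taken from the paper.
DRONE_TD: dict[str, Any] = {
    "title": "drone-1",
    "properties": {
        "position": {"type": "object", "readOnly": True},
        "battery": {"type": "number", "readOnly": True},
    },
    "actions": {
        "takeoff": {
            "input": {
                "type": "object",
                "properties": {"altitude_m": {"type": "number"}},
                "required": ["altitude_m"],
            }
        },
        "goto": {
            "input": {
                "type": "object",
                "properties": {
                    "lat": {"type": "number"},
                    "lon": {"type": "number"},
                    "altitude_m": {"type": "number"},
                },
                "required": ["lat", "lon", "altitude_m"],
            }
        },
    },
}


def td_to_tools(td: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten a Thing Description into tool descriptors an agent could call.

    Hypothetical helper: each property becomes a read tool with no arguments,
    and each action becomes a tool whose input schema is the action's input.
    """
    tools = []
    for name in td.get("properties", {}):
        tools.append({
            "name": f"{td['title']}.read_{name}",
            "kind": "read",
            "input_schema": {"type": "object", "properties": {}},
        })
    for name, action in td.get("actions", {}).items():
        tools.append({
            "name": f"{td['title']}.{name}",
            "kind": "action",
            "input_schema": action.get("input", {"type": "object"}),
        })
    return tools


tools = td_to_tools(DRONE_TD)
tool_names = [t["name"] for t in tools]
print(tool_names)
```

In a design like this, the agent only ever sees a fixed list of typed tools rather than free-form code, which is one plausible reading of how structured interaction without code generation could work.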
Key contributions
- Development of a mission-agnostic, agent-enhanced LLM framework for UAV swarm control.
- Integration of LLMs with a Model Context Protocol gateway and Web-of-Drones abstraction.
- Evaluation of the framework in ArduPilot-based simulation across four swarm missions and six state-of-the-art LLMs.
Notable insights
- Token consumption is not a reliable indicator of execution quality in LLM-driven swarm control.
- Task-specific planning tools and runtime guardrails significantly enhance the robustness of LLM-based systems.
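The abstract does not describe what the runtime guardrails check, so the following is only a sketch of the general idea: validate each tool call the LLM proposes against fixed limits before anything is actuated. The `Limits` values, the action names, and the `check_action` function are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    # Hypothetical safety limits; real values depend on the mission and platform.
    max_altitude_m: float = 50.0
    min_battery_pct: float = 20.0


def check_action(action: str, params: dict, battery_pct: float,
                 limits: Limits = Limits()) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed tool call; nothing is actuated here."""
    if action in {"takeoff", "goto"}:
        altitude = params.get("altitude_m")
        # Reject missing or non-numeric altitudes (bool is excluded explicitly
        # because it is a subclass of int in Python).
        if isinstance(altitude, bool) or not isinstance(altitude, (int, float)):
            return False, "altitude_m missing or not a number"
        if not 0 < altitude <= limits.max_altitude_m:
            return False, f"altitude_m must be in (0, {limits.max_altitude_m}]"
        if battery_pct < limits.min_battery_pct:
            return False, "battery below minimum for flight actions"
    return True, "ok"


print(check_action("takeoff", {"altitude_m": 10}, battery_pct=80.0))
print(check_action("goto", {"altitude_m": 120}, battery_pct=80.0))
```

Returning a reason string alongside the verdict lets a rejected call be fed back to the model as an observation, so it can correct its plan instead of failing silently.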
Possible limitations
- Current LLMs struggle with reliable execution without explicit grounding and execution support.
- Further limitations are not stated in the abstract.
Abstract
arXiv:2605.03788v1
Large Language Models (LLMs) are increasingly explored as high-level reasoning engines for cyber-physical systems, yet their application to real-time UAV swarm management remains challenging due to heterogeneous interfaces, limited grounding, and the need for long-running closed-loop execution. This paper presents a mission-agnostic, agent-enhanced LLM framework for UAV swarm control, where users express mission objectives in natural language and the system autonomously executes them through grounded, real-time interactions. The proposed architecture combines an LLM-based Agent Core with a Model Context Protocol (MCP) gateway and a Web-of-Drones abstraction based on W3C Web of Things (WoT) standards. By exposing drones, sensors, and services as standardized WoT Things, the framework enables structured tool-based interaction, continuous state observation, and safe actuation without relying on code generation. We evaluate the framework using ArduPilot-based simulation across four swarm missions and six state-of-the-art LLMs. Results show that, despite strong reasoning abilities, current general-purpose LLMs still struggle to achieve reliable execution, even for simple swarm tasks, when operating without explicit grounding and execution support. Task-specific planning tools and runtime guardrails substantially improve robustness, while token consumption alone is not indicative of execution quality or reliability.