ARM: Discovering Agentic Reasoning Modules for Generalizable Multi-Agent Systems

Bohan Yao, Shiva Krishna Reddy Malay, Vikas Yadav

Published May 21, 2026 | Featured #5 | In the daily list May 22, 2026


Daily score: 65.6

Editorial review: 7.0

Relevance: 0.525

Freshness: 0.722

Why It Matters

What makes this one worth your time

This approach could streamline the design of multi-agent systems, reducing the need for manual engineering and improving performance across diverse tasks and models.

ARM enhances multi-agent systems by optimizing Chain of Thought reasoning for superior performance and generalization.

Summary

The paper introduces a new paradigm for designing multi-agent systems by focusing on optimizing Chain of Thought reasoning through the development of Agentic Reasoning Modules (ARM). These modules are discovered via a tree search and evolved using execution trace reflections, resulting in systems that outperform existing methods and generalize well across different models and tasks.

Key contributions

Introduction of the Agentic Reasoning Module (ARM) as a generalization of Chain of Thought reasoning.
Demonstration of ARM's superior performance and generalization in multi-agent systems.

Notable insights

Utilizing a tree search over code space to discover reasoning modules.
Evolving reasoning modules through reflection on execution traces.
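
The two insights above can be illustrated with a toy sketch. It is not the paper's algorithm: the abstract gives no implementation details, so every name here (`Node`, `execute`, `reflect`, `search`) is hypothetical. A candidate module is stood in for by a tuple of step names rather than real code, a task is a set of required "skills", and reflection simply counts which skills the execution trace reports as missing. The search starts from a single-step root (standing in for plain CoT), always expands the best-scoring unexpanded node, and lets the reflection output decide which mutations to try.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Module = Tuple[str, ...]   # hypothetical stand-in for a module's source code
Task = Set[str]            # hypothetical: the skills a task requires


@dataclass
class Node:
    module: Module
    score: float
    trace: List[str]
    parent: Optional["Node"] = None
    expanded: bool = False


def execute(module: Module, tasks: List[Task]) -> Tuple[float, List[str]]:
    """Toy executor: a task counts as solved when the module covers its skills."""
    trace: List[str] = []
    solved = 0
    for i, needed in enumerate(tasks):
        missing = sorted(needed - set(module))
        if missing:
            trace.extend(f"task {i}: missing {skill}" for skill in missing)
        else:
            solved += 1
    return solved / len(tasks), trace


def reflect(trace: List[str], k: int = 2) -> List[str]:
    """Toy reflection: propose the k steps most often reported missing."""
    counts = Counter(line.rsplit(" ", 1)[-1] for line in trace)
    return [skill for skill, _ in counts.most_common(k)]


def search(tasks: List[Task], budget: int = 5) -> Node:
    """Best-first tree search whose mutations come from trace reflection."""
    root_module: Module = ("think",)  # stands in for the simple CoT module
    tree = [Node(root_module, *execute(root_module, tasks))]
    for _ in range(budget):
        frontier = [n for n in tree if not n.expanded]
        if not frontier:
            break
        parent = max(frontier, key=lambda n: n.score)
        parent.expanded = True
        for step in reflect(parent.trace):
            child = parent.module + (step,)  # mutation: append the suggested step
            tree.append(Node(child, *execute(child, tasks), parent=parent))
    return max(tree, key=lambda n: n.score)
```

In a real system the executor would run LLM-backed code on validation problems and the reflection step would itself be an LLM reading the traces; the sketch only shows how the pieces connect.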

Possible limitations

Not stated in the abstract

Abstract

arXiv:2510.05746v2

Large Language Model (LLM)-powered Multi-agent systems (MAS) have achieved state-of-the-art results on various complex reasoning tasks. Recent works have proposed techniques to automate the design of MASes, eliminating the need for manual engineering. However, these techniques perform poorly, often achieving similar or inferior performance to simple baselines. Furthermore, they require computationally expensive re-discovery of architectures for each new task domain and expensive data annotation on domains without existing labeled validation sets. A critical insight is that simple Chain of Thought (CoT) reasoning often performs competitively with these complex systems, suggesting that the fundamental reasoning unit of MASes, CoT, warrants further investigation. To this end, we present a new paradigm for automatic MAS design that pivots the focus to optimizing CoT reasoning. We introduce the Agentic Reasoning Module (ARM), an agentic generalization of CoT where each granular reasoning step is executed by a specialized reasoning module. This module is discovered through a tree search over the code space, starting from a simple CoT module and evolved using mutations informed by reflection on execution traces. The resulting ARM acts as a versatile reasoning building block which can be utilized as a direct recursive loop or as a subroutine in a learned meta-orchestrator. Our approach significantly outperforms both manually designed MASes and state-of-the-art automatic MAS design methods. Crucially, MASes built with ARM exhibit superb generalization, maintaining high performance across different foundation models and task domains without further optimization.
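
The abstract says the resulting module can be used "as a direct recursive loop", with each reasoning step produced by the module. A minimal sketch of that usage pattern follows, under assumed interfaces: `StepModule`, `solve`, and `toy_step_module` are hypothetical names, the recursion is written as an iterative loop, and the toy module does arithmetic instead of calling an LLM.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a step module maps (question, prior steps) to
# (next step text, done flag). A real module would wrap LLM calls.
StepModule = Callable[[str, List[str]], Tuple[str, bool]]


def solve(question: str, step_module: StepModule, max_steps: int = 8) -> List[str]:
    """Apply the step module repeatedly until it signals completion."""
    steps: List[str] = []
    for _ in range(max_steps):
        step, done = step_module(question, steps)
        steps.append(step)
        if done:
            break
    return steps


def toy_step_module(question: str, steps: List[str]) -> Tuple[str, bool]:
    """Toy stand-in: adds the numbers in the question, one per step."""
    numbers = [int(tok) for tok in question.split() if tok.isdigit()]
    total = sum(numbers[: len(steps) + 1])
    done = len(steps) + 1 >= len(numbers)
    return f"running total = {total}", done
```

Calling `solve("add 2 3 4", toy_step_module)` yields one step per number, ending when the module reports it is done. The `max_steps` cap is a design choice of this sketch to guarantee termination, not something stated in the abstract.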