Revealing Algorithmic Deductive Circuits for Logical Reasoning
Phuong Minh Nguyen, Tien Huu Dang, Naoya Inoue
Why It Matters
What makes this one worth your time
Understanding the internal mechanisms of LLMs in logical reasoning can enhance model interpretability and guide the development of more efficient reasoning systems.
The study localizes attention heads in LLMs that drive logical reasoning processes.
Summary
The paper investigates how Large Language Models (LLMs) perform logical reasoning. Working under a symbolic-aided Chain-of-Thought (CoT) prompting framework, it identifies the attention heads responsible for individual reasoning steps and analyzes the types of information transferred among them.
Key contributions
- Localization, via causal mediation analysis, of the attention heads responsible for individual reasoning steps.
- Characterization of the types of information transferred among those attention heads.
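The idea behind causal mediation analysis for localizing heads can be sketched on a toy model. This is a minimal illustration, not the paper's procedure: the "model" below is a hypothetical sum of per-head outputs, and `run_model`, `indirect_effects`, and the weights are invented for the example. Each head's clean activation is patched into a run on corrupted input, and the fraction of the clean output that is restored is taken as that head's effect.

```python
def run_model(x, weights, patch=None):
    """Toy 'model' (hypothetical): head h outputs weights[h] * x, and the
    model output is the sum of all head outputs.
    `patch` maps a head index to an activation that overrides that head."""
    acts = [w * x for w in weights]
    if patch:
        for head, activation in patch.items():
            acts[head] = activation
    return sum(acts), acts


def indirect_effects(clean_x, corrupt_x, weights):
    """For each head, patch its clean activation into the corrupted run and
    report the share of the clean-vs-corrupted output gap that is restored."""
    clean_out, clean_acts = run_model(clean_x, weights)
    corrupt_out, _ = run_model(corrupt_x, weights)
    total = clean_out - corrupt_out
    if total == 0:
        raise ValueError("clean and corrupted runs give the same output")
    effects = []
    for head in range(len(weights)):
        patched_out, _ = run_model(corrupt_x, weights, patch={head: clean_acts[head]})
        effects.append((patched_out - corrupt_out) / total)
    return effects


# Assumed toy weights: only heads 1 and 3 influence the output.
effects = indirect_effects(clean_x=1.0, corrupt_x=0.0, weights=[0.0, 3.0, 0.0, 1.0])
print(effects)  # [0.0, 0.75, 0.0, 0.25]
```

In this toy setting the effects sum to 1 because the heads contribute additively; in a real transformer, heads interact through later layers, so per-head effects need not add up so neatly.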
Notable insights
- Only about 3% of attention heads are specialized for retrieving factual and rule-based information.
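- Higher layers in LLMs predominantly handle information integration and the emergence of global reasoning strategies, such as graph traversal algorithms, that coordinate multiple intermediate reasoning steps.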
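To make "global reasoning strategy" concrete, deduction over simple rules can be viewed as graph traversal: statements are nodes and each rule is a directed edge from premise to conclusion. The sketch below is an illustrative assumption, not the paper's prompting format; `derive` and the example rules are hypothetical. It uses breadth-first search to find a chain of rule applications from a starting fact to a goal.

```python
from collections import deque


def derive(start, goal, rules):
    """Breadth-first search over single-premise rules (premise, conclusion).
    Returns the chain of statements from start to goal, or None if the goal
    cannot be reached."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to recover the chain.
            chain = []
            while node is not None:
                chain.append(node)
                node = parents[node]
            return chain[::-1]
        for premise, conclusion in rules:
            if premise == node and conclusion not in parents:
                parents[conclusion] = node
                queue.append(conclusion)
    return None


# Hypothetical rule base for illustration only.
rules = [
    ("cat", "mammal"),
    ("mammal", "animal"),
    ("bird", "animal"),
    ("animal", "living thing"),
]
print(derive("cat", "living thing", rules))
# ['cat', 'mammal', 'animal', 'living thing']
```

Each hop corresponds to one intermediate reasoning step (retrieve a matching rule, apply it), while the search loop plays the role of the overall strategy that coordinates those steps.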
Possible limitations
- Not stated in the abstract
Abstract
arXiv:2605.27824v1
Recent studies have shown that Large Language Models (LLMs) can achieve strong reasoning performance by incorporating functional symbolic representations that abstractly describe graph traversal algorithms and step-by-step reasoning in few-shot learning settings. However, it remains unclear how LLMs genuinely understand the abstract meaning of each reasoning step and the overall algorithm from only a limited number of demonstrations. This work aims to localize the attention heads responsible for individual reasoning steps and characterize the types of information transferred among them. We first align constituent reasoning steps with their corresponding token logits under a symbolic-aided Chain-of-Thought (CoT) prompting framework. Our analysis shows that token positions that steer the reasoning process are associated with low confidence scores caused by constraints on satisfying reasoning behavior patterns in demonstrations. We then adopt causal mediation analysis techniques to identify the attention heads responsible for these patterns. In addition, our findings indicate that LLMs retrieve factual and rule-based information for individual sub-reasoning tasks through specialized attention heads (approximately 3% total heads), whereas higher layers predominantly facilitate information integration and the emergence of global reasoning strategies (e.g., graph traversal algorithms) that coordinate multiple intermediate reasoning steps to solve the overall task.
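The link between token logits and confidence can be illustrated with a short sketch. The abstract does not define its confidence score, so the code assumes one common choice, the maximum softmax probability at each position; the function names, the threshold, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one position's vocabulary logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_per_position(logit_rows):
    """Assumed confidence score: probability of the most likely token."""
    return [max(softmax(row)) for row in logit_rows]


def low_confidence_positions(logit_rows, threshold=0.5):
    """Indices of positions whose confidence falls below the threshold."""
    scores = confidence_per_position(logit_rows)
    return [i for i, score in enumerate(scores) if score < threshold]


# Toy logits over a 3-token vocabulary at three positions (made up).
toy_logits = [
    [8.0, 0.0, 0.0],  # one token dominates
    [1.0, 0.9, 0.8],  # several candidates compete
    [0.0, 7.0, 0.0],  # one token dominates
]
print(low_confidence_positions(toy_logits))  # [1]
```

Under this assumption, positions where several continuations compete, such as the choice of which rule to apply next, would surface as low-confidence positions, which is the kind of signal the abstract associates with steering the reasoning process.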