Integrated and Cross-Architecture Interpretation of LLM Reasoning

Leonardo Matthew Yauw, Wei-Bin Kou, Yujiu Yang

Published May 28, 2026

Editorial review: 7.2

Relevance: 0.466

Freshness: 0.000

Why It Matters

What makes this one worth your time

Understanding LLM reasoning is crucial for improving model transparency and trustworthiness, which are essential for real-world applications.

A novel framework for interpreting LLM reasoning across architectures.

Summary

The paper introduces the Integrated, cross-Architecture Reasoning (IAR) framework to enhance the interpretability of LLM reasoning by analyzing reasoning-crucial tokens across different model architectures and layers.

Key contributions

Introduction of the IAR framework for LLM reasoning interpretability.
Development of a method combining bandwidth-calibrated Mutual Information Peak (MIP) with Tukey IQR peak detection to isolate reasoning-crucial tokens.
Application of a Jaccard stability metric to validate reasoning quality across domains.

Notable insights

The use of bandwidth-calibrated MIP and Tukey IQR peak-detection offers a refined method for isolating important tokens.
Overlap analysis between MIP-picked tokens and Deep-Thinking Ratio (DTR) deep tokens traces how reasoning-crucial tokens evolve across layers and indicates whether they are also computation-intensive.
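
As a rough illustration of the Tukey IQR peak-detection idea mentioned above, the sketch below flags positions whose score exceeds the upper Tukey fence, Q3 + 1.5 × IQR. It is not the paper's implementation: the function name `tukey_peaks`, the multiplier `k=1.5`, the quartile interpolation method, and the score values are all assumptions chosen for illustration, and the bandwidth-calibrated MIP estimation itself is not shown.

```python
from statistics import quantiles


def tukey_peaks(scores, k=1.5):
    """Return indices whose score lies above Tukey's upper fence, Q3 + k * IQR."""
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > upper_fence]


# Hypothetical per-token scores (made up; stand-ins for mutual-information estimates).
scores = [0.10, 0.12, 0.11, 0.09, 0.95, 0.13, 0.10, 0.12, 0.88, 0.11]
peaks = tukey_peaks(scores)
print(peaks)  # [4, 8]
```

Only the upper fence is used here because the goal is to find peaks; a two-sided outlier test would also check Q1 - k × IQR.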

Possible limitations

Not stated in the abstract.

Abstract

arXiv:2605.28006v1 (cross-listed)

Understanding how LLMs reason is hindered by a practical asymmetry: while their generated outputs are observable, the underlying reasoning patterns remain opaque. Relying on a single probe, such as Mutual Information Peak (MIP) or Deep-Thinking Ratio (DTR), risks underestimating the genuine inferential structure. To address this deficiency, we present an Integrated, cross-Architecture Reasoning (IAR) framework, designed to provide a unified approach to LLM reasoning interpretability. Specifically, we first propose to use bandwidth-calibrated MIP coupled with Tukey IQR peak detection to isolate reasoning-crucial tokens at the output layer. Second, we perform an overlap analysis between MIP-picked tokens and DTR-deep tokens to trace the cross-layer trajectories of those tokens. This also reveals whether reasoning-crucial tokens are computation-intensive as well, which helps explain how reasoning patterns evolve across model layers. Finally, we apply a Jaccard stability metric over multi-domain problems to verify whether the MIP-identified tokens reliably indicate reasoning quality. Extensive experiments on three models (Qwen-7B, Qwen-14B, and Llama-8B) across four domains (mathematics, code, logic, and common sense) demonstrate IAR's generalizable interpretation capabilities across architectures.
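
The abstract does not define its Jaccard stability metric, so the following is only a minimal sketch of one plausible reading: the mean pairwise Jaccard similarity, |A ∩ B| / |A ∪ B|, between the token sets selected on different problems or runs. The function names, the pairwise averaging, the empty-set convention, and the example tokens are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of tokens."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(token_sets):
    """Average Jaccard similarity over all unordered pairs of token sets."""
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        raise ValueError("need at least two token sets")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical token sets picked on three problems (made up for illustration).
runs = [
    {"therefore", "so", "hence", "wait"},
    {"therefore", "so", "wait"},
    {"therefore", "hence", "wait", "thus"},
]
stability = mean_pairwise_jaccard(runs)
print(round(stability, 3))  # 0.583
```

A value near 1 would mean the same tokens are selected consistently across problems; a value near 0 would mean the selections barely overlap.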