LLM Reasoning Interpretability Evaluation

Why Do LLMs Struggle in Strategic Play? Broken Links Between Observations, Beliefs, and Actions

Jan Sobotka, Mustafa O. Karabag, Ufuk Topcu

Published May 5, 2026 | Featured #9 | In the daily list May 5, 2026

Daily score: 56.9

Editorial review: 6.8

Relevance: 0.487

Freshness: 0.722

Why It Matters

Understanding and addressing these gaps is crucial for safely deploying LLMs in strategic applications like negotiation and policymaking.

The paper identifies two gaps in LLMs' strategic decision-making: one between observations and beliefs, and one between beliefs and actions.

Summary

The paper investigates why large language models (LLMs) struggle with strategic decision-making under incomplete information, identifying two key gaps: the observation-belief gap and the belief-action gap. Experiments with the open-weight models Llama 3.1, Qwen3, and gpt-oss demonstrate both gaps, and the authors urge caution before deploying LLMs in strategic domains without robust guardrails.

Key contributions

Identification of the observation-belief gap in LLMs.
Identification of the belief-action gap in LLMs.
Experimental analysis using open-weight models to support findings.

Notable insights

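LLMs encode internal beliefs about latent game states that are substantially more accurate than their verbal reports, but these beliefs are brittle: accuracy degrades with multi-hop reasoning, shows primacy and recency biases, and drifts away from Bayesian coherence over extended interactions.
The implicit conversion of internal beliefs into actions is weaker than the conversion of beliefs externalized in the prompt, yet neither form of belief-conditioning consistently achieves higher game payoffs.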

Possible limitations

Not stated in the abstract

Abstract

arXiv:2605.00226v1

Large language models (LLMs) are increasingly tasked with strategic decision-making under incomplete information, such as in negotiation and policymaking. While LLMs can excel at many such tasks, they also fail in ways that are poorly understood. We shed light on these failures by uncovering two fundamental gaps in the internal mechanisms underlying the decision-making of LLMs in incomplete-information games, supported by experiments with open-weight models Llama 3.1, Qwen3, and gpt-oss. First, an observation-belief gap: LLMs encode internal beliefs about latent game states that are substantially more accurate than their own verbal reports, yet these beliefs are brittle. In particular, the belief accuracy degrades with multi-hop reasoning, exhibits primacy and recency biases, and drifts away from Bayesian coherence over extended interactions. Second, a belief-action gap: The implicit conversion of internal beliefs into actions is weaker than that of the beliefs externalized in the prompt, yet neither belief-conditioning consistently achieves higher game payoffs. These results show how analyzing LLMs' internal processes can expose systematic vulnerabilities that warrant caution before deploying LLMs in strategic domains without robust guardrails.
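
For readers unfamiliar with the term, "Bayesian coherence" can be illustrated with a toy example. The sketch below is not the paper's method or setup: the two-state game, the likelihood values, the `reported` belief, and the choice of KL divergence as the gap measure are all assumptions made here for illustration. It computes the exact posterior over a latent state after a sequence of observations and measures how far a hypothetical reported belief sits from it.

```python
import math


def bayes_update(prior, likelihood):
    """Return the posterior P(state | obs) from a prior and P(obs | state)."""
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: p / total for s, p in unnormalized.items()}


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is clipped at eps to avoid log(0)."""
    return sum(p[s] * math.log(p[s] / max(q[s], eps)) for s in p if p[s] > 0)


# Hypothetical latent state: the opponent's hand is "strong" or "weak".
prior = {"strong": 0.5, "weak": 0.5}

# Assumed likelihood of observing a raise given each hand.
raise_likelihood = {"strong": 0.8, "weak": 0.2}

# Reference (Bayes-coherent) belief after observing two raises in a row.
posterior = prior
for _ in range(2):
    posterior = bayes_update(posterior, raise_likelihood)

# Hypothetical belief reported by a model after the same two observations.
reported = {"strong": 0.7, "weak": 0.3}

# A gap of zero would mean the reported belief matches the exact posterior.
gap = kl_divergence(posterior, reported)
print(f"Bayesian posterior: {posterior}")
print(f"KL(posterior || reported) = {gap:.4f} nats")
```

Tracking such a gap turn by turn is one simple way to picture the "drift" described in the abstract; the paper itself may use a different measure.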