Symbolic Reasoning Frameworks Modulate LLM Risk Aversion in Multi-Agent Strategic Settings
Augustin Chan
Why It Matters
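Large language models deployed as strategic agents show a risk-averse "turtle" bias toward defensive play. This paper reports that a symbolic reasoning framework, given to a single agent as a per-round reflective prompt, changes which players win across the whole game.
Framework choice at the agent level therefore has system-level consequences, which matters for anyone designing agents that will operate alongside other agents.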
Summary
The paper investigates how symbolic reasoning frameworks, injected as per-round reflective prompts into one agent, modulate the risk-averse behavior of large language models, using a 7-player Warring States Diplomacy variant as a testbed (41 games, 4 conditions). Each condition yields a distinct winner distribution. Because neither framework's content predicts the agent's subsequent actions, the authors suggest the modulation operates through the reflective process rather than content-following.
Key contributions
- Demonstrates that symbolic reasoning frameworks can modulate risk-averse behavior in language models.
- Provides empirical evidence of framework-specific winner distributions in a strategic multi-agent setting.
Notable insights
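- The modulation of agent behavior is attributed to the reflective process rather than the specific content of the symbolic frameworks: hexagram themes (chi-squared p = 0.95) and Tarot card postures (chi-squared p = 0.69) are both independent of action choice.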
- Distinct ecosystem signatures emerge from different symbolic reasoning frameworks, indicating their potential to shape multi-agent interactions.
Possible limitations
- Not stated in the abstract
Abstract
arXiv:2606.07552v1
Large language models exhibit innate behavioral tendencies when deployed as strategic agents -- notably a risk-averse "turtle" bias toward defensive play. We show that symbolic reasoning frameworks, injected as per-round reflective prompts into one agent, differentially modulate this bias and reshape the multi-agent ecosystem to produce framework-specific winner distributions. In a 7-player Warring States Diplomacy variant (41 games, 4 conditions, single-campaign memory accumulation), each framework produces a distinct ecosystem signature: under control, Yan dominates (7/11, 64%); under I-Ching yarrow divination, Yan and Chu co-dominate while Qin is completely suppressed (0/10); under Tarot, Qin dominates (5/10, Fisher vs. pooled p = 0.006); under scrambled-text ablation (incoherent oracle text preserving prompt structure), Qi dominates (5/10, Fisher vs. pooled p = 0.006). The framework-receiving agent (Han) never wins and shows no survival difference across conditions (Fisher p = 1.0), but Tarot consistently elevates Han's peak territory (mean 3.0 SCs vs. 2.1-2.5 others, Kruskal-Wallis p = 0.010). Neither framework's content predicts subsequent actions -- hexagram themes (chi-squared p = 0.95) and Tarot card postures (chi-squared p = 0.69) are both independent of action choice -- suggesting the modulation operates through the reflective process, not content-following. We present this as an observation paper establishing that alignment-framework choice at the agent level produces distinctive system-level consequences in multi-agent settings.
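
The abstract reports its winner-distribution results as Fisher exact tests on small win counts. The sketch below shows how such a test works, using only the two Qin counts the abstract states in full: 5 wins in 10 Tarot games and 0 wins in 10 I-Ching games. This is not the comparison the paper reports (its p = 0.006 is Tarot against the pooled other conditions, whose Qin win counts the abstract does not give), and the function name `fisher_exact_two_sided` is an illustrative choice, not the authors' code.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the col1 outcomes fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Rows: Tarot, I-Ching. Columns: games Qin won, games Qin did not win.
# Counts taken from the abstract; the pairing of these two conditions
# is an assumption made for illustration.
p_value = fisher_exact_two_sided(5, 5, 0, 10)
print(f"Qin wins, Tarot vs. I-Ching: p = {p_value:.4f}")
# prints "Qin wins, Tarot vs. I-Ching: p = 0.0325"
```

With only 10 games per condition, even a 5-to-0 split gives a p-value of about 0.03, which illustrates why the paper pools the other conditions for its comparison and presents itself as an observation paper.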