
Position: A Three-Layer Probabilistic Assume-Guarantee Architecture Is Structurally Required for Safe LLM Agent Deployment

S. Bensalem, Y. Dong, M. Franzle, X. Huang, J. Kroger, D. Nickovic, A. Nouri, R. Roy, C. Wu

Published May 19, 2026
Editorial review: 6.8
Relevance: 0.498
Freshness: 0.000

Why It Matters

What makes this one worth your time

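This position paper argues that LLM agent safety cannot be enforced within a single abstraction layer, and that this is a structural consequence of how agent execution works rather than a limitation of current systems. It proposes a layered, contract-based architecture in response.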

The proposed architecture has three layers, each enforcing one safety dimension: semantic intent and policy compliance, environmental validity, or dynamical feasibility.

Summary

The paper proposes a three-layer probabilistic assume-guarantee architecture for LLM agent safety, arguing that enforcement within a single abstraction layer is categorically insufficient. Separate layers handle semantic intent and policy compliance, environmental validity, and dynamical feasibility, and each layer's probabilistic guarantee satisfies the next layer's assumption. The paper names three open problems: estimating bounds from non-i.i.d. traces, degrading contracts gracefully under deployment drift, and extending the approach to multi-agent settings.

Key contributions

  • Proposes a three-layer assume-guarantee architecture for LLM agent safety.
  • Derives compositional system-level safety bounds using the chain rule of probability.
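
The abstract does not give the bound itself, so the following is only a sketch of how a chain-rule composition typically goes. The notation is assumed for illustration and is not the paper's: G_i is the event that layer i's guarantee holds, and epsilon_i is a hypothetical upper bound on layer i's failure probability given that every upstream guarantee held.

```latex
% Assumed notation (not from the paper):
%   G_i        event that layer i's guarantee holds, i = 1, 2, 3
%   \epsilon_i assumed bound: P(G_1) \ge 1 - \epsilon_1 and
%              P(G_i \mid G_1 \cap \dots \cap G_{i-1}) \ge 1 - \epsilon_i
\begin{align}
P(G_1 \cap G_2 \cap G_3)
  &= P(G_1)\, P(G_2 \mid G_1)\, P(G_3 \mid G_1 \cap G_2) \\
  &\ge (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3) \\
  &\ge 1 - (\epsilon_1 + \epsilon_2 + \epsilon_3).
\end{align}
```

The first line is the chain rule of probability, the second applies the assumed per-layer bounds, and the third is the standard product inequality for values in [0, 1]. Whether the paper conditions on exactly these events is not stated in the abstract.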

Notable insights

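  • The paper argues that no single safety guardrail can certify all three safety dimensions, because each depends on a distinct set of information that becomes available at a different stage of execution.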
  • It introduces a contract-based architecture where each layer's probabilistic guarantee satisfies the next layer's assumption.
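
One way to picture the contract chain described above is the minimal Python sketch below. Everything in it is hypothetical: the `LayerContract` class, the string labels for assumptions and guarantees, and the epsilon values are illustrative and do not come from the paper. Matching a guarantee to the next assumption by string equality stands in for what would really be a logical implication check.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class LayerContract:
    """Hypothetical record of one layer's probabilistic assume-guarantee contract."""
    name: str
    assumption: str  # condition the layer needs on entry
    guarantee: str   # condition the layer certifies on exit
    epsilon: float   # assumed bound on P(guarantee fails | assumption holds)

    def __post_init__(self):
        if not 0.0 <= self.epsilon <= 1.0:
            raise ValueError(f"epsilon for {self.name} must lie in [0, 1]")


def check_chain(layers):
    """True if every layer's guarantee matches the next layer's assumption."""
    return all(up.guarantee == down.assumption
               for up, down in zip(layers, layers[1:]))


def composed_lower_bound(layers):
    """Product-form lower bound on the probability that all guarantees hold."""
    if not check_chain(layers):
        raise ValueError("a guarantee does not satisfy the next assumption")
    return prod(1.0 - layer.epsilon for layer in layers)


def union_lower_bound(layers):
    """Looser additive bound: 1 minus the sum of per-layer failure bounds."""
    return max(0.0, 1.0 - sum(layer.epsilon for layer in layers))


# Illustrative values only; the abstract reports no numbers.
EXAMPLE_LAYERS = [
    LayerContract("semantic", "request received",
                  "intent is policy-compliant", 0.01),
    LayerContract("environmental", "intent is policy-compliant",
                  "action is valid in the environment", 0.02),
    LayerContract("dynamical", "action is valid in the environment",
                  "action is dynamically feasible", 0.05),
]

if __name__ == "__main__":
    print(composed_lower_bound(EXAMPLE_LAYERS))
    print(union_lower_bound(EXAMPLE_LAYERS))
```

The product bound is valid only if each epsilon bounds the layer's failure probability conditional on all upstream guarantees holding; the additive bound is looser but easier to budget across layers.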

Possible limitations

  • Not stated in the abstract

Abstract

arXiv:2605.18672v1

This position paper argues that enforcing LLM agent safety within a single abstraction layer is not merely suboptimal but categorically insufficient for deployed LLM agents -- a structural consequence of how agent execution works, not a contingent limitation of current systems. The three dimensions that jointly constitute safe operation -- semantic intent and policy compliance, environmental validity, and dynamical feasibility -- each depend on a strictly distinct set of information that becomes available at different stages of execution. No single guardrail can certify all three. We argue that the community must respond with a contract-based architecture in which each safety dimension is enforced by an independently certified layer whose probabilistic guarantee satisfies the next layer's assumption. We sketch such an architecture and derive the compositional system-level safety bounds it admits via the chain rule of probability. Three open problems stand between this and a deployable standard: bound estimation from non-i.i.d. traces, graceful degradation of contracts under deployment drift, and extension to multi-agent settings -- the most important unfinished business in LLM agent runtime assurance.